Commit Graph

464 Commits

Author SHA1 Message Date
Ayaan Zaidi
17066f2d7c fix(cron): preserve default toolsAllow markers safely 2026-06-24 06:26:52 -07:00
Cameron Beeley
9aea104cc8 fix(cron): stop stamping an unenforceable default toolsAllow cap on CLI runs
#91499 auto-stamps the creator's tool surface as a default toolsAllow cap
on agentTurn cron payloads whenever the creating session is tool-restricted
(a narrowing allow-policy or an explicit deny). CLI backends cannot enforce
a runtime toolsAllow — cli-runner/prepare.ts rejects any defined allow-list
— so every scheduled agentTurn that resolves to a CLI backend (e.g.
claude-cli) fails to start. This silently broke per-thread scheduled
continuations on CLI backends.

A CLI backend is not a runtime tool-policy boundary: it runs with its own
configured tool set, as the operator, on the local machine, and refuses a
runtime allow-list outright. An inherited default cap is therefore
unenforceable on a CLI backend. Decide at run time, where the backend is
known:

- Flag the default. capCronAgentTurnToolsAllow stamps toolsAllowIsDefault
  when it fills in the creator surface because the cron requested nothing
  (or a bare "*"). An explicit narrowing or empty allow-list is a real
  per-cron restriction and carries no flag.
- Drop only the default, only on CLI. The run-executor drops a flagged
  default in the CLI branch and lets the run proceed. An explicit per-cron
  restriction (no flag) is deliberately passed through, so prepare.ts still
  fails it closed and surfaces that the requested policy needs an embedded
  runtime. Embedded runs are untouched and keep the full cap enforced.
- Persist the flag. New nullable cron_jobs.payload_tools_allow_is_default
  column (additive ensureColumn migration + codec read/write) so the
  decision survives a gateway restart, plus toolsAllowIsDefault on the
  gateway-protocol agentTurn payload schema — the stamped payload is
  otherwise rejected by the contract's additionalProperties:false.
- Preserve the flag across updates. A no-toolsAllow update (reschedule,
  prompt edit) no longer carries the stored default forward as a literal
  value — that routed it through the explicit-narrowing branch, stripped the
  flag, and re-broke the job on CLI after the next restart. The default is
  re-derived (flag intact); an explicit restriction is still carried forward
  unflagged.

Net policy: on CLI only the unenforceable inherited default is relaxed;
explicit per-cron restrictions still fail closed; embedded backends are
unchanged.
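
The net policy above can be sketched as a single run-time decision. This is a minimal illustration, not the actual run-executor code: only `toolsAllow` and `toolsAllowIsDefault` come from this commit message, while the `BackendKind` type, the payload shape, and the `resolveToolsAllowForRun` helper are hypothetical names introduced here.

```typescript
// Hypothetical shapes for illustration. Only toolsAllow and
// toolsAllowIsDefault are named in the commit message.
type BackendKind = "cli" | "embedded";

interface AgentTurnPayload {
  prompt: string;
  toolsAllow?: string[];
  // Set when the cap was filled in from the creator's tool surface
  // rather than requested explicitly by the cron.
  toolsAllowIsDefault?: boolean;
}

// Decide which allow-list (if any) to hand to the backend at run time.
function resolveToolsAllowForRun(
  payload: AgentTurnPayload,
  backend: BackendKind,
): string[] | undefined {
  if (backend === "cli" && payload.toolsAllowIsDefault === true) {
    // Inherited default on a CLI backend: unenforceable, so drop it.
    return undefined;
  }
  // Explicit restrictions (no flag) and all embedded runs pass through
  // unchanged; on CLI, an explicit list is still rejected downstream.
  return payload.toolsAllow;
}
```

Under these assumptions, a flagged default resolves to no allow-list on CLI, while the same payload on an embedded backend keeps its full cap.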

Tests: run-executor drops the flagged default but propagates an explicit
restriction on CLI; cron-tool stamps/clears the flag across create and
update and preserves it across a no-toolsAllow update; store round-trips the
flag (and its absence) through SQLite.

Not covered: agentTurn crons created during the regression window carry a
flagless toolsAllow and remain fail-closed on CLI until recreated or updated
with an explicit toolsAllow.
2026-06-24 06:26:52 -07:00
mushuiyu886
414c250af9 fix #95495: [Bug]: 2026.6.9 silently relocates memory store with no migration, forcing a full re-embed (1499 files) with zero upgrade-time warning (#95631)
* fix(memory): import legacy sidecar indexes into agent db

* fix(memory): move legacy sidecar import to doctor migration

* fix(memory): restore sidecar vector rows during doctor migration

* fix(memory): keep legacy sidecar when skipping import

* fix(memory): keep legacy sidecar import within extension boundary

* fix(memory-core): keep legacy sidecar migration retry-safe

* fix(memory-core): backfill sidecar FTS rows

* fix(memory-core): preserve sidecar when vector import defers

* fix(memory-core): cover custom sidecar migrations

* fix(memory-core): keep legacy config migration under doctor

* fix(memory-core): reject sidecar metadata conflicts

* fix(memory-core): keep partial legacy config sidecars

* fix(memory-core): preserve partial config retries

* fix(memory-core): keep partial config task migrations

* fix(memory-core): avoid phantom sidecar agents

* fix(memory-core): reject incomplete sidecar indexes

* fix(memory-core): keep malformed sidecars retryable

* fix(doctor): use canonical state dir for plugin migrations

* fix(memory-core): honor disabled vector sidecar migration

* fix(memory-core): treat provider-none sidecars as fts-only

* fix(memory-core): preserve setup-failed sidecars

* test(memory-core): use non-mutating sort assertions

* test(memory-core): compare sorted chunk ids

* test(memory-core): compare sorted chunk ids

* test(memory-core): stringify sorted chunk ids

* fix(qa): skip chromium bootstrap for explicit browser channels

* fix(qa): skip chromium bootstrap for explicit browser channels

---------

Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
2026-06-24 17:47:44 +08:00
ly-wang19
599294b9af fix(acp-core): never return undefined from stringifyNonErrorCause (#96270)
`stringifyNonErrorCause` is typed `string`, but its `try` returned
`JSON.stringify(value)`, which is `undefined` for functions, symbols, and
undefined causes — leaking undefined to callers that format nested ACP runtime
failures and expect a string. Fall back to a tag string when stringify yields
undefined, matching the already-correct sibling at `src/infra/errors.ts`.
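
A minimal sketch of the fallback described above. The function name comes from the commit message; the tag format (`[function]`, `[symbol]`, `[undefined]`) and the handling of values that make `JSON.stringify` throw are assumptions made for illustration, not the project's actual output.

```typescript
// JSON.stringify returns undefined (not a string) for functions, symbols,
// and undefined, even though its TypeScript signature says `string`.
function stringifyNonErrorCause(value: unknown): string {
  try {
    const json: string | undefined = JSON.stringify(value);
    if (json !== undefined) {
      return json;
    }
  } catch {
    // JSON.stringify throws for circular structures and BigInt values;
    // fall through to the tag string below (assumed behavior).
  }
  // Assumed tag format: the typeof name in brackets.
  return `[${typeof value}]`;
}
```

With this shape every code path returns a string, so callers that format nested failures never receive `undefined`.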

Co-authored-by: ly-wang19 <ly-wang19@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 16:06:45 +08:00
Peter Steinberger
c9ddf2eca6 test(memory): clean up qmd fixture gracefully 2026-06-23 13:31:46 -07:00
Peter Steinberger
73dd758310 fix(memory): abort orphaned qmd search processes 2026-06-23 13:31:46 -07:00
Alix-007
78184ea7e4 fix(memory): abort orphaned qmd search subprocess when memory_search times out
PR #91742 wired memory_search's 15s deadline AbortSignal through the builtin
memory manager but missed the QMD backend behind the same
MemorySearchManager.search interface. With QMD, the tool returns "timed out
after 15s" to the agent while the spawned qmd query/search subprocess keeps
running for the full qmd command timeout (memory.qmd.limits.timeoutMs, whose
embed-heavy default was raised to 600s in #87572), leaving orphaned
embedding/search work running after the agent already moved on.

Add optional AbortSignal support to runCliCommand: an aborting signal kills the
spawned child immediately and rejects with the abort reason, funneled through a
single settle() guard so abort/timeout/error/close cannot double-settle. Thread
the search signal through QmdMemoryManager.search -> runQmdSearch -> runQmd ->
runCliCommand for the default direct-qmd subprocess path (including the query
fallback), and fast-fail search() when the signal is already aborted.
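
The abort handling described above could look roughly like the following. This is a sketch under assumptions, not the project's `runCliCommand`: the name `runCliCommandSketch`, its options object, and the `RunResult` shape are hypothetical. It uses only Node's `child_process.spawn` and the standard `AbortSignal` API.

```typescript
import { spawn } from "node:child_process";

// Hypothetical result shape for illustration.
interface RunResult {
  stdout: string;
  code: number | null;
}

function runCliCommandSketch(
  cmd: string,
  args: string[],
  opts: { timeoutMs: number; signal?: AbortSignal },
): Promise<RunResult> {
  return new Promise<RunResult>((resolve, reject) => {
    // Fast-fail when the caller's deadline has already passed.
    if (opts.signal?.aborted) {
      reject(opts.signal.reason);
      return;
    }

    const child = spawn(cmd, args, { stdio: ["ignore", "pipe", "ignore"] });
    let stdout = "";
    let settled = false;

    // Single guard: whichever of abort/timeout/error/close fires first
    // wins, and every later event becomes a no-op.
    const settle = (finish: () => void): void => {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      opts.signal?.removeEventListener("abort", onAbort);
      finish();
    };

    const onAbort = (): void => {
      child.kill(); // stop the subprocess instead of orphaning it
      settle(() => reject(opts.signal?.reason));
    };

    const timer = setTimeout(() => {
      child.kill();
      settle(() => reject(new Error(`timed out after ${opts.timeoutMs}ms`)));
    }, opts.timeoutMs);

    opts.signal?.addEventListener("abort", onAbort, { once: true });
    child.stdout?.setEncoding("utf8");
    child.stdout?.on("data", (chunk) => {
      stdout += chunk;
    });
    child.on("error", (err) => settle(() => reject(err)));
    child.on("close", (code) => settle(() => resolve({ stdout, code })));
  });
}
```

A caller would pass the search deadline's signal alongside the longer command timeout, so the child is killed as soon as the shorter deadline aborts.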
2026-06-23 13:31:46 -07:00
Josh Lehman
c24d266b2d refactor: use accessor-backed transcript corpus for memory (#96162)
* refactor: ratchet memory transcript corpus access

* test: use narrow runtime config snapshot import

* test: update plugin sdk surface budgets

* refactor: split memory transcript corpus module
2026-06-23 12:37:44 -07:00
Yuval Dinodia
f826a665a2 fix(compaction): trim prefix when transcript ends in an oversized tool result (#95860)
findCutPoint defaulted cutIndex to the earliest valid cut (cutPoints[0],
keep everything) and only moved it forward to a cut point at or after the
backward token cursor. When the final entry is a toolResult whose estimate
alone meets keepRecentTokens, the cursor stops at that trailing toolResult
index, no valid cut point sits at or after it (toolResult entries are not
valid cut points), and the default stuck at keep-everything. Compaction then
summarized zero messages, so preflight and overflow compaction silently became
no-ops and the session looped on a context it could not shrink.

Default cutIndex to the most recent valid cut before the forward search.
When a cut point exists at or after the cursor the search still finds it and
behavior is unchanged; only the trailing-tool-result case now keeps the
recent tail and summarizes the prefix.
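
The changed default can be illustrated with a simplified cut-point search. This is a sketch, not the real `findCutPoint`: the `Entry` shape, the per-entry `tokens` field, and the function name `findCutIndexSketch` are assumptions. Entries before the returned index are summarized, and entries from it onward are kept.

```typescript
// Hypothetical transcript entry for illustration.
interface Entry {
  kind: "user" | "assistant" | "toolResult";
  tokens: number;
}

function findCutIndexSketch(entries: Entry[], keepRecentTokens: number): number {
  // toolResult entries are not valid cut points.
  const cutPoints: number[] = [];
  entries.forEach((entry, index) => {
    if (entry.kind !== "toolResult") cutPoints.push(index);
  });
  if (cutPoints.length === 0) return 0;

  // Walk backward until the recent tail meets the token budget.
  let cursor = 0;
  let accumulated = 0;
  for (let i = entries.length - 1; i >= 0; i--) {
    accumulated += entries[i]?.tokens ?? 0;
    if (accumulated >= keepRecentTokens) {
      cursor = i;
      break;
    }
  }

  // Default to the most recent valid cut (previously the earliest one),
  // then prefer the first cut point at or after the cursor if one exists.
  let cutIndex = cutPoints[cutPoints.length - 1] ?? 0;
  for (const candidate of cutPoints) {
    if (candidate >= cursor) {
      cutIndex = candidate;
      break;
    }
  }
  return cutIndex;
}
```

In this sketch, a transcript ending in one oversized tool result cuts at the last non-tool entry, so the prefix is summarized instead of nothing.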
2026-06-23 07:34:33 +00:00
Vincent Koc
abd8a46b0a improve: reduce hot-path linear scans and redundant I/O (#95697)
Merged via squash.

Prepared head SHA: 67f2678a34
Co-authored-by: vincentkoc <25068+vincentkoc@users.noreply.github.com>
Reviewed-by: @vincentkoc
2026-06-23 10:11:18 +08:00
zhang-guiping
2dc2d73b07 fix(webchat): sessions persist after reconnects (#89017)
* fix(gateway): preserve asserted webchat sessions

* test(gateway): cover stale asserted webchat sessions

* fix(gateway): scope webchat session resume

* chore(protocol): refresh chat send models

* fix: document reconnect session resume protocol

* fix(gateway): keep reconnect resume internal

* gateway: keep reconnect resume options internal

* test(ui): avoid private resume marker lint access
2026-06-22 20:02:58 +00:00
Parvesh Saini
e33760c9df fix(model-catalog): strip manifest model-id prefixes by the matched length (#95744) 2026-06-22 19:52:13 +00:00
Yzx
1662b07810 fix(cron): expose per-job fallbacks in CLI (#93369) 2026-06-22 19:22:20 +00:00
Amer Sheeny
b8434386b8 fix(acp): recover stale persistent sessions by structured resume-required code (#93547)
Persistent ACP threads died on the second turn for Kiro: when the backend
can no longer resume a stale session, acpx raises a SessionResumeRequiredError
whose reason text varies by backend ("Resource not found" for Claude,
"Internal error" / RequestError -32603 for Kiro). The recovery gate matched
the human reason text and required "resource not found", so Kiro's "Internal
error" never triggered the fresh-session retry and the thread produced no
reply (ACP_TURN_FAILED).

Recover by acpx's structured detail code instead of the reason text: acpx
tags every such failure with detailCode "SESSION_RESUME_REQUIRED"
(retryable), independent of wording. The two AcpRuntimeError construction
seams were discarding detailCode, so preserve it on AcpRuntimeError and match
it across the error and its cause chain. This fixes every backend's
resume-required failure and is more precise than the reason regex — a generic
"Internal error" without the code is still surfaced rather than silently
retried.
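
Matching on the structured code across an error and its cause chain might look like this. The `detailCode` value `"SESSION_RESUME_REQUIRED"` comes from the commit message; the class `AcpRuntimeErrorSketch`, the helper `isSessionResumeRequired`, and the depth limit are hypothetical and stand in for the real types.

```typescript
// Hypothetical stand-in for AcpRuntimeError that preserves detailCode.
class AcpRuntimeErrorSketch extends Error {
  readonly detailCode: string | undefined;
  readonly cause: unknown;

  constructor(message: string, options: { detailCode?: string; cause?: unknown } = {}) {
    super(message);
    this.name = "AcpRuntimeErrorSketch";
    this.detailCode = options.detailCode;
    this.cause = options.cause;
  }
}

// Walk the error and its causes looking for the structured code.
// The depth limit is an assumed guard against cyclic cause chains.
function isSessionResumeRequired(error: unknown, maxDepth = 8): boolean {
  let current: unknown = error;
  for (let depth = 0; depth < maxDepth; depth++) {
    if (typeof current !== "object" || current === null) return false;
    const candidate = current as { detailCode?: unknown; cause?: unknown };
    if (candidate.detailCode === "SESSION_RESUME_REQUIRED") return true;
    current = candidate.cause;
  }
  return false;
}
```

Because the check ignores the human-readable reason, a bare "Internal error" without the code is not treated as retryable in this sketch.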

Fixes #87830. Reported by @chouzz.
2026-06-22 18:08:56 +00:00
Ben.Li
b335381247 fix(memory): preserve Windows QMD command paths (#95274) 2026-06-22 17:50:11 +00:00
David
3ff0c29f9d fix: handle terminal chat send acknowledgements (#91049)
* test: cover terminal chat send acknowledgements

* test: cover Swift terminal chat send acknowledgement

* fix: handle terminal chat send acknowledgements

* fix: align terminal ack web lifecycle options

* test: fix Android terminal ack style

* fix: tidy Android terminal ack helpers

* fix: clear mic pending run after terminal ack

* fix: handle terminal talk mode chat send acks

* fix: handle terminal tui chat send acks

* fix: handle terminal acp chat send acks

* test: add Swift chat message text helper

* test: cover steer terminal chat send acknowledgements

* fix: handle terminal steer chat send acks

* test: cover terminal realtime consult send acks

* fix: reject terminal realtime consult send acks

* test: cover Swift terminal ok chat send ack

* fix: clear Swift pending run on terminal ok ack

* test: cover terminal ack helper callers

* fix: preserve terminal ack helper semantics

* fix: narrow terminal ack type guard

* test: cover mic terminal ack statuses

* fix: preserve mic terminal ack status

* fix: keep mic ack contract internal

* test: fix mic ack import order

* test: cover acp terminal ok ack

* test: narrow acp ok ack assertion

* test: cover redirect terminal acknowledgements

* fix: handle redirect terminal acknowledgements

* fix: settle terminal ack reconnect prompts

* fix: surface Android terminal ack timeouts

* fix(tui): handle detached terminal chat acknowledgements

* fix(tui): report terminal timeout send failures

* fix: satisfy iOS talk-mode SwiftFormat

* fix: keep iOS talk logs compile-safe
2026-06-22 17:27:54 +00:00
ly-wang19
9a54e5b292 fix(sdk): classify failed/blocked tool events as tool.call.failed (#95383)
normalizeAgentEventType checked the `phase:"end" || status==="completed"`
branch before the `failed/blocked` branch, but terminal tool/item events are
emitted with phase:"end" AND the real status, so failed and blocked tools were
normalized to tool.call.completed and the tool.call.failed branch was dead for
the item stream. SDK consumers filtering on tool.call.failed never saw tool
failures (they looked like successes). Reorder so failed/blocked is classified
before end/completed.
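
The reordering can be shown with a reduced classifier. This is a sketch only: the event names `tool.call.failed` and `tool.call.completed` and the `phase`/`status` fields come from the commit message, while the `RawToolEvent` shape and the function name `normalizeToolEventTypeSketch` are assumptions, and other event kinds are left out.

```typescript
// Hypothetical raw event shape for illustration.
interface RawToolEvent {
  phase?: string;
  status?: string;
}

type NormalizedToolEventType = "tool.call.failed" | "tool.call.completed";

function normalizeToolEventTypeSketch(
  event: RawToolEvent,
): NormalizedToolEventType | undefined {
  // Check failure first: terminal events carry phase "end" together with
  // the real status, so the end/completed test would otherwise win.
  if (event.status === "failed" || event.status === "blocked") {
    return "tool.call.failed";
  }
  if (event.phase === "end" || event.status === "completed") {
    return "tool.call.completed";
  }
  // Non-terminal events are out of scope for this sketch.
  return undefined;
}
```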

Co-authored-by: ly-wang19 <ly-wang19@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 16:54:14 +00:00
Vincent Koc
b3b5b08e67 fix(memory): preserve Windows session transcript paths 2026-06-22 23:32:06 +08:00
Josh Lehman
d3781cc4b8 refactor: add memory and QMD session identity mapping (#95087) 2026-06-22 06:28:54 -07:00
Vincent Koc
482e6cb5cb fix(codeql): clean OpenClaw quality findings 2026-06-22 19:11:46 +08:00
Vincent Koc
c6d9977902 test(sdk): resolve npm runner in package e2e 2026-06-22 18:28:27 +08:00
teamclaw
7fe287b0d3 fix(agent-core): stop loop after aborted tool run (#94412)
Merged via squash.

Prepared head SHA: e11d9718e3
Co-authored-by: szsip239 <88223778+szsip239@users.noreply.github.com>
Co-authored-by: vincentkoc <25068+vincentkoc@users.noreply.github.com>
Reviewed-by: @vincentkoc
2026-06-22 18:04:50 +08:00
wangmiao0668000666
4db829646a fix(sdk): type-narrow manifest.files in pack staging root helper (#95465) 2026-06-22 13:52:51 +08:00
Yuval Dinodia
2ece2945ae fix(compaction): count user-message image blocks in cut-point estimator (#95128)
estimateTokens charged 4800 chars per image in the toolResult branch but
counted only text in the user branch, so image blocks in recent user turns
scored zero. findCutPoint never reached keepRecentTokens and left the cut at
the earliest point, so image-heavy sessions compacted to a no-op and looped on
context overflow. Fold the per-image accounting into one shared helper used by
both branches.
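
A shared per-image helper could be sketched as follows. The 4800-characters-per-image figure is from the commit message; the block and message shapes, the helper name `estimateBlockChars`, and the 4-characters-per-token ratio are assumptions for illustration.

```typescript
const IMAGE_CHAR_ESTIMATE = 4800; // per-image charge named in the commit message
const CHARS_PER_TOKEN = 4; // assumed ratio, for illustration only

// Hypothetical content shapes.
type ContentBlock = { type: "text"; text: string } | { type: "image" };

interface Message {
  role: "user" | "toolResult";
  content: ContentBlock[];
}

// One helper used for every role, so image blocks are never scored as zero.
function estimateBlockChars(blocks: ContentBlock[]): number {
  let chars = 0;
  for (const block of blocks) {
    chars += block.type === "image" ? IMAGE_CHAR_ESTIMATE : block.text.length;
  }
  return chars;
}

function estimateTokensSketch(message: Message): number {
  return Math.ceil(estimateBlockChars(message.content) / CHARS_PER_TOKEN);
}
```

Routing both roles through the same helper means an image in a user turn and an image in a tool result contribute the same estimate.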
2026-06-22 13:48:02 +08:00
Vincent Koc
93ad397725 fix: preserve normalization and ACP fast mode contracts 2026-06-22 09:37:10 +08:00
Vincent Koc
2b75806197 feat: forward-port fast talks auto mode (#85104) 2026-06-22 09:37:09 +08:00
Vincent Koc
328a44695f chore(deadcode): remove unused agent-core prompt formatter 2026-06-22 06:23:07 +08:00
Vincent Koc
2bdcc8314d fix(sdk): preserve zero run timeout watchdog 2026-06-22 00:13:51 +02:00
Vincent Koc
464adfe5e5 chore(deadcode): remove unused agent-core harness APIs 2026-06-22 06:08:32 +08:00
Vincent Koc
9adf3d92bd chore(deadcode): remove unused helper paths 2026-06-22 03:39:19 +08:00
Josh Avant
5d1e649aea fix: route mobile exec approvals to reviewer device (#95175)
* fix: route mobile exec approvals to reviewer device

* fix: surface iOS approval events in foreground

* fix: forward codex approval reviewer device

* test: harden approval reviewer device contract

* test: cover reviewer approval fallback resolvers
2026-06-21 08:47:52 -05:00
Vincent Koc
b796890b97 test(sdk): resolve Windows package taskkill path 2026-06-21 12:52:41 +02:00
Vincent Koc
830691b201 fix(memory-host-sdk): taskkill qmd process trees on windows 2026-06-21 08:51:36 +02:00
Vincent Koc
06574920dd fix(sdk): taskkill package e2e trees on windows 2026-06-21 08:33:36 +02:00
scotthuang
81abc2b21b fix: preserve cron delivery awareness for target sessions (#93580)
Merged via squash.

Prepared head SHA: 460562ceff
Co-authored-by: scotthuang <1670837+scotthuang@users.noreply.github.com>
Co-authored-by: jalehman <550978+jalehman@users.noreply.github.com>
Reviewed-by: @jalehman
2026-06-20 12:13:10 -07:00
Vincent Koc
7c850bdf38 fix(test): kill SDK package command trees 2026-06-20 17:54:16 +02:00
Vincent Koc
2bc20f2ec5 fix(test): use pnpm runner for SDK package build 2026-06-20 17:51:21 +02:00
Vincent Koc
7d658dfd97 fix(sdk): honor session send timeouts 2026-06-20 06:18:09 +02:00
Vincent Koc
e89c255a01 fix(sdk): require session key for effective tools 2026-06-20 06:00:03 +02:00
Vincent Koc
a635e97965 fix(sdk): tighten approval response params 2026-06-20 05:59:50 +02:00
Vincent Koc
4f278ef71c fix(sdk): type agent mutation RPC params 2026-06-20 05:59:36 +02:00
Vincent Koc
d76c1daa52 fix(sdk): list helpers work without filters
SDK list helpers now send an empty params object when filters are omitted while preserving explicit invalid params for Gateway validation.

Verification:
- git diff --check origin/main...HEAD
- node --check packages/sdk/src/client.ts
- codex review --base origin/main
- GitHub Actions CI release gate 27855603923 succeeded on 353f13c0d1
2026-06-20 09:22:48 +08:00
Vincent Koc
c2e26db61b fix(sdk): send exec approval resolve id (#95144) 2026-06-20 08:52:55 +08:00
Vincent Koc
c4d1f37d33 fix(memory): abort batch upload response reads (#95111)
* fix(memory): abort batch upload response reads

* test(memory): stabilize batch upload abort proof
2026-06-20 06:22:23 +08:00
Vincent Koc
6f5fdb1e6b fix(gateway): validate plugin descriptors and compact refresh 2026-06-19 22:25:15 +02:00
Andrew Stroup
378c4134f1 fix(slack): default member-info userId to inbound sender (#89236)
Merged via squash.

Prepared head SHA: c7a39e54f7
Co-authored-by: stroupaloop <2424551+stroupaloop@users.noreply.github.com>
Co-authored-by: steipete <58493+steipete@users.noreply.github.com>
Reviewed-by: @steipete
2026-06-19 14:03:29 +01:00
Vincent Koc
3bc936b675 test(sdk): keep package e2e pnpm noninteractive 2026-06-19 13:34:04 +02:00
Peter Lee
6256ad86c9 fix(gateway): classify probe reachability by validated transport (#93948)
Distinguish validated gateway reachability from pre-open and TLS-validation failures, and sanitize close diagnostics before terminal output.

Fixes #79099.

Co-authored-by: xialonglee <li.xialong@xydigit.com>
2026-06-19 11:56:16 +01:00
Vincent Koc
6aa85dfaa1 refactor(memory): drop unused host-sdk helpers 2026-06-19 16:04:00 +08:00
Vincent Koc
33fa225f65 refactor(memory): drop unused host helpers 2026-06-19 15:13:27 +08:00