Commit Graph

31906 Commits

Author SHA1 Message Date
Ayaan Zaidi
f848a6f7f7 perf(agents): bound claude orphan transcript scan 2026-05-30 10:09:19 +05:30
Ayaan Zaidi
72eff6b2e9 fix(agents): clear orphan tool state on string assistant turns 2026-05-30 10:09:19 +05:30
Ayaan Zaidi
56fc17be78 fix(agents): avoid cli facade load in flush gate 2026-05-30 10:09:19 +05:30
Ayaan Zaidi
3c3e39684e test(agents): cover flushed cli context engine session 2026-05-30 10:09:19 +05:30
Ayaan Zaidi
25dfe9294f fix(agents): pass workspace to cli flush probe 2026-05-30 10:09:19 +05:30
Ayaan Zaidi
622404fcec fix(agents): detect claude-specific orphaned tools 2026-05-30 10:09:19 +05:30
Ayaan Zaidi
bda02f4be8 fix(agents): scope cli binding clears 2026-05-30 10:09:19 +05:30
Ayaan Zaidi
58de6f91dc fix(auto-reply): clear unflushed cli bindings 2026-05-30 10:09:19 +05:30
Ayaan Zaidi
c0a5f15dc8 fix(agents): clear unflushed cli bindings 2026-05-30 10:09:19 +05:30
Ayaan Zaidi
21b5f601b6 fix(agents): preserve auth-boundary cli invalidation 2026-05-30 10:09:19 +05:30
Ayaan Zaidi
2e21158d04 refactor(agents): simplify cli session recovery probes 2026-05-30 10:09:19 +05:30
Abdel Gomez-Perez
16b510807b fix(agents/cli-runner): invalidate sessions whose transcript ends mid-tool
A claude-cli session whose JSONL transcript ends with an assistant
`tool_use` content block that was never answered by a `tool_result` user
message cannot resume — claude-cli will sit waiting for the missing
`tool_result`, hit its no-output watchdog, and the runtime kills it
with `reason=abort`. The dispatcher then sees an empty payload and emits
NO_REPLY, which to the user looks like the agent silently ignored their
message — same end-user symptom as the binding-flush amnesia bug, but a
different root cause.

The orphan can be left behind when:
  - Gateway restarts mid-tool (brew upgrade, manual kickstart, OOM,
    crash) — claude was waiting on a tool result that never arrived.
  - `claude-live-session.ts` no-output watchdog fires while a tool is
    actively running and OC kills the subprocess.
  - The tool itself crashed or hung past its own deadline.

In all cases the resumed session is dead until the binding gets cleared,
because every subsequent resume hits the same trailing tool_use and the
same kill cycle. Observed in production on a personal OpenClaw gateway
(3d-engineer agent, 50-message-deep transcript ending in a Bash
`tool_use`; every Telegram message after the orphan landed silently
aborted at the 180s no-output mark).

Add `claudeCliSessionTranscriptHasOrphanedToolUse` to the helpers that
walks the JSONL, finds the last assistant message, and returns true if
any of its `tool_use` ids has no matching `tool_result` later in the
file. Wire into `prepareCliRunContext` as a second invalidator gate
alongside `missing-transcript`. The new `invalidatedReason:
"orphaned-tool-use"` follows the same path as missing-transcript: the
binding is dropped, this turn starts a fresh session, and the prior
context is reseeded into the new session via `RAW_TRANSCRIPT_RESEED`.

Detection only considers TRAILING orphans — an unanswered tool_use
deeper in history is inert because a later assistant message already
moved past it. Only the most recent assistant message's tool_use ids
matter for forward progress.

Probe runs only for claude-cli providers and only when the transcript-
content gate already passed, so we add no I/O on already-invalidated
sessions and no behavior change for non-claude providers.

AI-assisted: yes. Tooling: Claude Opus + claude-cli.
2026-05-30 10:09:19 +05:30
Abdel Gomez-Perez
07c1245db4 fix(agents/cli-runner): gate cliSessionBinding persist on transcript flush
When a claude-cli turn produces a session id but the underlying claude
subprocess fails to flush an assistant-role record to its
~/.claude/projects/<cwd>/<sid>.jsonl transcript (e.g. mid-turn kill from
a concurrent fingerprint-mismatched turn, supervisor restart, internal
failure), buildCliRunResult was still persisting that session id into
cliSessionBinding. The next turn ran claudeCliSessionTranscriptHasContent,
didn't find the file, logged 'cli session reset: reason=missing-transcript',
and started a brand-new claude session with empty memory.

End-user symptom: agent forgets prior conversation between turns.

Gate the cliSessionBinding spread on the same predicate the next-turn
invalidator uses, evaluated at write time. Also clear agentMeta.sessionId
in the same case so the session-store fallback at command/session-store.ts
(which reads agentMeta.sessionId via setCliSessionId when the binding is
absent) doesn't re-persist the unflushed sid through a different field
path. The fallback is what makes the binding-only gate insufficient on
its own; both writes must drop together.

The gate only fires for claude-cli providers — other CLI providers don't
write to ~/.claude/projects, so probing them would always return false
and incorrectly strip valid binding metadata. isCliBindingFlushed now
takes the provider id and returns true unconditionally for non-claude-cli
sessions.

A bounded retry (0 / 50 / 150 ms) tolerates the brief gap between
claude-cli's stdio close and the OS making the JSONL line visible to
readers (cooperative fsync semantics on APFS, but not guaranteed under
stress).

The transcript-probe is exposed as an injectable dep
(setCliRunnerTestDeps / restoreCliRunnerTestDeps) mirroring the existing
pattern in src/agents/cli-runner/prepare.ts so isCliBindingFlushed is
testable without touching ~/.claude/projects.

AI-assisted: yes. Tooling: Claude Opus + claude-cli. Codex review caught
the fallback path and the missing provider gate before this hit upstream.
Real-Behavior-Proof: dist-side patch on M5 gateway; branch-build
follow-up pending — see PR body.
2026-05-30 10:09:19 +05:30
Ayaan Zaidi
1659b26151 fix(agent): allow media retry after blocked delivery 2026-05-30 09:07:53 +05:30
Ayaan Zaidi
c88178d9b6 fix(agent): recover media completion delivery 2026-05-30 09:07:53 +05:30
Peter Steinberger
d115fb4cf9 refactor: move task state to shared sqlite
Move task run, delivery, and flow registry persistence onto the shared OpenClaw state SQLite database.

Summary:
- Store task runs, delivery state, and flow runs in state/openclaw.sqlite via the generated Kysely schema.
- Migrate shipped task sidecars into the shared state DB and archive old sidecars, including invalid-config/read-only CLI paths.
- Keep startup migration lightweight for read-only status/tasks paths while still detecting known legacy state markers and custom session stores.

Verification:
- .agents/skills/autoreview/scripts/autoreview --mode local: clean after final fix
- pnpm test src/tasks/task-registry.store.test.ts src/tasks/task-flow-registry.store.test.ts src/commands/doctor-state-migrations.test.ts -- --reporter=verbose
- pnpm test src/commands/doctor-state-migrations.test.ts src/cli/program/config-guard.test.ts src/cli/route.test.ts src/cli/command-path-policy.test.ts -- --reporter=verbose
- pnpm test src/cli/program/config-guard.test.ts src/cli/route.test.ts src/cli/command-startup-policy.test.ts src/cli/command-path-policy.test.ts src/cli/command-execution-startup.test.ts -- --reporter=verbose
- pnpm test src/cli/program/config-guard.test.ts src/cli/argv.test.ts src/cli/route.test.ts src/commands/doctor-config-preflight.state-migration.test.ts -- --reporter=verbose
- pnpm test src/tasks/task-flow-registry.store.test.ts -- --reporter=verbose
- pnpm test test/scripts/lint-suppressions.test.ts -- --reporter=verbose
- pnpm db:kysely:check
- pnpm lint:kysely
- git diff --check HEAD
- pnpm test:startup:memory
- PR CI green on 2f7d76f0d5
2026-05-30 04:54:37 +02:00
Josh Avant
584fa3215c Fix restart sentinel internal continuations (#88161)
* fix restart sentinel internal continuations

* update gateway prompt snapshots

* stabilize sandbox browser audit timer tests

* drive sandbox audit timeouts deterministically

* drive gh-read timeout tests deterministically

* drive label-open-issues timeout tests deterministically

* document deterministic timeout test timers

* test: preserve deterministic timer setup after rebase
2026-05-29 19:06:54 -07:00
Vincent Koc
985b41e136 refactor: share Codex auth identity helpers 2026-05-30 03:57:20 +02:00
Vincent Koc
75de853c37 refactor: share provider OAuth runtime helpers 2026-05-30 03:30:51 +02:00
Josh Avant
b3b962a051 fix subagent dm completion delivery (#88182) 2026-05-29 18:24:49 -07:00
Peter Steinberger
acb0e9c155 fix(agents): extend terminal outcome projections (#88162)
* fix(agents): extend terminal outcome projections

* fix(agents): align terminal outcome follow-up checks

* fix(agents): satisfy terminal outcome mapper lint

* test(scripts): isolate websocket open timers

* test(security): drive sandbox browser timeout timers

* test(scripts): drive gh-read timeout timers

* test(agents): isolate code mode timers

* fix(agents): preserve hard timeouts on wait surfaces

* fix(agents): require timeout attribution for provider errors

* fix(sdk): require timeout attribution for provider errors

* fix(scripts): preserve changelog parse cause
2026-05-30 03:13:01 +02:00
Vincent Koc
deb48a96fb refactor: share prompt template arguments 2026-05-30 03:05:46 +02:00
Vincent Koc
1a4eb0b5e7 refactor: share agent truncate utilities 2026-05-30 02:46:45 +02:00
clawsweeper[bot]
18f94fc83a fix(agents): classify embedded provider business denials for fallback (#84814)
Summary:
- The PR classifies selected embedded agent provider-denial error payloads through the shared failover matcher ... 1/current-ak auth matching, preserves guarded non-fallback cases, and covers fallback progression in tests.
- PR surface: Source +34, Tests +166. Total +200 across 5 files.
- Reproducibility: yes. Current main is source-reproducible: a non-GPT embedded result whose only signal is CE ... returns null from the classifier, and the fallback wrapper treats null classification as candidate success.

Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(agents): classify embedded provider business denials for fallback
- PR branch already contained follow-up commit before automerge: fix(clawsweeper): address review for automerge-openclaw-openclaw-8304…

Validation:
- ClawSweeper review passed for head e266beac93.
- Required merge gates passed before the squash merge.

Prepared head SHA: e266beac93
Review: https://github.com/openclaw/openclaw/pull/84814#issuecomment-4505010446

Co-authored-by: Stellar鱼 <2182712990@qq.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: takhoffman
Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>
2026-05-30 00:34:28 +00:00
Peter Steinberger
aada44fca5 fix(agents): preserve Codex auth for compaction fallback
Fixes #86820.

Preserve Codex OAuth-backed compaction by selecting and loading the Codex harness before resolving direct or queued compaction models, while keeping OpenAI-compatible custom base URLs on the OpenAI context config path. Also preserves persisted concrete harness pins so compaction does not hot-switch existing sessions just because an explicit Codex fallback exists.

Verification:
- node scripts/run-vitest.mjs src/agents/embedded-agent-runner/compact.hooks.test.ts src/agents/harness/selection.test.ts src/agents/harness/runtime-plugin.test.ts
- pnpm tsgo:prod
- pnpm check:test-types
- pnpm lint --threads=8
- git diff --check origin/main...HEAD
- git diff --check
- autoreview clean: no accepted/actionable findings reported; overall patch is correct (0.82)
- GitHub PR checks green on ac6f93de4a
2026-05-30 02:26:00 +02:00
Peter Steinberger
43658872d9 test: stabilize sandbox browser audit timers 2026-05-30 01:18:53 +01:00
Merlin
c8a733eae5 fix(gateway): resolve message actions against runtime config (#84535)
* fix(gateway): resolve message action config from runtime snapshot

* fix(gateway): preserve runtime config matching through auto-enable

* fix(gateway): preserve auto-enabled message action fallback

* fix(gateway): use canonical runtime snapshot for message actions

* fix(discord): route credential actions through gateway

---------

Co-authored-by: Merlin <258679497+funmerlin@users.noreply.github.com>
Co-authored-by: joshavant <830519+joshavant@users.noreply.github.com>
2026-05-29 17:14:45 -07:00
Dallin Romney
914f313740 test(unit-fast): isolate fake-timer files (#88160) 2026-05-29 17:11:05 -07:00
Peter Steinberger
4efc48a80d test(ci): stabilize sandbox browser audit timeout 2026-05-30 02:06:58 +02:00
Peter Steinberger
14795dc0cc test: stabilize block reply abort timers 2026-05-30 00:56:15 +01:00
Vincent Koc
c01a0f5588 refactor: share provider oauth runtime helpers 2026-05-30 01:31:10 +02:00
Peter Steinberger
8ff61be8d6 fix(providers): cap local service timers 2026-05-29 19:29:40 -04:00
Peter Steinberger
90d569e896 fix(telegram): centralize positive timer bounds 2026-05-29 19:25:30 -04:00
Peter Steinberger
d8bc71f222 test: stabilize realtime websocket timeout 2026-05-30 00:18:02 +01:00
Peter Steinberger
f3ea2982f5 test(realtime): stabilize websocket timeout test 2026-05-30 01:15:31 +02:00
Vincent Koc
f3f85ae5f7 refactor: share live transport scenario helpers 2026-05-30 01:05:56 +02:00
Dallin Romney
73dd36626c test(infra): avoid max fake-timer jumps (#88155) 2026-05-29 16:02:41 -07:00
Peter Steinberger
b1e5c9d7fa fix(agents): centralize terminal run outcome precedence (#88136)
* fix(agents): centralize terminal run outcome precedence

* docs(agents): explain terminal outcome precedence

* docs(agents): note terminal outcome helper

* fix(agents): preserve pending hard timeout over late completion

* test(agents): align global session scoping expectation

* Revert "test(agents): align global session scoping expectation"

This reverts commit 9b4a0c3cb1b3885299eea7081d97f7142c415dc2.

* test(infra): stabilize CONNECT timeout cap test

* fix(agents): prioritize hard timeout terminal evidence

* fix(gateway): preserve pending hard timeout snapshots
2026-05-30 00:56:20 +02:00
Peter Steinberger
d5e8da8499 fix(ci): repair main normalization checks 2026-05-29 23:53:28 +01:00
keshavbotagent
5f89fbe669 fix(codex): recover app-server completion stalls
Fix Codex app-server completion-stall recovery so replay-safe stdio completion-idle failures retry once, while progress/terminal turn-watch timeouts only surface timeout payloads.

Also preserve post-tool completion guards for scoped native response deltas and stabilize the oversized CONNECT timeout regression test picked up from latest main.

Co-authored-by: Kelaw - Keshav's Agent <keshavbotagent@gmail.com>
2026-05-30 00:52:48 +02:00
Peter Steinberger
bc848b367f refactor: add shared sqlite state database
Adds the shared SQLite state database base, moves plugin keyed state into it with doctor migration coverage, and keeps generated Kysely guardrails aligned. Proof: focused SQLite/plugin-state tests, db:kysely:check, lint:kysely, architecture/dependency guards, autoreview, and PR CI all clean.
2026-05-30 00:52:23 +02:00
Peter Steinberger
ccad5d7b63 fix(web): cap guarded fetch timeout seconds 2026-05-29 18:45:30 -04:00
Peter Steinberger
42b4715124 test(infra): preserve script wrapper fixture 2026-05-30 00:42:41 +02:00
Peter Steinberger
465c4cb580 test(infra): stabilize main CI tests 2026-05-30 00:42:41 +02:00
Peter Steinberger
cb4d2e7bb9 test: stabilize infra state shard 2026-05-29 23:38:31 +01:00
Peter Steinberger
41a92ae445 perf: resolve native esm plugin sdk imports 2026-05-29 23:38:08 +01:00
Peter Steinberger
d7354d61b2 fix(channels): centralize stall watchdog timer bounds 2026-05-29 18:35:37 -04:00
Kevin Lin
c57671176e refactor: share native approval route gates
Share native approval route gate helpers across mainstream channel approval runtimes and keep PR #87770 green on current main.
2026-05-29 15:32:31 -07:00
Peter Steinberger
44e31f7c6a test(gateway): stabilize live helper shard 2026-05-30 00:31:07 +02:00
Peter Steinberger
ed9e9aab3d fix(infra): cap transport readiness timeouts 2026-05-29 18:28:15 -04:00