A claude-cli session whose JSONL transcript ends with an assistant
`tool_use` content block that was never answered by a `tool_result` user
message cannot resume — claude-cli will sit waiting for the missing
`tool_result`, hit its no-output watchdog, and the runtime kills it
with `reason=abort`. The dispatcher then sees an empty payload and emits
NO_REPLY, which to the user looks like the agent silently ignored their
message — same end-user symptom as the binding-flush amnesia bug, but a
different root cause.
The orphan can be left behind when:
- Gateway restarts mid-tool (brew upgrade, manual kickstart, OOM,
crash) — claude was waiting on a tool result that never arrived.
- `claude-live-session.ts` no-output watchdog fires while a tool is
actively running and OC kills the subprocess.
- The tool itself crashed or hung past its own deadline.
In all cases the resumed session is dead until the binding gets cleared,
because every subsequent resume hits the same trailing tool_use and the
same kill cycle. Observed in production on a personal OpenClaw gateway
(3d-engineer agent, 50-message-deep transcript ending in a Bash
`tool_use`; every Telegram message after the orphan landed silently
aborted at the 180s no-output mark).
Add `claudeCliSessionTranscriptHasOrphanedToolUse` to the helpers that
walks the JSONL, finds the last assistant message, and returns true if
any of its `tool_use` ids has no matching `tool_result` later in the
file. Wire into `prepareCliRunContext` as a second invalidator gate
alongside `missing-transcript`. The new `invalidatedReason:
"orphaned-tool-use"` follows the same path as missing-transcript: the
binding is dropped, this turn starts a fresh session, and the prior
context is reseeded into the new session via `RAW_TRANSCRIPT_RESEED`.
Detection only considers TRAILING orphans — an unanswered tool_use
deeper in history is inert because a later assistant message already
moved past it. Only the most recent assistant message's tool_use ids
matter for forward progress.
Probe runs only for claude-cli providers and only when the transcript-
content gate already passed, so we add no I/O on already-invalidated
sessions and no behavior change for non-claude providers.
AI-assisted: yes. Tooling: Claude Opus + claude-cli.
When a claude-cli turn produces a session id but the underlying claude
subprocess fails to flush an assistant-role record to its
~/.claude/projects/<cwd>/<sid>.jsonl transcript (e.g. mid-turn kill from
a concurrent fingerprint-mismatched turn, supervisor restart, internal
failure), buildCliRunResult was still persisting that session id into
cliSessionBinding. The next turn ran claudeCliSessionTranscriptHasContent,
didn't find the file, logged 'cli session reset: reason=missing-transcript',
and started a brand-new claude session with empty memory.
End-user symptom: agent forgets prior conversation between turns.
Gate the cliSessionBinding spread on the same predicate the next-turn
invalidator uses, evaluated at write time. Also clear agentMeta.sessionId
in the same case so the session-store fallback at command/session-store.ts
(which reads agentMeta.sessionId via setCliSessionId when the binding is
absent) doesn't re-persist the unflushed sid through a different field
path. The fallback is what makes the binding-only gate insufficient on
its own; both writes must drop together.
The gate only fires for claude-cli providers — other CLI providers don't
write to ~/.claude/projects, so probing them would always return false
and incorrectly strip valid binding metadata. isCliBindingFlushed now
takes the provider id and returns true unconditionally for non-claude-cli
sessions.
A bounded retry (0 / 50 / 150 ms) tolerates the brief gap between
claude-cli's stdio close and the OS making the JSONL line visible to
readers (cooperative fsync semantics on APFS, but not guaranteed under
stress).
The transcript-probe is exposed as an injectable dep
(setCliRunnerTestDeps / restoreCliRunnerTestDeps) mirroring the existing
pattern in src/agents/cli-runner/prepare.ts so isCliBindingFlushed is
testable without touching ~/.claude/projects.
AI-assisted: yes. Tooling: Claude Opus + claude-cli. Codex review caught
the fallback path and the missing provider gate before this hit upstream.
Real-Behavior-Proof: dist-side patch on M5 gateway; branch-build
follow-up pending — see PR body.
Move task run, delivery, and flow registry persistence onto the shared OpenClaw state SQLite database.
Summary:
- Store task runs, delivery state, and flow runs in state/openclaw.sqlite via the generated Kysely schema.
- Migrate shipped task sidecars into the shared state DB and archive old sidecars, including invalid-config/read-only CLI paths.
- Keep startup migration lightweight for read-only status/tasks paths while still detecting known legacy state markers and custom session stores.
Verification:
- .agents/skills/autoreview/scripts/autoreview --mode local: clean after final fix
- pnpm test src/tasks/task-registry.store.test.ts src/tasks/task-flow-registry.store.test.ts src/commands/doctor-state-migrations.test.ts -- --reporter=verbose
- pnpm test src/commands/doctor-state-migrations.test.ts src/cli/program/config-guard.test.ts src/cli/route.test.ts src/cli/command-path-policy.test.ts -- --reporter=verbose
- pnpm test src/cli/program/config-guard.test.ts src/cli/route.test.ts src/cli/command-startup-policy.test.ts src/cli/command-path-policy.test.ts src/cli/command-execution-startup.test.ts -- --reporter=verbose
- pnpm test src/cli/program/config-guard.test.ts src/cli/argv.test.ts src/cli/route.test.ts src/commands/doctor-config-preflight.state-migration.test.ts -- --reporter=verbose
- pnpm test src/tasks/task-flow-registry.store.test.ts -- --reporter=verbose
- pnpm test test/scripts/lint-suppressions.test.ts -- --reporter=verbose
- pnpm db:kysely:check
- pnpm lint:kysely
- git diff --check HEAD
- pnpm test:startup:memory
- PR CI green on 2f7d76f0d5
Summary:
- The PR classifies selected embedded agent provider-denial error payloads through the shared failover matcher ... 1/current-ak auth matching, preserves guarded non-fallback cases, and covers fallback progression in tests.
- PR surface: Source +34, Tests +166. Total +200 across 5 files.
- Reproducibility: yes. Current main is source-reproducible: a non-GPT embedded result whose only signal is CE ... returns null from the classifier, and the fallback wrapper treats null classification as candidate success.
Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(agents): classify embedded provider business denials for fallback
- PR branch already contained follow-up commit before automerge: fix(clawsweeper): address review for automerge-openclaw-openclaw-8304…
Validation:
- ClawSweeper review passed for head e266beac93.
- Required merge gates passed before the squash merge.
Prepared head SHA: e266beac93
Review: https://github.com/openclaw/openclaw/pull/84814#issuecomment-4505010446
Co-authored-by: Stellar鱼 <2182712990@qq.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: takhoffman
Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>
Fixes#86820.
Preserve Codex OAuth-backed compaction by selecting and loading the Codex harness before resolving direct or queued compaction models, while keeping OpenAI-compatible custom base URLs on the OpenAI context config path. Also preserves persisted concrete harness pins so compaction does not hot-switch existing sessions just because an explicit Codex fallback exists.
Verification:
- node scripts/run-vitest.mjs src/agents/embedded-agent-runner/compact.hooks.test.ts src/agents/harness/selection.test.ts src/agents/harness/runtime-plugin.test.ts
- pnpm tsgo:prod
- pnpm check:test-types
- pnpm lint --threads=8
- git diff --check origin/main...HEAD
- git diff --check
- autoreview clean: no accepted/actionable findings reported; overall patch is correct (0.82)
- GitHub PR checks green on ac6f93de4a
Fix Codex app-server completion-stall recovery so replay-safe stdio completion-idle failures retry once, while progress/terminal turn-watch timeouts only surface timeout payloads.
Also preserve post-tool completion guards for scoped native response deltas and stabilize the oversized CONNECT timeout regression test picked up from latest main.
Co-authored-by: Kelaw - Keshav's Agent <keshavbotagent@gmail.com>
Adds the shared SQLite state database base, moves plugin keyed state into it with doctor migration coverage, and keeps generated Kysely guardrails aligned. Proof: focused SQLite/plugin-state tests, db:kysely:check, lint:kysely, architecture/dependency guards, autoreview, and PR CI all clean.