openclaw

mirror of https://github.com/openclaw/openclaw.git synced 2026-07-28 19:11:19 +00:00

Files

Abdel Gomez-Perez 16b510807b fix(agents/cli-runner): invalidate sessions whose transcript ends mid-tool

A claude-cli session whose JSONL transcript ends with an assistant
`tool_use` content block that was never answered by a `tool_result` user
message cannot resume — claude-cli will sit waiting for the missing
`tool_result`, hit its no-output watchdog, and the runtime kills it
with `reason=abort`. The dispatcher then sees an empty payload and emits
NO_REPLY, which to the user looks like the agent silently ignored their
message — same end-user symptom as the binding-flush amnesia bug, but a
different root cause.

The orphan can be left behind when:
  - Gateway restarts mid-tool (brew upgrade, manual kickstart, OOM,
    crash) — claude was waiting on a tool result that never arrived.
  - `claude-live-session.ts` no-output watchdog fires while a tool is
    actively running and OC kills the subprocess.
  - The tool itself crashed or hung past its own deadline.

In all cases the resumed session is dead until the binding gets cleared,
because every subsequent resume hits the same trailing tool_use and the
same kill cycle. Observed in production on a personal OpenClaw gateway
(3d-engineer agent, 50-message-deep transcript ending in a Bash
`tool_use`; every Telegram message after the orphan landed silently
aborted at the 180s no-output mark).

Add `claudeCliSessionTranscriptHasOrphanedToolUse` to the helpers that
walks the JSONL, finds the last assistant message, and returns true if
any of its `tool_use` ids has no matching `tool_result` later in the
file. Wire into `prepareCliRunContext` as a second invalidator gate
alongside `missing-transcript`. The new `invalidatedReason:
"orphaned-tool-use"` follows the same path as missing-transcript: the
binding is dropped, this turn starts a fresh session, and the prior
context is reseeded into the new session via `RAW_TRANSCRIPT_RESEED`.

Detection only considers TRAILING orphans — an unanswered tool_use
deeper in history is inert because a later assistant message already
moved past it. Only the most recent assistant message's tool_use ids
matter for forward progress.

Probe runs only for claude-cli providers and only when the transcript-
content gate already passed, so we add no I/O on already-invalidated
sessions and no behavior change for non-claude providers.

AI-assisted: yes. Tooling: Claude Opus + claude-cli.

2026-05-30 10:09:19 +05:30

attempt-callbacks.test.ts

…

attempt-callbacks.ts

…

attempt-execution.cli.test.ts

fix(command): stabilize claude-cli transcript resume (#81048 )

2026-05-29 22:56:09 +05:30

attempt-execution.error-propagation.test.ts

…

attempt-execution.helpers.ts

fix(agents/cli-runner): invalidate sessions whose transcript ends mid-tool