openclaw/docs/reference/transcript-hygiene.md
Peter Steinberger bb46b79d3c refactor: internalize OpenClaw agent runtime (#85341)
2026-05-27 19:24:04 +01:00

summary: Reference: provider-specific transcript sanitization and repair rules
read_when:
  • You are debugging provider request rejections tied to transcript shape
  • You are changing transcript sanitization or tool-call repair logic
  • You are investigating tool-call id mismatches across providers
title: Transcript hygiene

OpenClaw applies provider-specific fixes to transcripts before a run, while building model context. Most of these are in-memory adjustments that satisfy strict provider requirements. A separate session-file repair pass may also rewrite stored JSONL before the session is loaded, but only for malformed lines or persisted turns that are invalid durable records. Delivered assistant replies are preserved on disk; provider-specific assistant-prefill stripping happens only while constructing outbound payloads. When a repair occurs, the original file is written to a transient *.bak-<pid>-<ts> sibling before the atomic replace and removed once the replace succeeds. The backup is retained only if cleanup itself fails, in which case its path is reported back.
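
The backup-then-replace sequence described above can be sketched without real I/O. This is a minimal illustration under stated assumptions: the `FileOps` interface, the `replaceWithBackup` function, and the temp-file name are hypothetical and are not taken from OpenClaw's source.

```typescript
// Hypothetical file-operation interface so the flow can be shown without real I/O.
interface FileOps {
  copy(from: string, to: string): void;
  write(path: string, data: string): void;
  rename(from: string, to: string): void;
  remove(path: string): void;
}

interface ReplaceResult {
  /** Set only when the backup could not be cleaned up. */
  retainedBackupPath?: string;
}

function replaceWithBackup(
  ops: FileOps,
  file: string,
  repaired: string,
  pid: number,
  ts: number,
): ReplaceResult {
  const backupPath = `${file}.bak-${pid}-${ts}`;
  const tempPath = `${file}.tmp-${pid}-${ts}`; // temp name is an assumption
  ops.copy(file, backupPath); // 1. keep the original as a sibling backup
  ops.write(tempPath, repaired); // 2. stage the repaired content
  ops.rename(tempPath, file); // 3. atomic replace
  try {
    ops.remove(backupPath); // 4. the backup is transient on success
    return {};
  } catch {
    // Cleanup failed: keep the backup and report where it is.
    return { retainedBackupPath: backupPath };
  }
}
```

Injecting the file operations keeps the ordering (backup, stage, rename, cleanup) easy to test with an in-memory fake.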

Scope includes:

  • Runtime-only prompt context staying out of user-visible transcript turns
  • Tool call id sanitization
  • Tool call input validation
  • Tool result pairing repair
  • Turn validation / ordering
  • Thought signature cleanup
  • Thinking signature cleanup
  • Image payload sanitization
  • Blank text-block cleanup before provider replay
  • User-input provenance tagging (for inter-session routed prompts)
  • Empty assistant error-turn repair for Bedrock Converse replay

If you need transcript storage details, see:


Global rule: runtime context is not user transcript

Runtime/system context can be added to the model prompt for a turn, but it is not end-user-authored content. OpenClaw keeps a separate transcript-facing prompt body for Gateway replies, queued followups, ACP, CLI, and embedded OpenClaw runs. Stored visible user turns use that transcript body instead of the runtime-enriched prompt.

For legacy sessions that already persisted runtime wrappers, Gateway history surfaces apply a display projection before returning messages to WebChat, TUI, REST, or SSE clients.
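
A minimal sketch of the split between the runtime-enriched prompt and the transcript-facing body. The `PreparedPrompt` and `StoredUserTurn` shapes and both function names are hypothetical, chosen only to illustrate the rule.

```typescript
// Hypothetical shapes for illustration; not OpenClaw's actual types.
interface PreparedPrompt {
  /** Prompt sent to the model, including runtime/system context. */
  modelPrompt: string;
  /** What the end user actually authored. */
  transcriptBody: string;
}

interface StoredUserTurn {
  role: "user";
  text: string;
}

function preparePrompt(userText: string, runtimeContext: string[]): PreparedPrompt {
  const modelPrompt = [...runtimeContext, userText].join("\n\n");
  return { modelPrompt, transcriptBody: userText };
}

// Only the transcript-facing body is persisted as the visible user turn.
function toStoredUserTurn(prompt: PreparedPrompt): StoredUserTurn {
  return { role: "user", text: prompt.transcriptBody };
}
```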


Where this runs

All transcript hygiene is centralized in the embedded runner:

  • Policy selection: src/agents/transcript-policy.ts
  • Sanitization/repair application: sanitizeSessionHistory in src/agents/embedded-agent-runner/replay-history.ts

The policy uses provider, modelApi, and modelId to decide what to apply.
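
As a rough illustration, policy selection can be thought of as a pure function from those three inputs to a set of flags. The flag names, provider ids, and API ids below are assumptions made for this sketch; the real logic lives in src/agents/transcript-policy.ts and may differ.

```typescript
// Hypothetical policy shape and identifiers, for illustration only.
type ToolCallIdMode = "none" | "strict" | "strict9";

interface TranscriptPolicy {
  toolCallIdMode: ToolCallIdMode;
  repairToolResultPairing: boolean;
  validateTurns: boolean;
}

interface PolicyInput {
  provider: string;
  modelApi: string;
  modelId: string;
}

function resolveTranscriptPolicy(input: PolicyInput): TranscriptPolicy {
  const provider = input.provider.toLowerCase();
  const modelId = input.modelId.toLowerCase();

  // Mistral is detected by provider or by model id.
  if (provider === "mistral" || modelId.includes("mistral")) {
    return { toolCallIdMode: "strict9", repairToolResultPairing: false, validateTurns: false };
  }
  if (provider === "google") {
    return { toolCallIdMode: "strict", repairToolResultPairing: true, validateTurns: true };
  }
  if (provider === "anthropic" || input.modelApi === "anthropic-messages") {
    return { toolCallIdMode: "none", repairToolResultPairing: true, validateTurns: true };
  }
  // Everything else relies on the global rules only.
  return { toolCallIdMode: "none", repairToolResultPairing: false, validateTurns: false };
}
```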

Separate from transcript hygiene, session files are repaired (if needed) before load:

  • repairSessionFileIfNeeded in src/agents/session-file-repair.ts
  • Called from run/attempt.ts and compact.ts (embedded runner)
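
A minimal sketch of the malformed-line part of such a repair, written as a pure function over the JSONL text. The `repairJsonl` name and result shape are hypothetical, and dropping unparseable lines is an assumption about how malformed lines are handled; it does not cover the invalid-durable-record cases.

```typescript
interface JsonlRepairResult {
  repairedText: string;
  droppedLines: number;
  /** True only when a rewrite is actually needed. */
  changed: boolean;
}

function repairJsonl(text: string): JsonlRepairResult {
  const kept: string[] = [];
  let droppedLines = 0;
  for (const line of text.split("\n")) {
    if (line.trim() === "") continue; // blank lines carry no record
    try {
      JSON.parse(line);
      kept.push(line);
    } catch {
      droppedLines += 1; // malformed line, for example a truncated write
    }
  }
  const repairedText = kept.length > 0 ? kept.join("\n") + "\n" : "";
  return { repairedText, droppedLines, changed: droppedLines > 0 };
}
```

Returning `changed` lets the caller skip the backup and rewrite entirely when the file is already well-formed.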

Global rule: image sanitization

Image payloads are always sanitized: oversized base64 images are downscaled and recompressed to prevent provider-side rejection due to size limits.

This also helps control image-driven token pressure for vision-capable models. Lower max dimensions generally reduce token usage; higher dimensions preserve detail.
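
The downscale target can be sketched as a pure size computation. `fitWithinMaxSide` is a hypothetical helper; the default of 1200 mirrors the documented default of agents.defaults.imageMaxDimensionPx. Actual decoding and recompression are not shown.

```typescript
interface Size {
  width: number;
  height: number;
}

// Scale so the longest side is at most maxSide, preserving aspect ratio.
function fitWithinMaxSide(size: Size, maxSide = 1200): Size {
  const longest = Math.max(size.width, size.height);
  if (longest <= maxSide) return size; // already small enough
  const scale = maxSide / longest;
  return {
    width: Math.max(1, Math.round(size.width * scale)),
    height: Math.max(1, Math.round(size.height * scale)),
  };
}
```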

Implementation:

  • sanitizeSessionMessagesImages in src/agents/embedded-agent-helpers/images.ts
  • sanitizeContentBlocksImages in src/agents/tool-images.ts
  • Max image side is configurable via agents.defaults.imageMaxDimensionPx (default: 1200).
  • Blank text blocks are removed while this pass walks replay content. Assistant turns that become empty are dropped from the replay copy; user and tool-result turns that become empty receive a non-empty omitted-content placeholder.
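
The blank text-block rule in the last item can be sketched as follows. The block and message shapes, the role names, and the placeholder text are assumptions for illustration only.

```typescript
// Hypothetical replay shapes; not OpenClaw's actual types.
type Block = { type: "text"; text: string } | { type: "image"; data: string };

interface Msg {
  role: "user" | "assistant" | "toolResult";
  content: Block[];
}

const OMITTED_CONTENT = "[content omitted]"; // hypothetical placeholder text

function dropBlankTextBlocks(messages: Msg[]): Msg[] {
  const out: Msg[] = [];
  for (const msg of messages) {
    const content = msg.content.filter(
      (b) => b.type !== "text" || b.text.trim() !== "",
    );
    if (content.length > 0) {
      out.push({ ...msg, content });
    } else if (msg.role !== "assistant") {
      // User and tool-result turns keep their slot with a non-empty placeholder.
      out.push({ ...msg, content: [{ type: "text", text: OMITTED_CONTENT }] });
    }
    // Assistant turns that became empty are dropped from the replay copy.
  }
  return out;
}
```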

Global rule: malformed tool calls

Assistant tool-call blocks that are missing both input and arguments are dropped before model context is built. This prevents provider rejections from partially persisted tool calls (for example, after a rate limit failure).

Implementation:

  • sanitizeToolCallInputs in src/agents/session-transcript-repair.ts
  • Applied in sanitizeSessionHistory in src/agents/embedded-agent-runner/replay-history.ts
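
A minimal sketch of the rule, assuming a hypothetical `toolCall` block shape. It is not the body of sanitizeToolCallInputs, only an illustration of dropping blocks that carry neither input nor arguments.

```typescript
// Hypothetical block shapes for illustration.
type Block =
  | { type: "text"; text: string }
  | { type: "toolCall"; id: string; name: string; input?: unknown; arguments?: unknown };

interface AssistantMsg {
  role: "assistant";
  content: Block[];
}

function dropMalformedToolCalls(msg: AssistantMsg): AssistantMsg {
  const content = msg.content.filter((block) => {
    if (block.type !== "toolCall") return true;
    // Keep the call if either field is present; drop it when both are missing.
    return block.input != null || block.arguments != null;
  });
  return { ...msg, content };
}
```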

Global rule: inter-session input provenance

When an agent sends a prompt into another session via sessions_send (including agent-to-agent reply/announce steps), OpenClaw persists the created user turn with:

  • message.provenance.kind = "inter_session"

OpenClaw also prepends a same-turn [Inter-session message ... isUser=false] marker before the routed prompt text so the active model call can distinguish foreign session output from external end-user instructions. This marker includes the source session, channel, and tool when available. The transcript still uses role: "user" for provider compatibility, but the visible text and provenance metadata both mark the turn as inter-session data.

During context rebuild, OpenClaw applies the same marker to older persisted inter-session user turns that only have provenance metadata.
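
The marker handling can be sketched as an idempotent text transform. The field names inside the marker (`sourceSession`, `channel`, `tool`) and their `key=value` layout are assumptions; the source only states that the marker starts with `[Inter-session message`, includes those details when available, and ends with `isUser=false]`.

```typescript
// Hypothetical shapes; only provenance.kind = "inter_session" comes from the text above.
interface Provenance {
  kind: string;
  sourceSession?: string;
  channel?: string;
  tool?: string;
}

interface UserTurn {
  role: "user";
  text: string;
  provenance?: Provenance;
}

const MARKER_PREFIX = "[Inter-session message";

function formatMarker(p: Provenance): string {
  const fields: string[] = [];
  if (p.sourceSession) fields.push(`sourceSession=${p.sourceSession}`);
  if (p.channel) fields.push(`channel=${p.channel}`);
  if (p.tool) fields.push(`tool=${p.tool}`);
  fields.push("isUser=false");
  return `${MARKER_PREFIX} ${fields.join(" ")}]`;
}

// Adds the marker to inter-session turns that do not carry it yet.
function withInterSessionMarker(turn: UserTurn): UserTurn {
  const p = turn.provenance;
  if (!p || p.kind !== "inter_session") return turn;
  if (turn.text.startsWith(MARKER_PREFIX)) return turn; // already marked
  return { ...turn, text: `${formatMarker(p)}\n${turn.text}` };
}
```

Because the transform checks for an existing marker, it can run on every context rebuild without stacking markers on older turns.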


Provider matrix (current behavior)

OpenAI / OpenAI Codex

  • Image sanitization applies through the global rule.
  • Drop orphaned reasoning signatures (standalone reasoning items without a following content block) for OpenAI Responses/Codex transcripts, and drop replayable OpenAI reasoning after a model route switch.
  • Preserve replayable OpenAI Responses reasoning item payloads, including encrypted empty-summary items, so manual/WebSocket replay keeps required rs_* state paired with assistant output items.
  • Native ChatGPT Codex Responses follows Codex wire parity by replaying prior Responses reasoning/message/function payloads without prior item IDs while preserving session prompt_cache_key.
  • OpenAI Responses-family replay preserves canonical call_*|fc_* same-model reasoning pairs, but deterministically normalizes malformed or overlong call_id / function-call item ids before pi-ai payload conversion.
  • Tool result pairing repair may move real matched outputs and synthesize Codex-style aborted outputs for tool calls whose outputs are missing.
  • No turn validation or reordering.
  • Missing OpenAI Responses-family tool outputs are synthesized as aborted to match Codex replay normalization.
  • No thought signature stripping.
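
A minimal sketch of synthesizing outputs for unanswered tool calls. The item shapes are simplified from Responses-style `function_call` / `function_call_output` items, and the literal output text `"aborted"` is an assumption; moving real matched outputs is not shown.

```typescript
// Simplified Responses-style items; the message shape is an illustration only.
type Item =
  | { type: "function_call"; call_id: string; name: string }
  | { type: "function_call_output"; call_id: string; output: string }
  | { type: "message"; role: "user" | "assistant"; text: string };

function synthesizeMissingOutputs(items: Item[]): Item[] {
  // Collect every call id that already has an output.
  const answered = new Set<string>();
  for (const item of items) {
    if (item.type === "function_call_output") answered.add(item.call_id);
  }
  const out: Item[] = [];
  for (const item of items) {
    out.push(item);
    if (item.type === "function_call" && !answered.has(item.call_id)) {
      // Pair the orphaned call with a synthetic output right after it.
      out.push({ type: "function_call_output", call_id: item.call_id, output: "aborted" });
    }
  }
  return out;
}
```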

OpenAI-compatible Chat Completions

  • Historical assistant thinking/reasoning blocks are stripped before replay so local and proxy-style OpenAI-compatible servers do not receive prior-turn reasoning fields such as reasoning or reasoning_content.
  • Current same-turn tool-call continuations keep the assistant reasoning block attached to the tool call until the tool result has been replayed.
  • Provider-owned exceptions can opt out when their wire protocol requires replayed reasoning metadata.
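
The first two rules can be approximated as follows. This sketch simplifies "current same-turn tool-call continuation" to "assistant tool-call messages after the last user message", which is an assumption, and it only handles the `reasoning_content` field.

```typescript
// Simplified Chat Completions-style message shape.
interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string | null;
  reasoning_content?: string;
  tool_calls?: { id: string }[];
}

function stripHistoricalReasoning(messages: ChatMessage[]): ChatMessage[] {
  let lastUserIndex = -1;
  messages.forEach((m, i) => {
    if (m.role === "user") lastUserIndex = i;
  });
  return messages.map((m, i) => {
    const isCurrentToolContinuation =
      i > lastUserIndex && m.role === "assistant" && (m.tool_calls?.length ?? 0) > 0;
    if (m.role !== "assistant" || isCurrentToolContinuation) return m;
    // Historical assistant turn: replay it without the reasoning field.
    const copy = { ...m };
    delete copy.reasoning_content;
    return copy;
  });
}
```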

Google (Generative AI / Gemini CLI / Antigravity)

  • Tool call id sanitization: strict alphanumeric.
  • Tool result pairing repair and synthetic tool results.
  • Turn validation (Gemini-style turn alternation).
  • Google turn ordering fixup (prepend a tiny user bootstrap if history starts with assistant).
  • Antigravity Claude: normalize thinking signatures; drop unsigned thinking blocks.
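
The turn ordering fixup can be sketched in a few lines. The `Turn` shape and the bootstrap text are hypothetical; the source only says a tiny user bootstrap is prepended when history starts with an assistant turn.

```typescript
interface Turn {
  role: "user" | "assistant";
  text: string;
}

const BOOTSTRAP_TEXT = "(session start)"; // hypothetical bootstrap text

// Prepends a small user turn when the replay history would start with the assistant.
function ensureStartsWithUser(history: Turn[]): Turn[] {
  const first = history[0];
  if (first && first.role === "assistant") {
    return [{ role: "user", text: BOOTSTRAP_TEXT }, ...history];
  }
  return history;
}
```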

Anthropic / Minimax (Anthropic-compatible)

  • Tool result pairing repair and synthetic tool results.
  • Turn validation (merge consecutive user turns to satisfy strict alternation).
  • Trailing assistant prefill turns are stripped from outgoing Anthropic Messages payloads when thinking is enabled, including Cloudflare AI Gateway routes.
  • Thinking blocks with missing, empty, or blank replay signatures are stripped before provider conversion. If that empties an assistant turn, OpenClaw keeps turn shape with non-empty omitted-reasoning text.
  • Older thinking-only assistant turns that must be stripped are replaced with non-empty omitted-reasoning text so provider adapters do not drop the replay turn.
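
A minimal sketch of the thinking-signature rule from the last two items. The block shape follows the general form of Anthropic thinking blocks (`thinking` plus `signature`); the placeholder text and function name are assumptions.

```typescript
type Block =
  | { type: "text"; text: string }
  | { type: "thinking"; thinking: string; signature?: string };

interface AssistantTurn {
  role: "assistant";
  content: Block[];
}

const OMITTED_REASONING = "[reasoning omitted]"; // hypothetical placeholder text

function stripUnsignedThinking(turn: AssistantTurn): AssistantTurn {
  // Drop thinking blocks whose signature is missing, empty, or blank.
  const content = turn.content.filter(
    (b) => b.type !== "thinking" || (b.signature ?? "").trim() !== "",
  );
  if (content.length === 0) {
    // Keep the turn in place so alternation and replay shape are preserved.
    return { ...turn, content: [{ type: "text", text: OMITTED_REASONING }] };
  }
  return { ...turn, content };
}
```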

Amazon Bedrock (Converse API)

  • Empty assistant stream-error turns are repaired to a non-empty fallback text block before replay. Bedrock Converse rejects assistant messages with content: [], so persisted assistant turns with stopReason: "error" and empty content are also repaired on disk before load.
  • Assistant stream-error turns that contain only blank text blocks are dropped from the in-memory replay copy instead of replaying an invalid blank block.
  • Claude thinking blocks with missing, empty, or blank replay signatures are stripped before Converse replay. If that empties an assistant turn, OpenClaw keeps turn shape with non-empty omitted-reasoning text.
  • Older thinking-only assistant turns that must be stripped are replaced with non-empty omitted-reasoning text so the Converse replay keeps strict turn shape.
  • Replay filters OpenClaw delivery-mirror and gateway-injected assistant turns.
  • Image sanitization applies through the global rule.
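
The first two items (empty versus blank-only assistant error turns) can be sketched for the in-memory replay copy as follows. The turn shape and fallback text are hypothetical, and the on-disk repair is not shown.

```typescript
interface TextBlock {
  type: "text";
  text: string;
}

interface Turn {
  role: "user" | "assistant";
  content: TextBlock[];
  stopReason?: string;
}

const ERROR_FALLBACK = "[assistant turn failed before producing content]"; // hypothetical

function repairErrorTurnsForReplay(turns: Turn[]): Turn[] {
  const out: Turn[] = [];
  for (const turn of turns) {
    const isErrorTurn = turn.role === "assistant" && turn.stopReason === "error";
    if (!isErrorTurn) {
      out.push(turn);
    } else if (turn.content.length === 0) {
      // content: [] is rejected, so give the turn a non-empty text block.
      out.push({ ...turn, content: [{ type: "text", text: ERROR_FALLBACK }] });
    } else if (turn.content.every((b) => b.text.trim() === "")) {
      // Only blank text blocks: drop the turn from the replay copy.
    } else {
      out.push(turn);
    }
  }
  return out;
}
```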

Mistral (including model-id based detection)

  • Tool call id sanitization: strict9 (alphanumeric length 9).
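
One way to produce a deterministic 9-character alphanumeric id is sketched below. How OpenClaw actually derives strict9 ids is not stated here; the FNV-1a fallback for ids that are not already 9 alphanumeric characters is an assumption chosen so the same input always maps to the same output, which keeps tool calls and tool results paired.

```typescript
function toStrict9(id: string): string {
  const alnum = id.replace(/[^a-zA-Z0-9]/g, "");
  if (alnum.length === 9) return alnum; // already valid after stripping
  // FNV-1a 32-bit hash over the original id (assumed fallback strategy).
  let hash = 0x811c9dc5;
  for (let i = 0; i < id.length; i++) {
    hash ^= id.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  // Base36 is alphanumeric; left-pad with zeros to exactly 9 characters.
  return ("000000000" + hash.toString(36)).slice(-9);
}
```

A 32-bit hash can collide, so a production version would likely need a per-transcript uniqueness check on top of this.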

OpenRouter Gemini

  • Thought signature cleanup: strip non-base64 thought_signature values (keep base64).
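
A minimal sketch of that filter, assuming standard padded base64 (not base64url) and a hypothetical `Part` shape; the exact validation OpenClaw applies is not specified here.

```typescript
interface Part {
  text?: string;
  thought_signature?: string;
}

// Standard base64 alphabet with optional padding, length a multiple of 4 (assumption).
function isBase64(value: string): boolean {
  return (
    value.length > 0 &&
    value.length % 4 === 0 &&
    /^[A-Za-z0-9+/]+={0,2}$/.test(value)
  );
}

function cleanThoughtSignature(part: Part): Part {
  if (part.thought_signature === undefined || isBase64(part.thought_signature)) {
    return part; // nothing to strip
  }
  const copy = { ...part };
  delete copy.thought_signature;
  return copy;
}
```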

OpenRouter Anthropic

  • Trailing assistant prefill turns are stripped from verified OpenRouter OpenAI-compatible Anthropic model payloads when reasoning is enabled, matching direct Anthropic and Cloudflare Anthropic replay behavior.
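
Prefill stripping can be sketched as trimming trailing assistant messages from the outbound copy only. The message shape and the `thinkingEnabled` flag are hypothetical names; the stored transcript is left untouched.

```typescript
interface Msg {
  role: "user" | "assistant";
  content: string;
}

// Returns the outbound payload messages; the input array is never modified.
function stripTrailingAssistantPrefill(messages: Msg[], thinkingEnabled: boolean): Msg[] {
  if (!thinkingEnabled) return messages;
  let end = messages.length;
  while (end > 0 && messages[end - 1]?.role === "assistant") {
    end -= 1;
  }
  return messages.slice(0, end);
}
```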

Everything else

  • Image sanitization only.

Historical behavior (pre-2026.1.22)

Before the 2026.1.22 release, OpenClaw applied multiple layers of transcript hygiene:

  • A transcript-sanitize extension ran on every context build and could:
    • Repair tool use/result pairing.
    • Sanitize tool call ids (including a non-strict mode that preserved _/-).
  • The runner also performed provider-specific sanitization, which duplicated work.
  • Additional mutations occurred outside the provider policy, including:
    • Stripping <final> tags from assistant text before persistence.
    • Dropping empty assistant error turns.
    • Trimming assistant content after tool calls.

This complexity caused cross-provider regressions (notably openai-responses call_id|fc_id pairing). The 2026.1.22 cleanup removed the extension, centralized logic in the runner, and made OpenAI no-touch beyond image sanitization.