---
summary: "Reference: provider-specific transcript sanitization and repair rules"
read_when:
  - You are debugging provider request rejections tied to transcript shape
  - You are changing transcript sanitization or tool-call repair logic
  - You are investigating tool-call id mismatches across providers
title: "Transcript hygiene"
---

OpenClaw applies **provider-specific fixes** to transcripts before a run, while it builds model context. Most of these are **in-memory** adjustments that satisfy strict provider requirements. A separate transcript-state repair pass may also normalize stored SQLite transcript rows before load, but only for malformed entries or persisted turns that are invalid durable records. Delivered assistant replies are preserved in the transcript store; provider-specific assistant-prefill stripping happens only while constructing outbound payloads.

Scope includes:

- Runtime-only prompt context staying out of user-visible transcript turns
- Tool call id sanitization
- Tool call input validation
- Tool result pairing repair
- Turn validation / ordering
- Thought signature cleanup
- Thinking signature cleanup
- Image payload sanitization
- Blank text-block cleanup before provider replay
- User-input provenance tagging (for inter-session routed prompts)
- Empty assistant error-turn repair for Bedrock Converse replay

If you need transcript storage details, see:

- [Session management deep dive](/reference/session-management-compaction)

---

## Global rule: runtime context is not user transcript

Runtime/system context can be added to the model prompt for a turn, but it is
not end-user-authored content. OpenClaw keeps a separate transcript-facing
prompt body for Gateway replies, queued followups, ACP, CLI, and embedded Pi
runs. Stored visible user turns use that transcript body instead of the
runtime-enriched prompt.

For legacy sessions that already persisted runtime wrappers, Gateway history
surfaces apply a display projection before returning messages to WebChat,
TUI, REST, or SSE clients.

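The split between the model-facing prompt and the stored user turn can be pictured with a minimal sketch. The `PreparedPrompt` shape, the `preparePrompt` helper, and the way runtime context is joined are all hypothetical and exist only to illustrate the rule; they are not OpenClaw's actual types or functions.

```typescript
// Hypothetical shapes for illustration; these are not OpenClaw's real types.
interface PreparedPrompt {
  /** Text sent to the model for this turn, including runtime context. */
  modelPrompt: string;
  /** Text persisted as the visible user turn. */
  transcriptBody: string;
}

function preparePrompt(userText: string, runtimeContext: string[]): PreparedPrompt {
  const prefix = runtimeContext.length > 0 ? runtimeContext.join("\n") + "\n\n" : "";
  return {
    modelPrompt: prefix + userText, // runtime-enriched, never stored as a user turn
    transcriptBody: userText, // what history surfaces should show
  };
}

const prepared = preparePrompt("What is on my calendar?", ["[runtime] timezone=UTC"]);
console.log(prepared.transcriptBody);
```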
---

## Where this runs

All transcript hygiene is centralized in the embedded runner:

- Policy selection: `src/agents/transcript-policy.ts`
- Sanitization/repair application: `sanitizeSessionHistory` in `src/agents/pi-embedded-runner/replay-history.ts`

The policy uses `provider`, `modelApi`, and `modelId` to decide what to apply.

Separate from transcript hygiene, SQLite transcript rows are normalized before load:

- `repairTranscriptStateIfNeeded` in `src/agents/transcript-state-repair.ts`
- Called from `run/attempt.ts` and `compact.ts` (embedded runner)

---

## Global rule: image sanitization

Image payloads are always sanitized to prevent provider-side rejection due to size
limits (downscale/recompress oversized base64 images).

This also helps control image-driven token pressure for vision-capable models.
Lower max dimensions generally reduce token usage; higher dimensions preserve detail.

Implementation:

- `sanitizeSessionMessagesImages` in `src/agents/pi-embedded-helpers/images.ts`
- `sanitizeContentBlocksImages` in `src/agents/tool-images.ts`
- Max image side is configurable via `agents.defaults.imageMaxDimensionPx` (default: `1200`).
- Blank text blocks are removed while this pass walks replay content. Assistant
  turns that become empty are dropped from the replay copy; user and tool-result
  turns that become empty receive a non-empty omitted-content placeholder.

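As a rough illustration of the max-side limit, the sketch below computes target dimensions for an oversized image while keeping its aspect ratio. Only the `1200` default comes from the text above; the `fitWithinMaxSide` helper and the `Size` type are hypothetical, and the real sanitizers also recompress the payload, which is not shown here.

```typescript
// Hypothetical helper; only the 1200 default is taken from the documentation.
const DEFAULT_IMAGE_MAX_DIMENSION_PX = 1200;

interface Size {
  width: number;
  height: number;
}

function fitWithinMaxSide(size: Size, maxSide: number = DEFAULT_IMAGE_MAX_DIMENSION_PX): Size {
  const longest = Math.max(size.width, size.height);
  if (longest <= maxSide) return size; // already within the limit
  const scale = maxSide / longest;
  return {
    width: Math.max(1, Math.round(size.width * scale)),
    height: Math.max(1, Math.round(size.height * scale)),
  };
}

console.log(fitWithinMaxSide({ width: 4000, height: 3000 }));
```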
---

## Global rule: malformed tool calls

Assistant tool-call blocks that are missing both `input` and `arguments` are dropped
before model context is built. This prevents provider rejections from partially
persisted tool calls (for example, after a rate limit failure).

Implementation:

- `sanitizeToolCallInputs` in `src/agents/session-transcript-repair.ts`
- Applied in `sanitizeSessionHistory` in `src/agents/pi-embedded-runner/replay-history.ts`

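The drop rule can be sketched as a simple filter. The `Block` union and `dropMalformedToolCalls` below are hypothetical stand-ins, not the real `sanitizeToolCallInputs` signature, and the sketch assumes "missing" means the property is absent or `undefined`.

```typescript
// Hypothetical block shapes; the real transcript types may differ.
type Block =
  | { type: "text"; text: string }
  | { type: "toolCall"; id: string; name: string; input?: unknown; arguments?: unknown };

function dropMalformedToolCalls(blocks: Block[]): Block[] {
  return blocks.filter((block) => {
    if (block.type !== "toolCall") return true; // non-tool blocks pass through
    // Keep the call if at least one of the two payload fields is present.
    return block.input !== undefined || block.arguments !== undefined;
  });
}

const kept = dropMalformedToolCalls([
  { type: "text", text: "checking the file" },
  { type: "toolCall", id: "a", name: "read", input: { path: "notes.md" } },
  { type: "toolCall", id: "b", name: "read" }, // partially persisted: dropped
]);
console.log(kept.length);
```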
---

## Global rule: inter-session input provenance

When an agent sends a prompt into another session via `sessions_send` (including
agent-to-agent reply/announce steps), OpenClaw persists the created user turn with:

- `message.provenance.kind = "inter_session"`

OpenClaw also prepends a same-turn `[Inter-session message ... isUser=false]`
marker before the routed prompt text so the active model call can distinguish
foreign session output from external end-user instructions. This marker includes
the source session, channel, and tool when available. The transcript still uses
`role: "user"` for provider compatibility, but the visible text and provenance
metadata both mark the turn as inter-session data.

During context rebuild, OpenClaw applies the same marker to older persisted
inter-session user turns that only have provenance metadata.

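The rebuild-time behavior for older turns can be sketched as an idempotent "add the marker if it is missing" step. The `UserTurn` shape and `ensureInterSessionMarker` are hypothetical, and the marker text is supplied by the caller because the exact marker fields are not reproduced here; the sketch only assumes that a marker starts with `[Inter-session message`.

```typescript
// Hypothetical turn shape; only provenance.kind and the marker prefix come from the text.
interface UserTurn {
  role: "user";
  text: string;
  provenance?: { kind: string };
}

const MARKER_PREFIX = "[Inter-session message";

function ensureInterSessionMarker(
  turn: UserTurn,
  renderMarker: (turn: UserTurn) => string
): UserTurn {
  const isInterSession =
    turn.provenance !== undefined && turn.provenance.kind === "inter_session";
  if (!isInterSession) return turn; // ordinary end-user input stays untouched
  if (turn.text.indexOf(MARKER_PREFIX) === 0) return turn; // marker already present
  return {
    role: turn.role,
    text: renderMarker(turn) + "\n" + turn.text,
    provenance: turn.provenance,
  };
}
```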
---

## Provider matrix (current behavior)

**OpenAI / OpenAI Codex**

- Image sanitization only.
- Drop orphaned reasoning signatures (standalone reasoning items without a following content block) for OpenAI Responses/Codex transcripts, and drop replayable OpenAI reasoning after a model route switch.
- Preserve replayable OpenAI Responses reasoning item payloads, including encrypted empty-summary items, so manual/WebSocket replay keeps required `rs_*` state paired with assistant output items.
- Native ChatGPT Codex Responses follows Codex wire parity by replaying prior Responses reasoning/message/function payloads without prior item IDs while preserving session `prompt_cache_key`.
- No tool call id sanitization.
- Tool result pairing repair may move real matched outputs and synthesize Codex-style `aborted` outputs for missing tool calls.
- No turn validation or reordering.
- Missing OpenAI Responses-family tool outputs are synthesized as `aborted` to match Codex replay normalization.
- No thought signature stripping.

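To make the `aborted` synthesis concrete, here is a minimal sketch over simplified items loosely modeled on Responses-style `function_call` / `function_call_output` pairs. The `ReplayItem` type and `synthesizeMissingOutputs` are hypothetical, placing `aborted` in the `output` field is an assumption, and the sketch does not attempt the "move real matched outputs" part of the repair.

```typescript
// Simplified, hypothetical item shapes for illustration only.
type ReplayItem =
  | { type: "message"; role: "user" | "assistant"; text: string }
  | { type: "function_call"; call_id: string; name: string }
  | { type: "function_call_output"; call_id: string; output: string };

function synthesizeMissingOutputs(items: ReplayItem[]): ReplayItem[] {
  // First pass: record which call ids already have an output.
  const answered: { [callId: string]: boolean } = {};
  for (const item of items) {
    if (item.type === "function_call_output") answered[item.call_id] = true;
  }
  // Second pass: follow every unanswered call with a synthetic output.
  const out: ReplayItem[] = [];
  for (const item of items) {
    out.push(item);
    if (item.type === "function_call" && answered[item.call_id] !== true) {
      out.push({ type: "function_call_output", call_id: item.call_id, output: "aborted" });
    }
  }
  return out;
}
```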
**OpenAI-compatible Chat Completions**

- Historical assistant thinking/reasoning blocks are stripped before replay so
  local and proxy-style OpenAI-compatible servers do not receive prior-turn
  reasoning fields such as `reasoning` or `reasoning_content`.
- Current same-turn tool-call continuations keep the assistant reasoning block
  attached to the tool call until the tool result has been replayed.
- Provider-owned exceptions can opt out when their wire protocol requires
  replayed reasoning metadata.

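One way to picture the historical-versus-current distinction is sketched below. It assumes that "current turn" means everything after the last user message, which is an interpretation rather than something the rules above state; the `ChatMessage` shape and `stripHistoricalReasoning` are hypothetical.

```typescript
// Hypothetical Chat Completions-style message; only the two reasoning field names
// come from the documentation.
interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
  reasoning?: string;
  reasoning_content?: string;
}

function stripHistoricalReasoning(messages: ChatMessage[]): ChatMessage[] {
  // Assumption: messages after the last user message belong to the current turn.
  let lastUserIndex = -1;
  messages.forEach((message, index) => {
    if (message.role === "user") lastUserIndex = index;
  });
  return messages.map((message, index) => {
    if (message.role !== "assistant" || index > lastUserIndex) return message;
    const copy: ChatMessage = { ...message };
    delete copy.reasoning;
    delete copy.reasoning_content;
    return copy;
  });
}
```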
**Google (Generative AI / Gemini CLI / Antigravity)**

- Tool call id sanitization: strict alphanumeric.
- Tool result pairing repair and synthetic tool results.
- Turn validation (Gemini-style turn alternation).
- Google turn ordering fixup (prepend a tiny user bootstrap if history starts with assistant).
- Antigravity Claude: normalize thinking signatures; drop unsigned thinking blocks.

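Two of the Google rules are easy to sketch: strict alphanumeric ids and the turn ordering fixup. The helper names, the `Turn` shape, and the bootstrap text below are hypothetical; in particular, the real bootstrap wording is not given above, and this id sketch does not handle collisions or ids that become empty after stripping.

```typescript
// Hypothetical turn shape and bootstrap text, for illustration only.
interface Turn {
  role: "user" | "assistant";
  text: string;
}

const BOOTSTRAP_TEXT = "(session start)"; // placeholder wording, not OpenClaw's

function toStrictAlphanumericId(id: string): string {
  return id.replace(/[^a-zA-Z0-9]/g, ""); // drop "_", "-", and anything else non-alphanumeric
}

function ensureStartsWithUser(turns: Turn[]): Turn[] {
  const first = turns[0];
  if (first !== undefined && first.role === "assistant") {
    const bootstrap: Turn = { role: "user", text: BOOTSTRAP_TEXT };
    return [bootstrap].concat(turns);
  }
  return turns;
}
```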
**Anthropic / Minimax (Anthropic-compatible)**

- Tool result pairing repair and synthetic tool results.
- Turn validation (merge consecutive user turns to satisfy strict alternation).
- Trailing assistant prefill turns are stripped from outgoing Anthropic Messages
  payloads when thinking is enabled, including Cloudflare AI Gateway routes.
- Thinking blocks with missing, empty, or blank replay signatures are stripped
  before provider conversion. If that empties an assistant turn, OpenClaw keeps
  turn shape with non-empty omitted-reasoning text.
- Older thinking-only assistant turns that must be stripped are replaced with
  non-empty omitted-reasoning text so provider adapters do not drop the replay
  turn.

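The turn-validation rule (merging consecutive user turns) can be sketched as a single pass over the history. The `Turn` shape, the `mergeConsecutiveUserTurns` name, and the blank-line separator are assumptions made for this example.

```typescript
// Hypothetical turn shape; real turns carry structured content blocks.
interface Turn {
  role: "user" | "assistant";
  text: string;
}

function mergeConsecutiveUserTurns(turns: Turn[]): Turn[] {
  const merged: Turn[] = [];
  for (const turn of turns) {
    const previous = merged.length > 0 ? merged[merged.length - 1] : undefined;
    if (previous !== undefined && previous.role === "user" && turn.role === "user") {
      // Fold this user turn into the previous one so roles alternate strictly.
      merged[merged.length - 1] = { role: "user", text: previous.text + "\n\n" + turn.text };
    } else {
      merged.push(turn);
    }
  }
  return merged;
}
```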
**Amazon Bedrock (Converse API)**

- Empty assistant stream-error turns are repaired to a non-empty fallback text block
  before replay. Bedrock Converse rejects assistant messages with `content: []`, so
  persisted assistant turns with `stopReason: "error"` and empty content are also
  repaired in the stored SQLite transcript rows before load.
- Assistant stream-error turns that contain only blank text blocks are dropped
  from the in-memory replay copy instead of replaying an invalid blank block.
- Claude thinking blocks with missing, empty, or blank replay signatures are
  stripped before Converse replay. If that empties an assistant turn, OpenClaw
  keeps turn shape with non-empty omitted-reasoning text.
- Older thinking-only assistant turns that must be stripped are replaced with
  non-empty omitted-reasoning text so the Converse replay keeps strict turn shape.
- Replay filters OpenClaw delivery-mirror and gateway-injected assistant turns.
- Image sanitization applies through the global rule.

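The empty error-turn repair can be sketched as follows. The `AssistantTurn` shape, the `repairEmptyErrorTurn` name, and the fallback wording are hypothetical; only the `stopReason: "error"` plus `content: []` condition comes from the rule above.

```typescript
// Hypothetical shapes; the fallback wording is a placeholder, not OpenClaw's text.
interface TextBlock {
  type: "text";
  text: string;
}

interface AssistantTurn {
  role: "assistant";
  stopReason?: string;
  content: TextBlock[];
}

const FALLBACK_TEXT = "(assistant turn ended with an error before producing output)";

function repairEmptyErrorTurn(turn: AssistantTurn): AssistantTurn {
  if (turn.stopReason === "error" && turn.content.length === 0) {
    const fallback: TextBlock = { type: "text", text: FALLBACK_TEXT };
    return { ...turn, content: [fallback] }; // never replay content: []
  }
  return turn;
}
```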
**Mistral (including model-id based detection)**

- Tool call id sanitization: `strict9` (alphanumeric, length 9).

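A `strict9`-style id has to be alphanumeric and exactly nine characters long. The sketch below shows one deterministic way to get there (strip, then truncate or zero-pad); this scheme and the `toStrict9Id` name are assumptions, and the real sanitizer may derive ids differently, for example to avoid collisions. Determinism matters because the same mapping must be applied to a tool call and to its matching result.

```typescript
// Hypothetical derivation: strip non-alphanumerics, then truncate or zero-pad to 9.
function toStrict9Id(id: string): string {
  const cleaned = id.replace(/[^a-zA-Z0-9]/g, "");
  return (cleaned + "000000000").slice(0, 9);
}

console.log(toStrict9Id("call_abc-123XYZ"));
```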
**OpenRouter Gemini**

- Thought signature cleanup: strip non-base64 `thought_signature` values (keep base64).

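The keep-or-strip decision can be sketched with a plain base64 check. The `Part` shape and helper names are hypothetical, and the check assumes the standard padded base64 alphabet; the real cleanup may accept other variants.

```typescript
// Assumes standard, padded base64; URL-safe or unpadded variants would be rejected here.
const BASE64_PATTERN = /^[A-Za-z0-9+\/]+={0,2}$/;

function isBase64(value: string): boolean {
  return value.length > 0 && value.length % 4 === 0 && BASE64_PATTERN.test(value);
}

// Hypothetical part shape; only the thought_signature field name comes from the text.
interface Part {
  text?: string;
  thought_signature?: string;
}

function cleanThoughtSignature(part: Part): Part {
  if (part.thought_signature === undefined || isBase64(part.thought_signature)) {
    return part; // nothing to strip
  }
  const copy: Part = { ...part };
  delete copy.thought_signature;
  return copy;
}
```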
**OpenRouter Anthropic**

- Trailing assistant prefill turns are stripped from verified OpenRouter
  OpenAI-compatible Anthropic model payloads when reasoning is enabled, matching
  direct Anthropic and Cloudflare Anthropic replay behavior.

**Everything else**

- Image sanitization only.

---

## Historical behavior (pre-2026.1.22)

Before the 2026.1.22 release, OpenClaw applied multiple layers of transcript hygiene:

- A **transcript-sanitize extension** ran on every context build and could:
  - Repair tool use/result pairing.
  - Sanitize tool call ids (including a non-strict mode that preserved `_`/`-`).
- The runner also performed provider-specific sanitization, which duplicated work.
- Additional mutations occurred outside the provider policy, including:
  - Stripping `<final>` tags from assistant text before persistence.
  - Dropping empty assistant error turns.
  - Trimming assistant content after tool calls.

This complexity caused cross-provider regressions (notably `openai-responses`
`call_id|fc_id` pairing). The 2026.1.22 cleanup removed the extension, centralized
logic in the runner, and made OpenAI **no-touch** beyond image sanitization.

## Related

- [Session management](/concepts/session)
- [Session pruning](/concepts/session-pruning)