docs: refresh transcript sanitization mirrors

2026-04-10 00:31:22 +00:00 · 2026-04-04 21:52:10 +01:00
parent de918c282c
commit 291afbbb95
9 changed files with 28 additions and 15 deletions
--- a/docs/concepts/multi-agent.md
+++ b/docs/concepts/multi-agent.md
@@ -26,7 +26,8 @@ Auth profiles are **per-agent**. Each agent reads from its own:
 `sessions_history` is the safer cross-session recall path here too: it returns
 a bounded, sanitized view, not a raw transcript dump. Assistant recall strips
 thinking tags, `<relevant-memories>` scaffolding, plain-text tool-call XML
-payloads, downgraded tool-call scaffolding, leaked model control tokens, and
+payloads (including `<tool_calls>` and truncated tool-call blocks), downgraded
+tool-call scaffolding, leaked ASCII/full-width model control tokens, and
 malformed MiniMax tool-call XML before redaction/truncation.

 Main agent credentials are **not** shared automatically. Never reuse `agentDir`
--- a/docs/concepts/session-tool.md
+++ b/docs/concepts/session-tool.md
@@ -37,10 +37,13 @@ The returned view is intentionally bounded and safety-filtered:
 - assistant text is normalized before recall:
   - thinking tags are stripped
   - `<relevant-memories>` / `<relevant_memories>` scaffolding blocks are stripped
-  - plain-text tool-call XML payload blocks such as `<tool_call>...</tool_call>` / `<function_calls>...</function_calls>` are stripped
+  - plain-text tool-call XML payload blocks such as `<tool_call>...</tool_call>`,
+    `<tool_calls>...</tool_calls>`, and `<function_calls>...</function_calls>`
+    are stripped, including truncated payloads that never close cleanly
   - downgraded tool-call/result scaffolding such as `[Tool Call: ...]`,
     `[Tool Result ...]`, and `[Historical context ...]` is stripped
-  - leaked model control tokens such as `<|assistant|>` / `<｜...｜>` are stripped
+  - leaked model control tokens such as `<|assistant|>`, other ASCII
+    `<|...|>` tokens, and full-width `<｜...｜>` variants are stripped
   - malformed MiniMax tool-call XML such as `<invoke ...>` /
     `</minimax:tool_call>` is stripped
 - credential/token-like text is redacted before it is returned
--- a/docs/platforms/android.md
+++ b/docs/platforms/android.md
@@ -138,8 +138,10 @@ Pairing details: [Pairing](/channels/pairing).
 The Android Chat tab supports session selection (default `main`, plus other existing sessions):

 - History: `chat.history` (display-normalized; inline directive tags are
-  stripped from visible text, pure `NO_REPLY` assistant rows are omitted, and
-  oversized rows can be replaced with placeholders)
+  stripped from visible text, plain-text tool-call XML payloads and leaked
+  ASCII/full-width model control tokens are stripped, pure `NO_REPLY`
+  assistant rows are omitted, and oversized rows can be replaced with
+  placeholders)
 - Send: `chat.send`
 - Push updates (best-effort): `chat.subscribe` → `event:"chat"`

--- a/docs/platforms/mac/webchat.md
+++ b/docs/platforms/mac/webchat.md
@@ -31,8 +31,10 @@ agent (with a session switcher for other sessions).
 - Data plane: Gateway WS methods `chat.history`, `chat.send`, `chat.abort`,
   `chat.inject` and events `chat`, `agent`, `presence`, `tick`, `health`.
 - `chat.history` returns display-normalized transcript rows: inline directive
-  tags are stripped from visible text, pure `NO_REPLY` assistant rows are
-  omitted, and oversized rows can be replaced with placeholders.
+  tags are stripped from visible text, plain-text tool-call XML payloads and
+  leaked ASCII/full-width model control tokens are stripped, pure `NO_REPLY`
+  assistant rows are omitted, and oversized rows can be replaced with
+  placeholders.
 - Session: defaults to the primary session (`main`, or `global` when scope is
   global). The UI can switch between sessions.
 - Onboarding uses a dedicated session to keep first‑run setup separate.
--- a/docs/tools/index.md
+++ b/docs/tools/index.md
@@ -149,7 +149,8 @@ Use `group:*` shorthands in allow/deny lists:

 `sessions_history` returns a bounded, safety-filtered recall view. It strips
 thinking tags, `<relevant-memories>` scaffolding, plain-text tool-call XML
-payloads, downgraded tool-call scaffolding, leaked model control tokens, and
+payloads (including `<tool_calls>` and truncated tool-call blocks), downgraded
+tool-call scaffolding, leaked ASCII/full-width model control tokens, and
 malformed MiniMax tool-call XML from assistant text, then applies
 redaction/truncation and possible oversized-row placeholders instead of acting
 as a raw transcript dump.
--- a/docs/tools/multi-agent-sandbox-tools.md
+++ b/docs/tools/multi-agent-sandbox-tools.md
@@ -297,9 +297,10 @@ Legacy `agent.*` configs are migrated by `openclaw doctor`; prefer `agents.defau

 `sessions_history` in this profile still returns a bounded, sanitized recall
 view rather than a raw transcript dump. Assistant recall strips thinking tags,
-`<relevant-memories>` scaffolding, plain-text tool-call XML payloads,
-downgraded tool-call scaffolding, leaked model control tokens, and malformed
-MiniMax tool-call XML before redaction/truncation.
+`<relevant-memories>` scaffolding, plain-text tool-call XML payloads
+(including `<tool_calls>` and truncated tool-call blocks), downgraded
+tool-call scaffolding, leaked ASCII/full-width model control tokens, and
+malformed MiniMax tool-call XML before redaction/truncation.

 ---

--- a/docs/tools/subagents.md
+++ b/docs/tools/subagents.md
@@ -266,9 +266,12 @@ Announce payloads include a stats line at the end (even when wrapped):
 - assistant recall is normalized first:
   - thinking tags are stripped
   - `<relevant-memories>` / `<relevant_memories>` scaffolding blocks are stripped
-  - plain-text tool-call XML payload blocks such as `<tool_call>...</tool_call>` / `<function_calls>...</function_calls>` are stripped
+  - plain-text tool-call XML payload blocks such as `<tool_call>...</tool_call>`,
+    `<tool_calls>...</tool_calls>`, and `<function_calls>...</function_calls>`
+    are stripped, including truncated payloads that never close cleanly
   - downgraded tool-call/result scaffolding and historical-context markers are stripped
-  - leaked model control tokens such as `<|assistant|>` are stripped
+  - leaked model control tokens such as `<|assistant|>`, other ASCII
+    `<|...|>` tokens, and full-width `<｜...｜>` variants are stripped
   - malformed MiniMax tool-call XML is stripped
 - credential/token-like text is redacted
 - long blocks can be truncated
--- a/docs/web/control-ui.md
+++ b/docs/web/control-ui.md
@@ -123,7 +123,7 @@ Cron jobs panel notes:
 - `chat.send` is **non-blocking**: it acks immediately with `{ runId, status: "started" }` and the response streams via `chat` events.
 - Re-sending with the same `idempotencyKey` returns `{ status: "in_flight" }` while running, and `{ status: "ok" }` after completion.
 - `chat.history` responses are size-bounded for UI safety. When transcript entries are too large, Gateway may truncate long text fields, omit heavy metadata blocks, and replace oversized messages with a placeholder (`[chat.history omitted: message too large]`).
-- `chat.history` also strips display-only inline directive tags from visible assistant text (for example `[[reply_to_*]]` and `[[audio_as_voice]]`) and omits assistant entries whose whole visible text is only `NO_REPLY`.
+- `chat.history` also strips display-only inline directive tags from visible assistant text (for example `[[reply_to_*]]` and `[[audio_as_voice]]`), plain-text tool-call XML payloads (including `<tool_calls>` and truncated tool-call blocks), and leaked ASCII/full-width model control tokens, and omits assistant entries whose whole visible text is only `NO_REPLY`.
 - `chat.inject` appends an assistant note to the session transcript and broadcasts a `chat` event for UI-only updates (no agent run, no channel delivery).
 - The chat header model and thinking pickers patch the active session immediately through `sessions.patch`; they are persistent session overrides, not one-turn-only send options.
 - Stop:
--- a/docs/web/webchat.md
+++ b/docs/web/webchat.md
@@ -26,7 +26,7 @@ Status: the macOS/iOS SwiftUI chat UI talks directly to the Gateway WebSocket.

 - The UI connects to the Gateway WebSocket and uses `chat.history`, `chat.send`, and `chat.inject`.
 - `chat.history` is bounded for stability: Gateway may truncate long text fields, omit heavy metadata, and replace oversized entries with `[chat.history omitted: message too large]`.
-- `chat.history` is also display-normalized: inline delivery directive tags such as `[[reply_to_*]]` and `[[audio_as_voice]]` are stripped from visible text, and assistant entries whose whole visible text is only `NO_REPLY` are omitted.
+- `chat.history` is also display-normalized: inline delivery directive tags such as `[[reply_to_*]]` and `[[audio_as_voice]]`, plain-text tool-call XML payloads (including `<tool_calls>` and truncated tool-call blocks), and leaked ASCII/full-width model control tokens are stripped from visible text, and assistant entries whose whole visible text is only `NO_REPLY` are omitted.
 - `chat.inject` appends an assistant note directly to the transcript and broadcasts it to the UI (no agent run).
 - Aborted runs can keep partial assistant output visible in the UI.
 - Gateway persists aborted partial assistant text into transcript history when buffered output exists, and marks those entries with abort metadata.
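
For reviewers who want a concrete picture of the stripping rules these docs now describe, here is a minimal Python sketch. It is not the project's implementation: the function name `sanitize_assistant_text` and both regular expressions are assumptions made for illustration. It only covers the two behaviors this commit adds to the docs: tool-call XML blocks (`<tool_call>`, `<tool_calls>`, `<function_calls>`), including truncated payloads that never close, and leaked ASCII `<|...|>` or full-width `<｜...｜>` control tokens.

```python
import re

# Hypothetical pattern: a tool-call block that either closes with its matching
# tag or, when truncated, runs to the end of the text.
_TOOL_BLOCK = re.compile(
    r"<(?P<tag>tool_calls?|function_calls)\b[^>]*>.*?(?:</(?P=tag)\s*>|\Z)",
    re.DOTALL,
)

# Hypothetical pattern: ASCII <|...|> tokens and full-width variants that use
# U+FF5C (FULLWIDTH VERTICAL LINE) in place of "|".
_CONTROL_TOKEN = re.compile(r"<(?:\|[^|<>]*\||\uFF5C[^\uFF5C<>]*\uFF5C)>")


def sanitize_assistant_text(text: str) -> str:
    """Strip tool-call XML payloads and leaked control tokens from text."""
    text = _TOOL_BLOCK.sub("", text)      # closed and truncated blocks
    text = _CONTROL_TOKEN.sub("", text)   # ASCII and full-width tokens
    return text.strip()


if __name__ == "__main__":
    samples = [
        'Hi <tool_call>{"a": 1}</tool_call> there',
        'Done.<tool_calls>[{"name": "x"',
        "<|assistant|>Hello",
        "Hello<\uFF5Cend\uFF5C>",
    ]
    for sample in samples:
        print(repr(sanitize_assistant_text(sample)))
```

Treating an unclosed opening tag as running to the end of the text is one plausible reading of "truncated payloads that never close cleanly"; the real filter may bound the stripped region differently, and it also handles thinking tags, memory scaffolding, and MiniMax XML, which this sketch leaves out.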