docs: refresh transcript sanitization mirrors

This commit is contained in:
Peter Steinberger
2026-04-04 21:52:10 +01:00
parent de918c282c
commit 291afbbb95
9 changed files with 28 additions and 15 deletions

View File

@@ -26,7 +26,8 @@ Auth profiles are **per-agent**. Each agent reads from its own:
`sessions_history` is the safer cross-session recall path here too: it returns
a bounded, sanitized view, not a raw transcript dump. Assistant recall strips
thinking tags, `<relevant-memories>` scaffolding, plain-text tool-call XML
payloads, downgraded tool-call scaffolding, leaked model control tokens, and
payloads (including `<tool_calls>` and truncated tool-call blocks), downgraded
tool-call scaffolding, leaked ASCII/full-width model control tokens, and
malformed MiniMax tool-call XML before redaction/truncation.
Main agent credentials are **not** shared automatically. Never reuse `agentDir`

View File

@@ -37,10 +37,13 @@ The returned view is intentionally bounded and safety-filtered:
- assistant text is normalized before recall:
- thinking tags are stripped
- `<relevant-memories>` / `<relevant_memories>` scaffolding blocks are stripped
- plain-text tool-call XML payload blocks such as `<tool_call>...</tool_call>` / `<function_calls>...</function_calls>` are stripped
- plain-text tool-call XML payload blocks such as `<tool_call>...</tool_call>`,
`<tool_calls>...</tool_calls>`, and `<function_calls>...</function_calls>`
are stripped, including truncated payloads that never close cleanly
- downgraded tool-call/result scaffolding such as `[Tool Call: ...]`,
`[Tool Result ...]`, and `[Historical context ...]` is stripped
- leaked model control tokens such as `<|assistant|>` / `<...>` are stripped
- leaked model control tokens such as `<|assistant|>`, other ASCII
`<|...|>` tokens, and full-width `<...>` variants are stripped
- malformed MiniMax tool-call XML such as `<invoke ...>` /
`</minimax:tool_call>` is stripped
- credential/token-like text is redacted before it is returned

View File

@@ -138,8 +138,10 @@ Pairing details: [Pairing](/channels/pairing).
The Android Chat tab supports session selection (default `main`, plus other existing sessions):
- History: `chat.history` (display-normalized; inline directive tags are
stripped from visible text, pure `NO_REPLY` assistant rows are omitted, and
oversized rows can be replaced with placeholders)
stripped from visible text, plain-text tool-call XML payloads and leaked
ASCII/full-width model control tokens are stripped, pure `NO_REPLY`
assistant rows are omitted, and oversized rows can be replaced with
placeholders)
- Send: `chat.send`
- Push updates (best-effort): `chat.subscribe` → `event:"chat"`

View File

@@ -31,8 +31,10 @@ agent (with a session switcher for other sessions).
- Data plane: Gateway WS methods `chat.history`, `chat.send`, `chat.abort`,
`chat.inject` and events `chat`, `agent`, `presence`, `tick`, `health`.
- `chat.history` returns display-normalized transcript rows: inline directive
tags are stripped from visible text, pure `NO_REPLY` assistant rows are
omitted, and oversized rows can be replaced with placeholders.
tags are stripped from visible text, plain-text tool-call XML payloads and
leaked ASCII/full-width model control tokens are stripped, pure `NO_REPLY`
assistant rows are omitted, and oversized rows can be replaced with
placeholders.
- Session: defaults to the primary session (`main`, or `global` when scope is
global). The UI can switch between sessions.
- Onboarding uses a dedicated session to keep firstrun setup separate.

View File

@@ -149,7 +149,8 @@ Use `group:*` shorthands in allow/deny lists:
`sessions_history` returns a bounded, safety-filtered recall view. It strips
thinking tags, `<relevant-memories>` scaffolding, plain-text tool-call XML
payloads, downgraded tool-call scaffolding, leaked model control tokens, and
payloads (including `<tool_calls>` and truncated tool-call blocks), downgraded
tool-call scaffolding, leaked ASCII/full-width model control tokens, and
malformed MiniMax tool-call XML from assistant text, then applies
redaction/truncation and possible oversized-row placeholders instead of acting
as a raw transcript dump.

View File

@@ -297,9 +297,10 @@ Legacy `agent.*` configs are migrated by `openclaw doctor`; prefer `agents.defau
`sessions_history` in this profile still returns a bounded, sanitized recall
view rather than a raw transcript dump. Assistant recall strips thinking tags,
`<relevant-memories>` scaffolding, plain-text tool-call XML payloads,
downgraded tool-call scaffolding, leaked model control tokens, and malformed
MiniMax tool-call XML before redaction/truncation.
`<relevant-memories>` scaffolding, plain-text tool-call XML payloads
(including `<tool_calls>` and truncated tool-call blocks), downgraded
tool-call scaffolding, leaked ASCII/full-width model control tokens, and
malformed MiniMax tool-call XML before redaction/truncation.
---

View File

@@ -266,9 +266,12 @@ Announce payloads include a stats line at the end (even when wrapped):
- assistant recall is normalized first:
- thinking tags are stripped
- `<relevant-memories>` / `<relevant_memories>` scaffolding blocks are stripped
- plain-text tool-call XML payload blocks such as `<tool_call>...</tool_call>` / `<function_calls>...</function_calls>` are stripped
- plain-text tool-call XML payload blocks such as `<tool_call>...</tool_call>`,
`<tool_calls>...</tool_calls>`, and `<function_calls>...</function_calls>`
are stripped, including truncated payloads that never close cleanly
- downgraded tool-call/result scaffolding and historical-context markers are stripped
- leaked model control tokens such as `<|assistant|>` are stripped
- leaked model control tokens such as `<|assistant|>`, other ASCII
`<|...|>` tokens, and full-width `<...>` variants are stripped
- malformed MiniMax tool-call XML is stripped
- credential/token-like text is redacted
- long blocks can be truncated

View File

@@ -123,7 +123,7 @@ Cron jobs panel notes:
- `chat.send` is **non-blocking**: it acks immediately with `{ runId, status: "started" }` and the response streams via `chat` events.
- Re-sending with the same `idempotencyKey` returns `{ status: "in_flight" }` while running, and `{ status: "ok" }` after completion.
- `chat.history` responses are size-bounded for UI safety. When transcript entries are too large, Gateway may truncate long text fields, omit heavy metadata blocks, and replace oversized messages with a placeholder (`[chat.history omitted: message too large]`).
- `chat.history` also strips display-only inline directive tags from visible assistant text (for example `[[reply_to_*]]` and `[[audio_as_voice]]`) and omits assistant entries whose whole visible text is only `NO_REPLY`.
- `chat.history` also strips display-only inline directive tags from visible assistant text (for example `[[reply_to_*]]` and `[[audio_as_voice]]`), plain-text tool-call XML payloads (including `<tool_calls>` and truncated tool-call blocks), and leaked ASCII/full-width model control tokens, and omits assistant entries whose whole visible text is only `NO_REPLY`.
- `chat.inject` appends an assistant note to the session transcript and broadcasts a `chat` event for UI-only updates (no agent run, no channel delivery).
- The chat header model and thinking pickers patch the active session immediately through `sessions.patch`; they are persistent session overrides, not one-turn-only send options.
- Stop:

View File

@@ -26,7 +26,7 @@ Status: the macOS/iOS SwiftUI chat UI talks directly to the Gateway WebSocket.
- The UI connects to the Gateway WebSocket and uses `chat.history`, `chat.send`, and `chat.inject`.
- `chat.history` is bounded for stability: Gateway may truncate long text fields, omit heavy metadata, and replace oversized entries with `[chat.history omitted: message too large]`.
- `chat.history` is also display-normalized: inline delivery directive tags such as `[[reply_to_*]]` and `[[audio_as_voice]]` are stripped from visible text, and assistant entries whose whole visible text is only `NO_REPLY` are omitted.
- `chat.history` is also display-normalized: inline delivery directive tags such as `[[reply_to_*]]` and `[[audio_as_voice]]`, plain-text tool-call XML payloads (including `<tool_calls>` and truncated tool-call blocks), and leaked ASCII/full-width model control tokens are stripped from visible text, and assistant entries whose whole visible text is only `NO_REPLY` are omitted.
- `chat.inject` appends an assistant note directly to the transcript and broadcasts it to the UI (no agent run).
- Aborted runs can keep partial assistant output visible in the UI.
- Gateway persists aborted partial assistant text into transcript history when buffered output exists, and marks those entries with abort metadata.