From 2ab8acb2c9ad549ced915fee2e4f1664231f48d8 Mon Sep 17 00:00:00 2001
From: Peter Steinberger <steipete@gmail.com>
Date: Sat, 4 Apr 2026 18:25:03 +0100
Subject: [PATCH] docs: refresh chat thinking and compaction refs

---
 docs/concepts/compaction.md                     |  5 +++++
 docs/concepts/typing-indicators.md              |  5 +++--
 docs/reference/session-management-compaction.md | 17 ++++++++++++++++-
 docs/tools/thinking.md                          |  8 ++++++--
 docs/web/control-ui.md                          |  3 ++-
 5 files changed, 32 insertions(+), 6 deletions(-)
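
Note: the tool-pairing rule documented in this patch (a split point that
lands between an assistant tool call and its `toolResult` is moved back to the
tool-call message) can be illustrated with a minimal sketch. The `Message`
shape and the `adjustSplitIndex` helper below are hypothetical names chosen for
illustration; they are assumptions, not OpenClaw's actual types or code, and
the sketch covers only the boundary shift, not token budgeting or
aborted/error blocks.

```typescript
// Hypothetical transcript entry; real OpenClaw types may differ.
type Role = "user" | "assistant" | "toolResult";

interface Message {
  role: Role;
  // IDs of tool calls issued by an assistant message (assumed field).
  toolCallIds?: string[];
  // ID of the tool call that a toolResult answers (assumed field).
  toolCallId?: string;
}

// `proposed` is the index of the first message of the next chunk.
// If that message is a toolResult, walk the boundary back until it sits on
// a non-toolResult message (normally the assistant tool-call message), so
// the call and its results end up in the same chunk.
function adjustSplitIndex(messages: Message[], proposed: number): number {
  let index = Math.max(0, Math.min(proposed, messages.length));
  while (index > 0 && messages[index]?.role === "toolResult") {
    index -= 1;
  }
  return index;
}

const transcript: Message[] = [
  { role: "user" },
  { role: "assistant", toolCallIds: ["a", "b"] },
  { role: "toolResult", toolCallId: "a" },
  { role: "toolResult", toolCallId: "b" },
  { role: "user" },
];

// A split proposed at index 3 would separate result "b" from its call,
// so the boundary moves back to the assistant message at index 1.
console.log(adjustSplitIndex(transcript, 3));
```

A split proposed at a `user` message is returned unchanged, so the shift only
applies when the boundary would otherwise cut through a tool block.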

diff --git a/docs/concepts/compaction.md b/docs/concepts/compaction.md
index 5efb21b3f03..2ce5b64bd75 100644
--- a/docs/concepts/compaction.md
+++ b/docs/concepts/compaction.md
@@ -18,6 +18,11 @@ into a summary so the chat can continue.
 2. The summary is saved in the session transcript.
 3. Recent messages are kept intact.
 
+When OpenClaw splits history into compaction chunks, it keeps assistant tool
+calls paired with their matching `toolResult` entries. If a split point lands
+inside a tool block, OpenClaw moves the boundary so the pair stays together and
+the current unsummarized tail is preserved.
+
 The full conversation history stays on disk. Compaction only changes what the
 model sees on the next turn.
 
diff --git a/docs/concepts/typing-indicators.md b/docs/concepts/typing-indicators.md
index 084d44d9f0f..3155a30626e 100644
--- a/docs/concepts/typing-indicators.md
+++ b/docs/concepts/typing-indicators.md
@@ -59,8 +59,9 @@ You can override mode or cadence per session:
 
 ## Notes
 
-- `message` mode won’t show typing for silent-only replies (e.g. the `NO_REPLY`
-  token used to suppress output).
+- `message` mode won’t show typing for silent-only replies (for example
+  `NO_REPLY` or `no_reply`; the exact silent token is matched
+  case-insensitively).
 - `thinking` only fires if the run streams reasoning (`reasoningLevel: "stream"`).
   If the model doesn’t emit reasoning deltas, typing won’t start.
 - Heartbeats never show typing, regardless of mode.
diff --git a/docs/reference/session-management-compaction.md b/docs/reference/session-management-compaction.md
index f394db5a170..da48bd5c6ae 100644
--- a/docs/reference/session-management-compaction.md
+++ b/docs/reference/session-management-compaction.md
@@ -210,6 +210,19 @@ After compaction, future turns see:
 
 Compaction is **persistent** (unlike session pruning). See [/concepts/session-pruning](/concepts/session-pruning).
 
+## Compaction chunk boundaries and tool pairing
+
+When OpenClaw splits a long transcript into compaction chunks, it keeps
+assistant tool calls paired with their matching `toolResult` entries.
+
+- If the token-share split lands between a tool call and its result, OpenClaw
+  shifts the boundary to the assistant tool-call message instead of separating
+  the pair.
+- If a trailing tool-result block would otherwise push the chunk over its
+  target size, OpenClaw preserves that pending tool block and keeps the
+  unsummarized tail intact.
+- Aborted or errored tool-call blocks do not hold a pending split open.
+
 ---
 
 ## When auto-compaction happens (Pi runtime)
@@ -280,6 +293,8 @@ Convention:
 
 - The assistant starts its output with `NO_REPLY` to indicate “do not deliver a reply to the user”.
 - OpenClaw strips/suppresses this in the delivery layer.
+- Exact silent-token suppression is case-insensitive, so `NO_REPLY` and
+  `no_reply` both count when the whole payload is just the silent token.
 
 As of `2026.1.10`, OpenClaw also suppresses **draft/typing streaming** when a partial chunk begins with `NO_REPLY`, so silent operations don’t leak partial output mid-turn.
 
@@ -326,4 +341,4 @@ flush logic lives on the Gateway side today.
   - model context window (too small)
   - compaction settings (`reserveTokens` too high for the model window can cause earlier compaction)
   - tool-result bloat: enable/tune session pruning
-- Silent turns leaking? Confirm the reply starts with `NO_REPLY` (exact token) and you’re on a build that includes the streaming suppression fix.
+- Silent turns leaking? Confirm the reply starts with `NO_REPLY` (case-insensitive exact token) and you’re on a build that includes the streaming suppression fix.
diff --git a/docs/tools/thinking.md b/docs/tools/thinking.md
index e9b397ae4a3..196ded24f87 100644
--- a/docs/tools/thinking.md
+++ b/docs/tools/thinking.md
@@ -94,5 +94,9 @@ title: "Thinking Levels"
 ## Web chat UI
 
 - The web chat thinking selector mirrors the session's stored level from the inbound session store/config when the page loads.
-- Picking another level applies only to the next message (`thinkingOnce`); after sending, the selector snaps back to the stored session level.
-- To change the session default, send a `/think:<level>` directive (as before); the selector will reflect it after the next reload.
+- Picking another level writes the session override immediately via `sessions.patch`. It does not wait for the next send, and it is not a one-shot `thinkingOnce` override.
+- The first option is always `Default (<resolved level>)`, where the resolved default comes from the active session model: `adaptive` for Claude 4.6 on Anthropic/Bedrock, `low` for other reasoning-capable models, `off` otherwise.
+- The picker stays provider-aware:
+  - most providers show `off | minimal | low | medium | high | adaptive`
+  - Z.AI shows binary `off | on`
+- `/think:<level>` still works and updates the same stored session level, so chat directives and the picker stay in sync.
diff --git a/docs/web/control-ui.md b/docs/web/control-ui.md
index 5d5aad79cc0..652f497a7be 100644
--- a/docs/web/control-ui.md
+++ b/docs/web/control-ui.md
@@ -83,7 +83,7 @@ The Control UI can localize itself on first load based on your browser locale, a
 - Stream tool calls + live tool output cards in Chat (agent events)
 - Channels: built-in plus bundled/external plugin channels status, QR login, and per-channel config (`channels.status`, `web.login.*`, `config.patch`)
 - Instances: presence list + refresh (`system-presence`)
-- Sessions: list + per-session thinking/fast/verbose/reasoning overrides (`sessions.list`, `sessions.patch`)
+- Sessions: list + per-session model/thinking/fast/verbose/reasoning overrides (`sessions.list`, `sessions.patch`)
 - Cron jobs: list/add/edit/run/enable/disable + run history (`cron.*`)
 - Skills: status, enable/disable, install, API key updates (`skills.*`)
 - Nodes: list + caps (`node.list`)
@@ -117,6 +117,7 @@ Cron jobs panel notes:
 - Re-sending with the same `idempotencyKey` returns `{ status: "in_flight" }` while running, and `{ status: "ok" }` after completion.
 - `chat.history` responses are size-bounded for UI safety. When transcript entries are too large, Gateway may truncate long text fields, omit heavy metadata blocks, and replace oversized messages with a placeholder (`[chat.history omitted: message too large]`).
 - `chat.inject` appends an assistant note to the session transcript and broadcasts a `chat` event for UI-only updates (no agent run, no channel delivery).
+- The chat header model and thinking pickers patch the active session immediately through `sessions.patch`; they are persistent session overrides, not one-turn-only send options.
 - Stop:
   - Click **Stop** (calls `chat.abort`)
   - Type `/stop` (or standalone abort phrases like `stop`, `stop action`, `stop run`, `stop openclaw`, `please stop`) to abort out-of-band