diff --git a/docs/concepts/compaction.md b/docs/concepts/compaction.md index d7ebc4504d4..e0da752b7a8 100644 --- a/docs/concepts/compaction.md +++ b/docs/concepts/compaction.md @@ -6,9 +6,7 @@ read_when: title: "Compaction" --- -Every model has a context window -- the maximum number of tokens it can process. -When a conversation approaches that limit, OpenClaw **compacts** older messages -into a summary so the chat can continue. +Every model has a context window: the maximum number of tokens it can process. When a conversation approaches that limit, OpenClaw **compacts** older messages into a summary so the chat can continue. ## How it works @@ -16,33 +14,53 @@ into a summary so the chat can continue. 2. The summary is saved in the session transcript. 3. Recent messages are kept intact. -When OpenClaw splits history into compaction chunks, it keeps assistant tool -calls paired with their matching `toolResult` entries. If a split point lands -inside a tool block, OpenClaw moves the boundary so the pair stays together and -the current unsummarized tail is preserved. +When OpenClaw splits history into compaction chunks, it keeps assistant tool calls paired with their matching `toolResult` entries. If a split point lands inside a tool block, OpenClaw moves the boundary so the pair stays together and the current unsummarized tail is preserved. -The full conversation history stays on disk. Compaction only changes what the -model sees on the next turn. +The full conversation history stays on disk. Compaction only changes what the model sees on the next turn. ## Auto-compaction -Auto-compaction is on by default. It runs when the session nears the context -limit, or when the model returns a context-overflow error (in which case -OpenClaw compacts and retries). 
Typical overflow signatures include -`request_too_large`, `context length exceeded`, `input exceeds the maximum -number of tokens`, `input token count exceeds the maximum number of input -tokens`, `input is too long for the model`, and `ollama error: context length -exceeded`. +Auto-compaction is on by default. It runs when the session nears the context limit, or when the model returns a context-overflow error (in which case OpenClaw compacts and retries). + +You will see: + +- `🧹 Auto-compaction complete` in verbose mode. +- `/status` showing `🧹 Compactions: `. -Before compacting, OpenClaw automatically reminds the agent to save important -notes to [memory](/concepts/memory) files. This prevents context loss. +Before compacting, OpenClaw automatically reminds the agent to save important notes to [memory](/concepts/memory) files. This prevents context loss. -Use the `agents.defaults.compaction` setting in your `openclaw.json` to configure compaction behavior (mode, target tokens, etc.). -Compaction summarization preserves opaque identifiers by default (`identifierPolicy: "strict"`). You can override this with `identifierPolicy: "off"` or provide custom text with `identifierPolicy: "custom"` and `identifierInstructions`. + + + OpenClaw detects context overflow from these provider error patterns: -You can optionally specify a different model for compaction summarization via `agents.defaults.compaction.model`. This is useful when your primary model is a local or small model and you want compaction summaries produced by a more capable model. The override accepts any `provider/model-id` string: + - `request_too_large` + - `context length exceeded` + - `input exceeds the maximum number of tokens` + - `input token count exceeds the maximum number of input tokens` + - `input is too long for the model` + - `ollama error: context length exceeded` + + + +## Manual compaction + +Type `/compact` in any chat to force a compaction. 
Add instructions to guide the summary: + +``` +/compact Focus on the API design decisions +``` + +When `agents.defaults.compaction.keepRecentTokens` is set, manual compaction honors that cut-point and keeps the recent tail in rebuilt context. Without an explicit keep budget, manual compaction behaves as a hard checkpoint and continues from the new summary alone. + +## Configuration + +Configure compaction under `agents.defaults.compaction` in your `openclaw.json`. The most common knobs are listed below; for the full reference, see [Session management deep dive](/reference/session-management-compaction). + +### Using a different model + +By default, compaction uses the agent's primary model. Set `agents.defaults.compaction.model` to delegate summarization to a more capable or specialized model. The override accepts any `provider/model-id` string: ```json { @@ -56,7 +74,7 @@ You can optionally specify a different model for compaction summarization via `a } ``` -This also works with local models, for example a second Ollama model dedicated to summarization or a fine-tuned compaction specialist: +This works with local models too, for example a second Ollama model dedicated to summarization: ```json { @@ -70,91 +88,27 @@ This also works with local models, for example a second Ollama model dedicated t } ``` -When unset, compaction uses the agent’s primary model. +When unset, compaction uses the agent's primary model. -## Pluggable compaction providers +### Identifier preservation -Plugins can register a custom compaction provider via `registerCompactionProvider()` on the plugin API. When a provider is registered and configured, OpenClaw delegates summarization to it instead of the built-in LLM pipeline. +Compaction summarization preserves opaque identifiers by default (`identifierPolicy: "strict"`). Override with `identifierPolicy: "off"` to disable, or `identifierPolicy: "custom"` plus `identifierInstructions` for custom guidance. 
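+A custom policy might look like this sketch (the two keys are from the section above; the instruction wording is illustrative, not a required format):

```json5
{
  agents: {
    defaults: {
      compaction: {
        // "strict" (default) | "off" | "custom"
        identifierPolicy: "custom",
        // Only consulted when identifierPolicy is "custom".
        identifierInstructions: "Keep ticket IDs, UUIDs, and file paths verbatim in the summary.",
      },
    },
  },
}
```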
-To use a registered provider, set the provider id in your config: +### Active transcript byte guard -```json -{ - "agents": { - "defaults": { - "compaction": { - "provider": "my-provider" - } - } - } -} -``` +When `agents.defaults.compaction.maxActiveTranscriptBytes` is set, OpenClaw triggers normal local compaction before a run if the active JSONL reaches that size. This is useful for long-running sessions where provider-side context management may keep model context healthy while the local transcript keeps growing. It does not split raw JSONL bytes; it asks the normal compaction pipeline to create a semantic summary. -Setting a `provider` automatically forces `mode: "safeguard"`. Providers receive the same compaction instructions and identifier-preservation policy as the built-in path, and OpenClaw still preserves recent-turn and split-turn suffix context after provider output. If the provider fails or returns an empty result, OpenClaw falls back to built-in LLM summarization. + +The byte guard requires `truncateAfterCompaction: true`. Without transcript rotation, the active file would not shrink and the guard remains inactive. + -## Auto-compaction (default on) +### Successor transcripts -When a session nears or exceeds the model’s context window, OpenClaw triggers auto-compaction and may retry the original request using the compacted context. +When `agents.defaults.compaction.truncateAfterCompaction` is enabled, OpenClaw does not rewrite the existing transcript in place. It creates a new active successor transcript from the compaction summary, preserved state, and unsummarized tail, then keeps the previous JSONL as the archived checkpoint source. -You’ll see: +### Compaction notices -- `🧹 Auto-compaction complete` in verbose mode -- `/status` showing `🧹 Compactions: ` - -Before compaction, OpenClaw can run a **silent memory flush** turn to store -durable notes to disk. See [Memory](/concepts/memory) for details and config. 
- -## Manual compaction - -Type `/compact` in any chat to force a compaction. Add instructions to guide -the summary: - -``` -/compact Focus on the API design decisions -``` - -When `agents.defaults.compaction.keepRecentTokens` is set, manual compaction -honors that Pi cut-point and keeps the recent tail in rebuilt context. Without -an explicit keep budget, manual compaction behaves as a hard checkpoint and -continues from the new summary alone. - -When `agents.defaults.compaction.truncateAfterCompaction` is enabled, -OpenClaw does not rewrite the existing transcript in place. It creates a new -active successor transcript from the compaction summary, preserved state, and -unsummarized tail, then keeps the previous JSONL as the archived checkpoint -source. - -When `agents.defaults.compaction.maxActiveTranscriptBytes` is set, OpenClaw can -trigger normal local compaction before a run if the active JSONL reaches that -size. This is useful for long-running sessions where provider-side context -management may keep model context healthy while the local transcript keeps -growing. It does not split raw JSONL bytes; it only asks the normal compaction -pipeline to create a semantic summary. Combine it with -`truncateAfterCompaction: true` to move future turns onto the smaller successor -transcript; without transcript rotation, the byte guard remains inactive because -the active file would not shrink. - -## Using a different model - -By default, compaction uses your agent's primary model. You can use a more -capable model for better summaries: - -```json5 -{ - agents: { - defaults: { - compaction: { - model: "openrouter/anthropic/claude-sonnet-4-6", - }, - }, - }, -} -``` - -## Compaction notices - -By default, compaction runs silently. To show brief notices when compaction -starts and when it completes, enable `notifyUser`: +By default, compaction runs silently. 
Set `notifyUser` to show brief status messages when compaction starts and completes: ```json5 { @@ -168,8 +122,33 @@ starts and when it completes, enable `notifyUser`: } ``` -When enabled, the user sees short status messages around each compaction run -(for example, "Compacting context..." and "Compaction complete"). +### Memory flush + +Before compaction, OpenClaw can run a **silent memory flush** turn to store durable notes to disk. See [Memory](/concepts/memory) for details and config. + +## Pluggable compaction providers + +Plugins can register a custom compaction provider via `registerCompactionProvider()` on the plugin API. When a provider is registered and configured, OpenClaw delegates summarization to it instead of the built-in LLM pipeline. + +To use a registered provider, set its id in your config: + +```json +{ + "agents": { + "defaults": { + "compaction": { + "provider": "my-provider" + } + } + } +} +``` + +Setting a `provider` automatically forces `mode: "safeguard"`. Providers receive the same compaction instructions and identifier-preservation policy as the built-in path, and OpenClaw still preserves recent-turn and split-turn suffix context after provider output. + + +If the provider fails or returns an empty result, OpenClaw falls back to built-in LLM summarization. + ## Compaction vs pruning @@ -179,28 +158,21 @@ When enabled, the user sees short status messages around each compaction run | **Saved?** | Yes (in session transcript) | No (in-memory only, per request) | | **Scope** | Entire conversation | Tool results only | -[Session pruning](/concepts/session-pruning) is a lighter-weight complement that -trims tool output without summarizing. +[Session pruning](/concepts/session-pruning) is a lighter-weight complement that trims tool output without summarizing. ## Troubleshooting -**Compacting too often?** The model's context window may be small, or tool -outputs may be large. Try enabling -[session pruning](/concepts/session-pruning). 
+**Compacting too often?** The model's context window may be small, or tool outputs may be large. Try enabling [session pruning](/concepts/session-pruning). -**Context feels stale after compaction?** Use `/compact Focus on ` to -guide the summary, or enable the [memory flush](/concepts/memory) so notes -survive. +**Context feels stale after compaction?** Use `/compact Focus on ` to guide the summary, or enable the [memory flush](/concepts/memory) so notes survive. **Need a clean slate?** `/new` starts a fresh session without compacting. -For advanced configuration (reserve tokens, identifier preservation, custom -context engines, OpenAI server-side compaction), see the -[Session Management Deep Dive](/reference/session-management-compaction). +For advanced configuration (reserve tokens, identifier preservation, custom context engines, OpenAI server-side compaction), see the [Session management deep dive](/reference/session-management-compaction). ## Related -- [Session](/concepts/session) — session management and lifecycle -- [Session Pruning](/concepts/session-pruning) — trimming tool results -- [Context](/concepts/context) — how context is built for agent turns -- [Hooks](/automation/hooks) — compaction lifecycle hooks (before_compaction, after_compaction) +- [Session](/concepts/session): session management and lifecycle. +- [Session pruning](/concepts/session-pruning): trimming tool results. +- [Context](/concepts/context): how context is built for agent turns. +- [Hooks](/automation/hooks): compaction lifecycle hooks (`before_compaction`, `after_compaction`).
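+## Appendix: combined configuration sketch

+All of the compaction options discussed on this page live under one block. The keys below appear in the sections above; the values are illustrative defaults for a long-running session, not recommendations:

```json5
{
  agents: {
    defaults: {
      compaction: {
        model: "openrouter/anthropic/claude-sonnet-4-6", // summarization model override
        keepRecentTokens: 20000,            // keep budget for the recent tail (illustrative)
        truncateAfterCompaction: true,      // rotate to a successor transcript after compaction
        maxActiveTranscriptBytes: 10000000, // byte guard; requires truncateAfterCompaction
        notifyUser: true,                   // show brief start/complete notices
        identifierPolicy: "strict",         // default identifier preservation
      },
    },
  },
}
```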