openclaw/docs/concepts/compaction.md
Peter Steinberger bb46b79d3c refactor: internalize OpenClaw agent runtime (#85341)
2026-05-27 19:24:04 +01:00


summary: How OpenClaw summarizes long conversations to stay within model limits
read_when:
  - You want to understand auto-compaction and /compact
  - You are debugging long sessions hitting context limits
title: Compaction

Every model has a context window: the maximum number of tokens it can process. When a conversation approaches that limit, OpenClaw compacts older messages into a summary so the chat can continue.

How it works

  1. Older conversation turns are summarized into a compact entry.
  2. The summary is saved in the session transcript.
  3. Recent messages are kept intact.

When OpenClaw splits history into compaction chunks, it keeps assistant tool calls paired with their matching toolResult entries. If a split point lands inside a tool block, OpenClaw moves the boundary so the pair stays together and the current unsummarized tail is preserved.
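
The boundary adjustment described above can be sketched as follows. This is a minimal illustration, not OpenClaw's actual implementation: the `Entry` type and `adjustSplitPoint` are hypothetical names, and the sketch assumes that toolResult entries directly follow the assistant message that issued the tool calls.

```typescript
// Hypothetical transcript entry shape; the real OpenClaw types may differ.
type Entry =
  | { role: "user"; text: string }
  | { role: "assistant"; text: string; toolCallIds?: string[] }
  | { role: "toolResult"; toolCallId: string; text: string };

/**
 * Entries before the returned index are summarized; entries from the index
 * onward are kept verbatim. If the proposed boundary lands on a toolResult,
 * the boundary moves earlier until it reaches the assistant message that
 * issued the call, so the pair stays on the kept side.
 */
function adjustSplitPoint(entries: Entry[], proposed: number): number {
  // Clamp the proposal into the valid range [0, entries.length].
  let split = Math.max(0, Math.min(proposed, entries.length));
  while (split > 0 && split < entries.length) {
    const first = entries[split];
    if (!first || first.role !== "toolResult") break;
    split -= 1; // pull the preceding entry into the unsummarized tail
  }
  return split;
}
```

Moving the boundary earlier rather than later means the unsummarized tail only grows, which matches the statement that the current tail is preserved.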

The full conversation history stays on disk. Compaction only changes what the model sees on the next turn.

Auto-compaction

Auto-compaction is on by default. It runs when the session nears the context limit, or when the model returns a context-overflow error (in which case OpenClaw compacts and retries).

You will see:

  • embedded run auto-compaction start / complete in normal Gateway logs.
  • 🧹 Auto-compaction complete in verbose mode.
  • /status showing 🧹 Compactions: <count>.

Before compacting, OpenClaw automatically reminds the agent to save important notes to [memory](/concepts/memory) files, so that details which matter are not lost when older turns are summarized.

OpenClaw detects context overflow from these provider error patterns:
- `request_too_large`
- `context length exceeded`
- `input exceeds the maximum number of tokens`
- `input token count exceeds the maximum number of input tokens`
- `input is too long for the model`
- `ollama error: context length exceeded`
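
As a rough sketch of how such pattern matching might look, the helper below checks an error message against the patterns listed above. The function name `isContextOverflowError` and the choice of case-insensitive substring matching are assumptions made for illustration; the real detection logic may be stricter or use structured error codes.

```typescript
// Patterns copied from the list above.
const OVERFLOW_PATTERNS: readonly string[] = [
  "request_too_large",
  "context length exceeded",
  "input exceeds the maximum number of tokens",
  "input token count exceeds the maximum number of input tokens",
  "input is too long for the model",
  "ollama error: context length exceeded",
];

// Hypothetical helper: returns true when the provider error message contains
// any known overflow pattern (assumed case-insensitive substring match).
function isContextOverflowError(message: string): boolean {
  const normalized = message.toLowerCase();
  return OVERFLOW_PATTERNS.some((pattern) => normalized.includes(pattern));
}
```

A caller would use the result to decide whether to compact and retry the request or to surface the error unchanged.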

Manual compaction

Type /compact in any chat to force a compaction. Add instructions to guide the summary:

/compact Focus on the API design decisions

When agents.defaults.compaction.keepRecentTokens is set, manual compaction honors that OpenClaw cut point and keeps the recent tail in the rebuilt context. Without an explicit keep budget, manual compaction behaves as a hard checkpoint: the session continues from the new summary alone.
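
One way to picture the cut point is as a walk backwards from the newest message until the keep budget is used up. The sketch below is an assumption about the general technique, not OpenClaw's code: `findCutPoint` is a hypothetical name, per-message token counts are taken as given, and the tail is assumed to stay at or under the budget.

```typescript
/**
 * Returns the index where the kept tail begins. Messages before the index are
 * summarized. With no budget, everything is summarized (hard checkpoint).
 */
function findCutPoint(tokenCounts: number[], keepRecentTokens?: number): number {
  if (keepRecentTokens === undefined) return tokenCounts.length;
  let kept = 0;
  let cut = tokenCounts.length;
  // Walk from the newest message backwards while the budget allows.
  for (let i = tokenCounts.length - 1; i >= 0; i--) {
    const count = tokenCounts[i] ?? 0;
    if (kept + count > keepRecentTokens) break;
    kept += count;
    cut = i;
  }
  return cut;
}
```

For example, with per-message counts `[100, 200, 50, 50]` and a budget of 120, the last two messages fit and the cut lands at index 2.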

Configuration

Configure compaction under agents.defaults.compaction in your openclaw.json. The most common knobs are listed below; for the full reference, see Session management deep dive.

Using a different model

By default, compaction uses the agent's primary model. Set agents.defaults.compaction.model to delegate summarization to a more capable or specialized model. The override accepts any provider/model-id string:

{
  "agents": {
    "defaults": {
      "compaction": {
        "model": "openrouter/anthropic/claude-sonnet-4-6"
      }
    }
  }
}

This works with local models too, for example a second Ollama model dedicated to summarization:

{
  "agents": {
    "defaults": {
      "compaction": {
        "model": "ollama/llama3.1:8b"
      }
    }
  }
}

When unset, compaction starts with the active session model. If summarization fails with a model-fallback-eligible provider error, OpenClaw retries that compaction attempt through the session's existing model fallback chain. The fallback choice is temporary and is not written back to session state. An explicit agents.defaults.compaction.model override remains exact and does not inherit the session fallback chain.
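
The selection rules above can be summarized as an ordered list of models to try. The sketch below is illustrative only: `planCompactionModels` and its option names are hypothetical, and it leaves out the check for whether a given provider error is fallback-eligible.

```typescript
interface CompactionModelOptions {
  override?: string;          // agents.defaults.compaction.model, if set
  sessionModel: string;       // the active session model
  sessionFallbacks: string[]; // the session's existing fallback chain
}

/**
 * Returns the models a compaction attempt may try, in order.
 * An explicit override is exact and never inherits the fallback chain.
 */
function planCompactionModels(opts: CompactionModelOptions): string[] {
  if (opts.override) return [opts.override];
  const candidates = [opts.sessionModel, ...opts.sessionFallbacks];
  // Drop repeats so the same model is not retried twice.
  return candidates.filter((model, index) => candidates.indexOf(model) === index);
}
```

Because the function only returns a plan, nothing here writes the chosen fallback back to session state, which mirrors the temporary nature of the fallback choice.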

Identifier preservation

Compaction summarization preserves opaque identifiers by default (identifierPolicy: "strict"). Override with identifierPolicy: "off" to disable, or identifierPolicy: "custom" plus identifierInstructions for custom guidance.

Active transcript byte guard

When agents.defaults.compaction.maxActiveTranscriptBytes is set, OpenClaw triggers normal local compaction before a run if the active JSONL reaches that size. This is useful for long-running sessions where provider-side context management may keep model context healthy while the local transcript keeps growing. It does not split raw JSONL bytes; it asks the normal compaction pipeline to create a semantic summary.

The byte guard requires `truncateAfterCompaction: true`. Without transcript rotation, the active file would not shrink and the guard remains inactive.
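
Expressed as a predicate, the guard might look like the sketch below. The config field names come from the text above; the function name `shouldCompactForTranscriptSize` and the way the transcript size is supplied are assumptions for illustration.

```typescript
interface ByteGuardConfig {
  maxActiveTranscriptBytes?: number;
  truncateAfterCompaction?: boolean;
}

// Hypothetical pre-run check: true when local compaction should be triggered
// because the active JSONL transcript has reached the configured size.
function shouldCompactForTranscriptSize(
  config: ByteGuardConfig,
  activeTranscriptBytes: number,
): boolean {
  const limit = config.maxActiveTranscriptBytes;
  if (limit === undefined || limit <= 0) return false; // guard not configured
  // Without transcript rotation the file would not shrink, so the guard stays inactive.
  if (config.truncateAfterCompaction !== true) return false;
  return activeTranscriptBytes >= limit;
}
```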

Successor transcripts

When agents.defaults.compaction.truncateAfterCompaction is enabled, OpenClaw does not rewrite the existing transcript in place. It creates a new active successor transcript from the compaction summary, preserved state, and unsummarized tail, then records checkpoint metadata that points branch/restore flows at that compacted successor. Successor transcripts also drop exact duplicate long user turns that arrive inside a short retry window, so channel retry storms are not carried into the next active transcript after compaction.

OpenClaw no longer writes separate .checkpoint.*.jsonl copies for new compactions. Existing legacy checkpoint files can still be used while referenced and are pruned by normal session cleanup.
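
The duplicate filtering applied to successor transcripts can be illustrated with a small sketch. Everything specific here is an assumption: the `UserTurn` shape, the function name, and both thresholds (what counts as a "long" turn and how short the "retry window" is) are placeholder values, since the text does not state them. Turns are assumed to be sorted by timestamp.

```typescript
interface UserTurn {
  text: string;
  timestampMs: number;
}

// Placeholder thresholds, chosen only for illustration.
const MIN_LONG_TURN_CHARS = 200;
const RETRY_WINDOW_MS = 60000;

/**
 * Drops a long user turn when an exact copy of its text was seen within the
 * retry window. Short turns are always kept. The window slides: each repeat,
 * kept or dropped, refreshes the last-seen time for that text.
 */
function dropRetryDuplicates(
  turns: UserTurn[],
  minChars: number = MIN_LONG_TURN_CHARS,
  windowMs: number = RETRY_WINDOW_MS,
): UserTurn[] {
  const lastSeen = new Map<string, number>();
  const kept: UserTurn[] = [];
  for (const turn of turns) {
    const isLong = turn.text.length >= minChars;
    const previous = lastSeen.get(turn.text);
    const isRetry =
      isLong && previous !== undefined && turn.timestampMs - previous <= windowMs;
    if (!isRetry) kept.push(turn);
    if (isLong) lastSeen.set(turn.text, turn.timestampMs);
  }
  return kept;
}
```

Restricting the filter to long turns avoids dropping legitimately repeated short messages such as "yes" or "continue".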

Compaction notices

By default, compaction runs silently. Set notifyUser to show brief status messages when compaction starts and completes:

{
  agents: {
    defaults: {
      compaction: {
        notifyUser: true,
      },
    },
  },
}

Memory flush

Before compaction, OpenClaw can run a silent memory flush turn to store durable notes to disk. Set agents.defaults.compaction.memoryFlush.model when this housekeeping turn should use a local model instead of the active conversation model:

{
  "agents": {
    "defaults": {
      "compaction": {
        "memoryFlush": {
          "model": "ollama/qwen3:8b"
        }
      }
    }
  }
}

The memory-flush model override is exact and does not inherit the active session fallback chain. See Memory for details and config.

Pluggable compaction providers

Plugins can register a custom compaction provider via registerCompactionProvider() on the plugin API. When a provider is registered and configured, OpenClaw delegates summarization to it instead of the built-in LLM pipeline.

To use a registered provider, set its id in your config:

{
  "agents": {
    "defaults": {
      "compaction": {
        "provider": "my-provider"
      }
    }
  }
}

Setting a provider automatically forces mode: "safeguard". Providers receive the same compaction instructions and identifier-preservation policy as the built-in path, and OpenClaw still preserves recent-turn and split-turn suffix context after provider output.

If the provider fails or returns an empty result, OpenClaw falls back to built-in LLM summarization.
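
The fallback behavior can be sketched as a thin wrapper around two summarizers. This is not the plugin API: the `Summarizer` type and `summarizeWithFallback` are hypothetical, and the sketch is kept synchronous for brevity, whereas a real summarization pipeline would most likely be asynchronous.

```typescript
// Hypothetical summarizer signature used only for this sketch.
type Summarizer = (messages: string[], instructions: string) => string;

interface SummaryResult {
  summary: string;
  source: "provider" | "built-in";
}

/**
 * Tries the registered provider first. If it throws or returns an empty
 * (whitespace-only) summary, the built-in summarizer is used instead.
 */
function summarizeWithFallback(
  provider: Summarizer | undefined,
  builtIn: Summarizer,
  messages: string[],
  instructions: string,
): SummaryResult {
  if (provider) {
    try {
      const summary = provider(messages, instructions);
      if (summary.trim().length > 0) return { summary, source: "provider" };
    } catch {
      // Provider failure: fall through to the built-in path.
    }
  }
  return { summary: builtIn(messages, instructions), source: "built-in" };
}
```

Both paths receive the same instructions, which reflects the statement that providers get the same compaction instructions as the built-in path.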

Compaction vs pruning

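|              | Compaction                    | Pruning                          |
| ------------ | ----------------------------- | -------------------------------- |
| What it does | Summarizes older conversation | Trims old tool results           |
| Saved?       | Yes (in session transcript)   | No (in-memory only, per request) |
| Scope        | Entire conversation           | Tool results only                |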

Session pruning is a lighter-weight complement that trims tool output without summarizing.

Troubleshooting

Compacting too often? The model's context window may be small, or tool outputs may be large. Try enabling session pruning.

Context feels stale after compaction? Use /compact Focus on <topic> to guide the summary, or enable the memory flush so notes survive.

Need a clean slate? /new starts a fresh session without compacting.

For advanced configuration (reserve tokens, identifier preservation, custom context engines, OpenAI server-side compaction), see the Session management deep dive.

  • Session: session management and lifecycle.
  • Session pruning: trimming tool results.
  • Context: how context is built for agent turns.
  • Hooks: compaction lifecycle hooks (before_compaction, after_compaction).