mirror of
https://github.com/openclaw/openclaw.git
synced 2026-05-21 03:04:47 +00:00
---
summary: "How OpenClaw builds prompt context and reports token usage + costs"
read_when:
  - Explaining token usage, costs, or context windows
  - Debugging context growth or compaction behavior
title: "Token use and costs"
---

OpenClaw tracks **tokens**, not characters. Tokens are model-specific, but most
OpenAI-style models average ~4 characters per token for English text.

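The ~4 characters per token rule of thumb can be turned into a quick budget check. The sketch below is only an approximation for English text; `estimate_tokens` and `CHARS_PER_TOKEN` are hypothetical names for illustration, not part of OpenClaw, and real tokenizers vary by model.

```python
import math

# Rough average for English text on OpenAI-style models; an approximation only.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str, chars_per_token: float = CHARS_PER_TOKEN) -> int:
    """Return a rough token estimate for a piece of English text."""
    if not text:
        return 0
    # Round up so short strings never estimate to zero tokens.
    return math.ceil(len(text) / chars_per_token)


# A 12,000-character file is on the order of 3,000 tokens.
print(estimate_tokens("a" * 12000))  # 3000
```

Treat the result as an order-of-magnitude figure; `/status` and `/context` report the real counts.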
## How the system prompt is built

OpenClaw assembles its own system prompt on every run. It includes:

- Tool list + short descriptions
- Skills list (only metadata; instructions are loaded on demand with `read`).
  The compact skills block is bounded by `skills.limits.maxSkillsPromptChars`,
  with optional per-agent override at
  `agents.list[].skillsLimits.maxSkillsPromptChars`.
- Self-update instructions
- Workspace + bootstrap files (`AGENTS.md`, `SOUL.md`, `TOOLS.md`, `IDENTITY.md`, `USER.md`, `HEARTBEAT.md`, `BOOTSTRAP.md` when new, plus `MEMORY.md` when present). Lowercase root `memory.md` is not injected; it is legacy repair input for `openclaw doctor --fix` when paired with `MEMORY.md`. Large files are truncated by `agents.defaults.bootstrapMaxChars` (default: 12000), and total bootstrap injection is capped by `agents.defaults.bootstrapTotalMaxChars` (default: 60000). `memory/*.md` daily files are not part of the normal bootstrap prompt; they remain on-demand via memory tools on ordinary turns, but reset/startup model runs can prepend a one-shot startup-context block with recent daily memory for that first turn. Bare chat `/new` and `/reset` commands are acknowledged without invoking the model. The startup prelude is controlled by `agents.defaults.startupContext`.
- Time (UTC + user timezone)
- Reply tags + heartbeat behavior
- Runtime metadata (host/OS/model/thinking)

See the full breakdown in [System Prompt](/concepts/system-prompt).

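To see how the per-file and total bootstrap caps interact, here is a minimal sketch. It assumes simple head truncation and an in-order budget, which is a guess for illustration; the text above only states the two limits and their defaults, not the exact truncation strategy. `cap_bootstrap_files` is a hypothetical helper, not an OpenClaw function.

```python
def cap_bootstrap_files(files, max_chars=12000, total_max_chars=60000):
    """Apply a per-file cap and a total cap to (name, content) pairs.

    Assumption: files are clipped from the head and budgeted in order.
    """
    remaining = total_max_chars
    result = []
    for name, content in files:
        if remaining <= 0:
            break  # total budget exhausted; later files are not injected
        allowed = min(max_chars, remaining)
        clipped = content[:allowed]
        result.append((name, clipped))
        remaining -= len(clipped)
    return result


# Six oversized files: each is clipped to 12000 chars, and the sixth
# no longer fits inside the 60000-char total.
files = [(f"FILE{i}.md", "x" * 20000) for i in range(6)]
capped = cap_bootstrap_files(files)
print(len(capped), sum(len(content) for _, content in capped))  # 5 60000
```

With the defaults, five files at the per-file cap already consume the whole budget, so trimming large bootstrap files is the most direct way to leave room for the rest.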
## What counts in the context window

Everything the model receives counts toward the context limit:

- System prompt (all sections listed above)
- Conversation history (user + assistant messages)
- Tool calls and tool results
- Attachments/transcripts (images, audio, files)
- Compaction summaries and pruning artifacts
- Provider wrappers or safety headers (not visible, but still counted)

Some runtime-heavy surfaces have their own explicit caps:

- `agents.defaults.contextLimits.memoryGetMaxChars`
- `agents.defaults.contextLimits.memoryGetDefaultLines`
- `agents.defaults.contextLimits.toolResultMaxChars`
- `agents.defaults.contextLimits.postCompactionMaxChars`

Per-agent overrides live under `agents.list[].contextLimits`. These knobs are
for bounded runtime excerpts and injected runtime-owned blocks. They are
separate from bootstrap limits, startup-context limits, and skills prompt
limits.

For images, OpenClaw downscales transcript/tool image payloads before provider calls.
Use `agents.defaults.imageMaxDimensionPx` (default: `1200`) to tune this:

- Lower values usually reduce vision-token usage and payload size.
- Higher values preserve more visual detail for OCR/UI-heavy screenshots.

For a practical breakdown (per injected file, tools, skills, and system prompt size), use `/context list` or `/context detail`. See [Context](/concepts/context).

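As a rough illustration of what a maximum image dimension implies, the sketch below scales an image so its longest side fits within the limit. It assumes aspect-preserving scaling on the longest side, which the text above does not spell out; `downscaled_size` is a hypothetical helper, not an OpenClaw API.

```python
def downscaled_size(width: int, height: int, max_dimension_px: int = 1200):
    """Return (width, height) after fitting the longest side within the limit.

    Assumption: the aspect ratio is preserved and smaller images are untouched.
    """
    longest = max(width, height)
    if longest <= max_dimension_px:
        return width, height
    scale = max_dimension_px / longest
    # Never let a side collapse to zero pixels.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(downscaled_size(2400, 1200))  # (1200, 600)
print(downscaled_size(800, 600))    # (800, 600)
```

Halving both sides cuts the pixel count to a quarter, which is why lowering the limit tends to reduce vision-token usage noticeably on screenshot-heavy sessions.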
## How to see current token usage

Use these in chat:

- `/status` → **emoji-rich status card** with the session model, context usage,
  last response input/output tokens, and **estimated cost** (API key only).
- `/usage off|tokens|full` → appends a **per-response usage footer** to every reply.
  - Persists per session (stored as `responseUsage`).
  - OAuth auth **hides cost** (tokens only).
- `/usage cost` → shows a local cost summary from OpenClaw session transcripts.

Other surfaces:

- **TUI/Web TUI:** `/status` + `/usage` are supported.
- **CLI:** `openclaw status --usage` and `openclaw channels list` show
  normalized provider quota windows (`X% left`, not per-response costs).
  Current usage-window providers: Anthropic, GitHub Copilot, Gemini CLI,
  OpenAI Codex, MiniMax, Xiaomi, and z.ai.

Usage surfaces normalize common provider-native field aliases before display.
For OpenAI-family Responses traffic, that includes both `input_tokens` /
`output_tokens` and `prompt_tokens` / `completion_tokens`, so transport-specific
field names do not change `/status`, `/usage`, or session summaries.

Gemini CLI JSON usage is normalized too: reply text comes from `response`, and
`stats.cached` maps to `cacheRead` with `stats.input_tokens - stats.cached`
used when the CLI omits an explicit `stats.input` field.

For native OpenAI-family Responses traffic, WebSocket/SSE usage aliases are
normalized the same way, and totals fall back to normalized input + output when
`total_tokens` is missing or `0`.

When the current session snapshot is sparse, `/status` and `session_status` can
also recover token/cache counters and the active runtime model label from the
most recent transcript usage log. Existing nonzero live values still take
precedence over transcript fallback values, and larger prompt-oriented
transcript totals can win when stored totals are missing or smaller.

Usage auth for provider quota windows comes from provider-specific hooks when
available; otherwise OpenClaw falls back to matching OAuth/API-key credentials
from auth profiles, env, or config.

Assistant transcript entries persist the same normalized usage shape, including
`usage.cost` when the active model has pricing configured and the provider
returns usage metadata. This gives `/usage cost` and transcript-backed session
status a stable source even after the live runtime state is gone.

OpenClaw keeps provider usage accounting separate from the current context
snapshot. Provider `usage.total` can include cached input, output, and multiple
tool-loop model calls, so it is useful for cost and telemetry but can overstate
the live context window. Context displays and diagnostics use the latest prompt
snapshot (`promptTokens`, or the last model call when no prompt snapshot is
available) for `context.used`.

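The alias handling described above can be sketched as follows. This is an illustration of the stated rules, not OpenClaw's implementation: the function names and the returned `input` / `output` / `total` keys are assumptions chosen for the example, while the provider field names come from the text.

```python
def normalize_usage(raw: dict) -> dict:
    """Map OpenAI-family usage aliases onto one shape (hypothetical helper)."""

    def first_number(*keys):
        # Return the first key that holds a real number, else 0.
        for key in keys:
            value = raw.get(key)
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                return int(value)
        return 0

    input_tokens = first_number("input_tokens", "prompt_tokens")
    output_tokens = first_number("output_tokens", "completion_tokens")
    total = first_number("total_tokens")
    if total == 0:
        # Fall back when total_tokens is missing or 0.
        total = input_tokens + output_tokens
    return {"input": input_tokens, "output": output_tokens, "total": total}


def normalize_gemini_stats(stats: dict) -> dict:
    """Map Gemini CLI `stats` fields onto input/cacheRead (hypothetical helper)."""
    cached = int(stats.get("cached", 0))
    if "input" in stats:
        input_tokens = int(stats["input"])
    else:
        # The CLI omitted `stats.input`: derive it from input_tokens - cached.
        input_tokens = int(stats.get("input_tokens", 0)) - cached
    return {"input": input_tokens, "cacheRead": cached}


print(normalize_usage({"prompt_tokens": 10, "completion_tokens": 5}))
print(normalize_gemini_stats({"input_tokens": 100, "cached": 40}))
```

Both alias pairs land on the same normalized keys, which is what keeps `/status` and `/usage` output stable across transports.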
## Cost estimation (when shown)

Costs are estimated from your model pricing config:

```
models.providers.<provider>.models[].cost
```

These are **USD per 1M tokens** for `input`, `output`, `cacheRead`, and
`cacheWrite`. If pricing is missing, OpenClaw shows tokens only. OAuth tokens
never show dollar cost.

After sidecars and channels reach the Gateway ready path, OpenClaw starts an
optional background pricing bootstrap for configured model refs that do not
already have local pricing. That bootstrap fetches remote OpenRouter and LiteLLM
pricing catalogs. Set `models.pricing.enabled: false` to skip those catalog
fetches on offline or restricted networks; explicit
`models.providers.*.models[].cost` entries continue to drive local cost
estimates.

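The per-1M-token arithmetic behind the `cost` fields can be sketched like this. The prices below are hypothetical placeholders, not real provider rates, and `estimate_cost_usd` is an illustrative helper rather than OpenClaw's own function.

```python
COST_FIELDS = ("input", "output", "cacheRead", "cacheWrite")


def estimate_cost_usd(usage: dict, cost):
    """Estimate a dollar cost from token counts and USD-per-1M-token prices.

    Returns None when no pricing is configured, mirroring the
    "tokens only" behavior described above.
    """
    if cost is None:
        return None
    total = 0.0
    for field in COST_FIELDS:
        total += usage.get(field, 0) * cost.get(field, 0.0) / 1_000_000
    return total


# Hypothetical prices in USD per 1M tokens.
cost = {"input": 2.0, "output": 10.0, "cacheRead": 0.5, "cacheWrite": 4.0}
usage = {"input": 10_000, "output": 2_000, "cacheRead": 50_000, "cacheWrite": 0}

# 0.02 (input) + 0.02 (output) + 0.025 (cache reads) = about 0.065 USD
print(round(estimate_cost_usd(usage, cost), 6))
```

Note how 50,000 cached-read tokens cost about as much as 10,000 fresh input tokens at these placeholder rates; that gap is what the caching sections below try to exploit.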
## Cache TTL and pruning impact

Provider prompt caching only applies within the cache TTL window. OpenClaw can
optionally run **cache-ttl pruning**: it prunes the session once the cache TTL
has expired, then resets the cache window so subsequent requests can re-use the
freshly cached context instead of re-caching the full history. This keeps cache
write costs lower when a session goes idle past the TTL.

Configure it in [Gateway configuration](/gateway/configuration) and see the
behavior details in [Session pruning](/concepts/session-pruning).

Heartbeat can keep the cache **warm** across idle gaps. If your model cache TTL
is `1h`, setting the heartbeat interval just under that (e.g., `55m`) can avoid
re-caching the full prompt, reducing cache write costs.

In multi-agent setups, you can keep one shared model config and tune cache behavior
per agent with `agents.list[].params.cacheRetention`.

For a full knob-by-knob guide, see [Prompt Caching](/reference/prompt-caching).

For Anthropic API pricing, cache reads are significantly cheaper than input
tokens, while cache writes are billed at a higher multiplier. See Anthropic's
prompt caching pricing for the latest rates and TTL multipliers:
[https://docs.anthropic.com/docs/build-with-claude/prompt-caching](https://docs.anthropic.com/docs/build-with-claude/prompt-caching)

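The heartbeat guidance above comes down to one comparison: the heartbeat interval has to be shorter than the cache TTL. The sketch below checks that, assuming durations are written as a number plus `s`, `m`, or `h` (as in `55m` and `1h`); OpenClaw's own duration parsing may accept more forms, and both helper names are hypothetical.

```python
import re

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_duration(value: str) -> int:
    """Parse strings like '55m' or '1h' into seconds (simplified format)."""
    match = re.fullmatch(r"(\d+)([smh])", value.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


def heartbeat_keeps_cache_warm(heartbeat_every: str, cache_ttl: str) -> bool:
    """True when each heartbeat lands before the cache entry would expire."""
    return parse_duration(heartbeat_every) < parse_duration(cache_ttl)


print(heartbeat_keeps_cache_warm("55m", "1h"))  # True
print(heartbeat_keeps_cache_warm("90m", "1h"))  # False
```

Leaving a few minutes of margin, as `55m` does for a `1h` TTL, allows for scheduling jitter and request latency.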
### Example: keep 1h cache warm with heartbeat

```yaml
agents:
  defaults:
    model:
      primary: "anthropic/claude-opus-4-6"
    models:
      "anthropic/claude-opus-4-6":
        params:
          cacheRetention: "long"
    heartbeat:
      every: "55m"
```

### Example: mixed traffic with per-agent cache strategy

```yaml
agents:
  defaults:
    model:
      primary: "anthropic/claude-opus-4-6"
    models:
      "anthropic/claude-opus-4-6":
        params:
          cacheRetention: "long" # default baseline for most agents
  list:
    - id: "research"
      default: true
      heartbeat:
        every: "55m" # keep long cache warm for deep sessions
    - id: "alerts"
      params:
        cacheRetention: "none" # avoid cache writes for bursty notifications
```

`agents.list[].params` merges on top of the selected model's `params`, so you can
override only `cacheRetention` and inherit other model defaults unchanged.

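The merge behavior can be pictured as a dictionary overlay. This sketch assumes a shallow, key-by-key merge, which matches the description above but is not confirmed to be the exact implementation; `effective_params` is a hypothetical helper.

```python
def effective_params(model_params: dict, agent_params=None) -> dict:
    """Overlay per-agent params on the selected model's params.

    Assumption: a shallow merge where agent keys win.
    """
    merged = dict(model_params)  # copy so the shared model config is untouched
    merged.update(agent_params or {})
    return merged


model_params = {"cacheRetention": "long", "context1m": True}

# "alerts" overrides only cacheRetention and inherits context1m.
print(effective_params(model_params, {"cacheRetention": "none"}))
# "research" sets no params and gets the model defaults as-is.
print(effective_params(model_params))
```

Because the model entry is copied before the overlay, one agent's override never leaks into another agent that shares the same model config.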
### Example: enable Anthropic 1M context beta header

Anthropic's 1M context window is currently beta-gated. OpenClaw can inject the
required `anthropic-beta` value when you enable `context1m` on supported Opus
or Sonnet models.

```yaml
agents:
  defaults:
    models:
      "anthropic/claude-opus-4-6":
        params:
          context1m: true
```

This maps to Anthropic's `context-1m-2025-08-07` beta header.

This only applies when `context1m: true` is set on that model entry.

Requirement: the credential must be eligible for long-context usage. If not,
Anthropic responds with a provider-side rate limit error for that request.

If you authenticate Anthropic with OAuth/subscription tokens (`sk-ant-oat-*`),
OpenClaw skips the `context-1m-*` beta header because Anthropic currently
rejects that combination with HTTP 401.

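The header rules above reduce to two checks: the model entry opted in, and the credential is not an OAuth/subscription token. The sketch below encodes only those two rules; it skips the supported-model check, and `anthropic_beta_headers` is a hypothetical helper rather than OpenClaw's code. The credentials in the example are fake placeholders.

```python
CONTEXT_1M_BETA = "context-1m-2025-08-07"
OAUTH_TOKEN_PREFIX = "sk-ant-oat-"


def anthropic_beta_headers(params: dict, credential: str) -> dict:
    """Return the extra headers for a request, or an empty dict."""
    if not params.get("context1m"):
        return {}  # the model entry did not opt in
    if credential.startswith(OAUTH_TOKEN_PREFIX):
        return {}  # OAuth/subscription tokens: skip the beta header
    return {"anthropic-beta": CONTEXT_1M_BETA}


print(anthropic_beta_headers({"context1m": True}, "sk-ant-EXAMPLE"))
print(anthropic_beta_headers({"context1m": True}, "sk-ant-oat-EXAMPLE"))
```

The first call returns the beta header and the second returns an empty dict, so an OAuth-authenticated agent silently stays on the standard context window.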
## Tips for reducing token pressure

- Use `/compact` to summarize long sessions.
- Trim large tool outputs in your workflows.
- Lower `agents.defaults.imageMaxDimensionPx` for screenshot-heavy sessions.
- Keep skill descriptions short (skill list is injected into the prompt).
- Prefer smaller models for verbose, exploratory work.

See [Skills](/tools/skills) for the exact skill list overhead formula.

## Related

- [API usage and costs](/reference/api-usage-costs)
- [Prompt caching](/reference/prompt-caching)
- [Usage tracking](/concepts/usage-tracking)