openclaw/docs/concepts/agent-loop.md at 3de5979bdc8e4e9e9d3fee446eaab53cad2ff605

mirror of https://github.com/openclaw/openclaw.git synced 2026-05-20 00:34:46 +00:00


Peter Steinberger f91de52f0d refactor: move runtime state to SQLite


2026-05-13 13:15:12 +01:00

11 KiB


summary: Agent loop lifecycle, streams, and wait semantics
read_when:
- You need an exact walkthrough of the agent loop or lifecycle events
- You are changing session queueing or transcript writes

Agent loop

An agentic loop is the full "real" run of an agent: intake → context assembly → model inference → tool execution → streaming replies → persistence. It's the authoritative path that turns a message into actions and a final reply, while keeping session state consistent.

In OpenClaw, a loop is a single, serialized run per session that emits lifecycle and stream events as the model thinks, calls tools, and streams output. This doc explains how that loop is wired end-to-end.

Entry points

Gateway RPC: agent and agent.wait.
CLI: agent command.

How it works (high-level)

1. agent RPC validates params, resolves session (sessionKey/sessionId), persists session metadata, returns { runId, acceptedAt } immediately.
2. agentCommand runs the agent:
   - resolves model + thinking/verbose/trace defaults
   - loads skills snapshot
   - calls runEmbeddedPiAgent (pi-agent-core runtime)
   - emits lifecycle end/error if the embedded loop does not emit one
3. runEmbeddedPiAgent:
   - serializes runs via per-session + global queues
   - resolves model + auth profile and builds the pi session
   - subscribes to pi events and streams assistant/tool deltas
   - enforces timeout -> aborts run if exceeded
   - for Codex app-server turns, aborts an accepted turn that stops producing app-server progress before a terminal event
   - returns payloads + usage metadata
4. subscribeEmbeddedPiSession bridges pi-agent-core events to OpenClaw agent stream:
   - tool events => stream: "tool"
   - assistant deltas => stream: "assistant"
   - lifecycle events => stream: "lifecycle" (phase: "start" | "end" | "error")
5. agent.wait uses waitForAgentRun:
   - waits for lifecycle end/error for runId
   - returns { status: ok|error|timeout, startedAt, endedAt, error? }
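
The wait semantics above can be illustrated with a small reducer over stream events. This is a minimal sketch, not OpenClaw's waitForAgentRun: the `AgentEvent` shape, the `at` timestamp field, and the `summarizeRun` helper are hypothetical names chosen for illustration. Only the three stream names, the lifecycle phases, and the result fields come from the text above.

```typescript
// Hypothetical event shapes; fields other than stream/phase are assumptions.
type LifecyclePhase = "start" | "end" | "error";

type AgentEvent =
  | { runId: string; stream: "lifecycle"; phase: LifecyclePhase; at: number; error?: string }
  | { runId: string; stream: "assistant"; delta: string }
  | { runId: string; stream: "tool"; name: string };

type WaitResult = {
  status: "ok" | "error" | "timeout";
  startedAt?: number;
  endedAt?: number;
  error?: string;
};

// Scan the events observed before the wait deadline and report the outcome.
// If no terminal lifecycle event was seen, the wait timed out; the run itself
// may still be in progress.
function summarizeRun(events: AgentEvent[], runId: string): WaitResult {
  let startedAt: number | undefined;
  for (const event of events) {
    if (event.runId !== runId || event.stream !== "lifecycle") continue;
    if (event.phase === "start") {
      startedAt = event.at;
      continue;
    }
    // phase is "end" or "error": both are terminal for the wait.
    const result: WaitResult = {
      status: event.phase === "end" ? "ok" : "error",
      startedAt,
      endedAt: event.at,
    };
    if (event.error !== undefined) result.error = event.error;
    return result;
  }
  return { status: "timeout", startedAt };
}
```

Assistant and tool events are ignored here on purpose: under this reading, only lifecycle end/error resolves a wait, and a timeout result says nothing about whether the run has stopped.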

Queueing + concurrency

Runs are serialized per session key (session lane) and optionally through a global lane.
This prevents tool/session races and keeps session history consistent.
Messaging channels can choose queue modes (collect/steer/followup) that feed this lane system. See Command Queue.
Transcript writes persist through SQLite. The old session.writeLock file-lock setting is doctor-migrated legacy config, not runtime behavior.
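
Per-session serialization can be pictured as one promise chain per session key. The sketch below is an assumption-laden illustration rather than OpenClaw's queue: `SessionLanes` and `enqueue` are hypothetical names, and the optional global lane and the queue modes are left out.

```typescript
// Minimal sketch: one promise chain ("lane") per session key.
class SessionLanes {
  private tails = new Map<string, Promise<void>>();

  // Enqueue a task; it starts only after earlier tasks for the same key settle.
  enqueue<T>(sessionKey: string, task: () => Promise<T>): Promise<T> {
    const previous = this.tails.get(sessionKey) ?? Promise.resolve();
    const run = previous.then(task);
    // Swallow the outcome on the lane tail so one failed run does not block the lane.
    const tail: Promise<void> = run.then(
      () => undefined,
      () => undefined,
    );
    this.tails.set(sessionKey, tail);
    // Drop the lane entry once it is idle so the map does not grow without bound.
    void tail.then(() => {
      if (this.tails.get(sessionKey) === tail) this.tails.delete(sessionKey);
    });
    return run;
  }
}
```

Two runs enqueued for the same key execute strictly one after the other, while runs for different keys can interleave. A global lane could be modelled as a second chain that every task also passes through.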

Session + workspace preparation

Workspace is resolved and created; sandboxed runs may redirect to a sandbox workspace root.
Skills are loaded (or reused from a snapshot) and injected into env and prompt.
Bootstrap/context files are resolved and injected into the system prompt report.
SQLite transcript state is opened by {agentId, sessionId} before streaming. Later transcript rewrite, compaction, or truncation paths mutate those rows directly.

Prompt assembly + system prompt

System prompt is built from OpenClaw's base prompt, skills prompt, bootstrap context, and per-run overrides.
Model-specific limits and compaction reserve tokens are enforced.
See System prompt for what the model sees.

Hook points (where you can intercept)

OpenClaw has two hook systems:

Internal hooks (Gateway hooks): event-driven scripts for commands and lifecycle events.
Plugin hooks: extension points inside the agent/tool lifecycle and gateway pipeline.

Internal hooks (Gateway hooks)

agent:bootstrap: runs while building bootstrap files before the system prompt is finalized. Use this to add/remove bootstrap context files.
Command hooks: /new, /reset, /stop, and other command events (see Hooks doc).

See Hooks for setup and examples.

Plugin hooks (agent + gateway lifecycle)

These run inside the agent loop or gateway pipeline:

before_model_resolve: runs pre-session (no messages) to deterministically override provider/model before model resolution.
before_prompt_build: runs after session load (with messages) to inject prependContext, systemPrompt, prependSystemContext, or appendSystemContext before prompt submission. Use prependContext for per-turn dynamic text and system-context fields for stable guidance that should sit in system prompt space.
before_agent_start: legacy compatibility hook that may run in either phase; prefer the explicit hooks above.
before_agent_reply: runs after inline actions and before the LLM call, letting a plugin claim the turn and return a synthetic reply or silence the turn entirely.
agent_end: inspect the final message list and run metadata after completion.
before_compaction / after_compaction: observe or annotate compaction cycles.
before_tool_call / after_tool_call: intercept tool params/results.
before_install: inspect built-in scan findings and optionally block skill or plugin installs.
tool_result_persist: synchronously transform tool results before they are written to an OpenClaw-owned session transcript.
message_received / message_sending / message_sent: inbound + outbound message hooks.
session_start / session_end: session lifecycle boundaries.
gateway_start / gateway_stop: gateway lifecycle events.

Hook decision rules for outbound/tool guards:

before_tool_call: { block: true } is terminal and stops lower-priority handlers.
before_tool_call: { block: false } is a no-op and does not clear a prior block.
before_install: { block: true } is terminal and stops lower-priority handlers.
before_install: { block: false } is a no-op and does not clear a prior block.
message_sending: { cancel: true } is terminal and stops lower-priority handlers.
message_sending: { cancel: false } is a no-op and does not clear a prior cancel.
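
The decision rules above can be sketched as a short merge loop. This is an illustration under assumptions, not the plugin runtime: the `Handler` type, the numeric `priority` field, the "higher number runs first" ordering, and `decideToolCall` are all hypothetical. Only the `{ block: true }` and `{ block: false }` semantics come from the rules above.

```typescript
type ToolCallDecision = { block?: boolean };

type Handler = {
  name: string;
  priority: number; // assumption: a higher number runs earlier
  handle: () => ToolCallDecision | undefined;
};

function decideToolCall(handlers: Handler[]): { blocked: boolean; by?: string; ran: string[] } {
  const ran: string[] = [];
  const ordered = [...handlers].sort((a, b) => b.priority - a.priority);
  for (const handler of ordered) {
    ran.push(handler.name);
    const decision = handler.handle();
    if (decision?.block === true) {
      // Terminal: lower-priority handlers never run, so nothing can clear the block.
      return { blocked: true, by: handler.name, ran };
    }
    // { block: false } or no decision is a no-op; keep going.
  }
  return { blocked: false, ran };
}
```

The same loop shape would apply to before_install with `block` and to message_sending with `cancel` in place of `block`.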

See Plugin hooks for the hook API and registration details.

Harnesses may adapt these hooks differently. The Codex app-server harness keeps OpenClaw plugin hooks as the compatibility contract for documented mirrored surfaces, while Codex native hooks remain a separate lower-level Codex mechanism.

Streaming + partial replies

Assistant deltas are streamed from pi-agent-core and emitted as assistant events.
Block streaming can emit partial replies either on text_end or message_end.
Reasoning streaming can be emitted as a separate stream or as block replies.
See Streaming for chunking and block reply behavior.

Tool execution + messaging tools

Tool start/update/end events are emitted on the tool stream.
Tool results are sanitized for size and image payloads before logging/emitting.
Messaging tool sends are tracked to suppress duplicate assistant confirmations.

Reply shaping + suppression

Final payloads are assembled from:
- assistant text (and optional reasoning)
- inline tool summaries (when verbose + allowed)
- assistant error text when the model errors
The exact silent token NO_REPLY / no_reply is filtered from outgoing payloads.
Messaging tool duplicates are removed from the final payload list.
If no renderable payloads remain and a tool errored, a fallback tool error reply is emitted (unless a messaging tool already sent a user-visible reply).
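
A rough sketch of these shaping rules follows. The `Payload` type, the `shapeReplies` helper, the whitespace trimming before comparison, and the wording of the fallback text are assumptions made for illustration. The silent tokens and the fallback condition follow the rules above.

```typescript
type Payload = { text: string };

// Exact silent tokens: a payload is dropped only when it consists of the token alone.
const SILENT_TOKENS = new Set(["NO_REPLY", "no_reply"]);

function shapeReplies(opts: {
  payloads: Payload[];
  messagingToolTexts: string[]; // texts already sent to the user by messaging tools
  toolError?: string;
}): Payload[] {
  const alreadySent = new Set(opts.messagingToolTexts.map((text) => text.trim()));
  const kept = opts.payloads.filter((payload) => {
    const text = payload.text.trim(); // assumption: compare on trimmed text
    if (text === "" || SILENT_TOKENS.has(text)) return false;
    return !alreadySent.has(text); // drop duplicates of messaging tool sends
  });
  // Fallback only when nothing is renderable, a tool errored, and no messaging
  // tool has already produced a user-visible reply.
  if (kept.length === 0 && opts.toolError !== undefined && alreadySent.size === 0) {
    return [{ text: `Tool error: ${opts.toolError}` }];
  }
  return kept;
}
```

For example, a payload list containing only `NO_REPLY` yields no outgoing messages, while text that merely mentions the token in a longer sentence passes through unchanged.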

Compaction + retries

Auto-compaction emits compaction stream events and can trigger a retry.
On retry, in-memory buffers and tool summaries are reset to avoid duplicate output.
See Compaction for the compaction pipeline.

Event streams (today)

lifecycle: emitted by subscribeEmbeddedPiSession (and as a fallback by agentCommand)
assistant: streamed deltas from pi-agent-core
tool: streamed tool events from pi-agent-core

Chat channel handling

Assistant deltas are buffered into chat delta messages.
A chat final is emitted on lifecycle end/error.
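
As a rough illustration of the buffering described above, the sketch below accumulates assistant deltas and emits a final message on a terminal lifecycle phase. The `ChatBuffer` class and the `ChatMessage` shape are hypothetical, and having each chat delta carry the accumulated text so far is an assumption, not something the text above specifies.

```typescript
type ChatMessage = { kind: "delta" | "final"; text: string };

class ChatBuffer {
  private buffered = "";

  // Accumulate an assistant delta and emit the buffered text so far (assumption).
  onAssistantDelta(delta: string): ChatMessage {
    this.buffered += delta;
    return { kind: "delta", text: this.buffered };
  }

  // "start" resets the buffer; "end" and "error" both produce the chat final.
  onLifecycle(phase: "start" | "end" | "error"): ChatMessage | undefined {
    if (phase === "start") {
      this.buffered = "";
      return undefined;
    }
    return { kind: "final", text: this.buffered };
  }
}
```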

Timeouts

agent.wait: defaults to 30s and bounds only the wait, not the run. The timeoutMs param overrides it.
Agent runtime: agents.defaults.timeoutSeconds defaults to 172800s (48 hours) and is enforced by the abort timer in runEmbeddedPiAgent.
Cron runtime: isolated agent-turn timeoutSeconds is owned by cron. The scheduler starts that timer when execution begins, aborts the underlying run at the configured deadline, then runs bounded cleanup before recording the timeout so a stale child session cannot keep the lane stuck.
Session liveness diagnostics: with diagnostics enabled, diagnostics.stuckSessionWarnMs classifies long processing sessions that have no observed reply, tool, status, block, or ACP progress. Active embedded runs, model calls, and tool calls report as session.long_running; active work with no recent progress reports as session.stalled; session.stuck is reserved for stale session bookkeeping with no active work. Stale session bookkeeping releases the affected session lane immediately; stalled embedded runs are abort-drained only after diagnostics.stuckSessionAbortMs (default: at least 10 minutes and 5x the warning threshold) so queued work can resume without cutting off merely slow runs. Recovery emits structured requested/completed outcomes, and diagnostic state is marked idle only if the same processing generation is still current. Repeated session.stuck diagnostics back off while the session remains unchanged.
Model idle timeout: OpenClaw aborts a model request when no response chunks arrive before the idle window. models.providers.<id>.timeoutSeconds extends this idle watchdog for slow local/self-hosted providers; otherwise OpenClaw uses agents.defaults.timeoutSeconds when configured, capped at 120s by default. Cron-triggered runs with no explicit model or agent timeout disable the idle watchdog and rely on the cron outer timeout.
Provider HTTP request timeout: models.providers.<id>.timeoutSeconds applies to that provider's model HTTP fetches, including connect, headers, body, SDK request timeout, total guarded-fetch abort handling, and model stream idle watchdog. Use this for slow local/self-hosted providers such as Ollama before raising the whole agent runtime timeout.
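
Two of the defaults above reduce to simple arithmetic, sketched here under stated assumptions. Both function names are hypothetical. "At least 10 minutes and 5x the warning threshold" is read as the larger of the two values, and the 120s fallback when nothing at all is configured is a guess rather than documented behavior.

```typescript
const MIN_ABORT_MS = 10 * 60 * 1000;
const IDLE_CAP_SECONDS = 120;

// Default for diagnostics.stuckSessionAbortMs, read as max(10 minutes, 5x warn).
function defaultStuckSessionAbortMs(warnMs: number): number {
  return Math.max(MIN_ABORT_MS, 5 * warnMs);
}

// Model idle watchdog window in seconds; undefined means the watchdog is disabled.
function resolveModelIdleTimeoutSeconds(opts: {
  providerTimeoutSeconds?: number; // models.providers.<id>.timeoutSeconds
  agentTimeoutSeconds?: number; // agents.defaults.timeoutSeconds, when configured
  cronTriggered?: boolean;
}): number | undefined {
  if (opts.providerTimeoutSeconds !== undefined) return opts.providerTimeoutSeconds;
  if (opts.agentTimeoutSeconds !== undefined) {
    return Math.min(opts.agentTimeoutSeconds, IDLE_CAP_SECONDS);
  }
  // Cron runs with no explicit timeout rely on the cron outer timeout instead.
  if (opts.cronTriggered) return undefined;
  return IDLE_CAP_SECONDS; // assumption: fall back to the cap when nothing is set
}
```

Under this reading, a 1 minute warning threshold gives a 10 minute abort threshold, a 5 minute threshold gives 25 minutes, and a configured 48 hour agent timeout still leaves the idle watchdog at 120s unless the provider sets its own value.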

Where things can end early

Agent timeout (abort)
AbortSignal (cancel)
Gateway disconnect or RPC timeout
agent.wait timeout (wait-only, does not stop agent)

Tools — available agent tools
Hooks — event-driven scripts triggered by agent lifecycle events
Compaction — how long conversations are summarized
Exec Approvals — approval gates for shell commands
Thinking — thinking/reasoning level configuration
