openclaw/docs/plugins/codex-harness-runtime.md at f91de52f0d23c4005ea28abaad7ba7ad78c6b50f

mirror of https://github.com/openclaw/openclaw.git synced 2026-05-18 20:24:46 +00:00

Files

Peter Steinberger f91de52f0d refactor: move runtime state to SQLite

* refactor: remove stale file-backed shims

* fix: harden sqlite state ci boundaries

* refactor: store matrix idb snapshots in sqlite

* fix: satisfy rebased CI guardrails

* refactor: store current conversation bindings in sqlite table

* refactor: store tui last sessions in sqlite table

* refactor: reset sqlite schema history

* refactor: drop unshipped sqlite table migration

* refactor: remove plugin index file rollback

* refactor: drop unshipped sqlite sidecar migrations

* refactor: remove runtime commitments kv migration

* refactor: preserve kysely sync result types

* refactor: drop unshipped sqlite schema migration table

* test: keep session usage coverage sqlite-backed

* refactor: keep sqlite migration doctor-only

* refactor: isolate device legacy imports

* refactor: isolate push voicewake legacy imports

* refactor: isolate remaining runtime legacy imports

* refactor: tighten sqlite migration guardrails

* test: cover sqlite persisted enum parsing

* refactor: isolate legacy update and tui imports

* refactor: tighten sqlite state ownership

* refactor: move legacy imports behind doctor

* refactor: remove legacy session row lookup

* refactor: canonicalize memory transcript locators

* refactor: drop transcript path scope fallbacks

* refactor: drop runtime legacy session delivery pruning

* refactor: store tts prefs only in sqlite

* refactor: remove cron store path runtime

* refactor: use cron sqlite store keys

* refactor: rename telegram message cache scope

* refactor: read memory dreaming status from sqlite

* refactor: rename cron status store key

* refactor: stop remembering transcript file paths

* test: use sqlite locators in agent fixtures

* refactor: remove file-shaped commitments and cron store surfaces

* refactor: keep compaction transcript handles out of session rows

* refactor: derive transcript handles from session identity

* refactor: derive runtime transcript handles

* refactor: remove gateway session locator reads

* refactor: remove transcript locator from session rows

* refactor: store raw stream diagnostics in sqlite

* refactor: remove file-shaped transcript rotation

* refactor: hide legacy trajectory paths from runtime

* refactor: remove runtime transcript file bridges

* refactor: repair database-first rebase fallout

* refactor: align tests with database-first state

* refactor: remove transcript file handoffs

* refactor: sync post-compaction memory by transcript scope

* refactor: run codex app-server sessions by id

* refactor: bind codex runtime state by session id

* refactor: pass memory transcripts by sqlite scope

* refactor: remove transcript locator cleanup leftovers

* test: remove stale transcript file fixtures

* refactor: remove transcript locator test helper

* test: make cron sqlite keys explicit

* test: remove cron runtime store paths

* test: remove stale session file fixtures

* test: use sqlite cron keys in diagnostics

* refactor: remove runtime delivery queue backfill

* test: drop fake export session file mocks

* refactor: rename acp session read failure flag

* refactor: rename acp row session key

* refactor: remove session store test seams

* refactor: move legacy session parser tests to doctor

* refactor: reindex managed memory in place

* refactor: drop stale session store wording

* refactor: rename session row helpers

* refactor: rename sqlite session entry modules

* refactor: remove transcript locator leftovers

* refactor: trim file-era audit wording

* refactor: clean managed media through sqlite

* fix: prefer explicit agent for exports

* fix: use prepared agent for session resets

* fix: canonicalize legacy codex binding import

* test: rename state cleanup helper

* docs: align backup docs with sqlite state

* refactor: drop legacy Pi usage auth fallback

* refactor: move legacy auth profile imports to doctor

* refactor: keep Pi model discovery auth in memory

* refactor: remove MSTeams legacy learning key fallback

* refactor: store model catalog config in sqlite

* refactor: use sqlite model catalog at runtime

* refactor: remove model json compatibility aliases

* refactor: store auth profiles in sqlite

* refactor: seed copied auth profiles in sqlite

* refactor: make auth profile runtime sqlite-addressed

* refactor: migrate hermes secrets into sqlite auth store

* refactor: move plugin install config migration to doctor

* refactor: rename plugin index audit checks

* test: drop auth file assumptions

* test: remove legacy transcript file assertions

* refactor: drop legacy cli session aliases

* refactor: store skill uploads in sqlite

* refactor: keep subagent attachments in sqlite vfs

* refactor: drop subagent attachment cleanup state

* refactor: move legacy session aliases to doctor

* refactor: require node 24 for sqlite state runtime

* refactor: move provider caches into sqlite state

* fix: harden virtual agent filesystem

* refactor: enforce database-first runtime state

* refactor: rename compaction transcript rotation setting

* test: clean sqlite refactor test types

* refactor: consolidate sqlite runtime state

* refactor: model session conversations in sqlite

* refactor: stop deriving cron delivery from session keys

* refactor: stop classifying sessions from key shape

* refactor: hydrate announce targets from typed delivery

* refactor: route heartbeat delivery from typed sqlite context

* refactor: tighten typed sqlite session routing

* refactor: remove session origin routing shadow

* refactor: drop session origin shadow fixtures

* perf: query sqlite vfs paths by prefix

* refactor: use typed conversation metadata for sessions

* refactor: prefer typed session routing metadata

* refactor: require typed session routing metadata

* refactor: resolve group tool policy from typed sessions

* refactor: delete dead session thread info bridge

* Show Codex subscription reset times in channel errors (#80456)

* feat(plugin-sdk): consolidate session workflow APIs

* fix(agents): allow read-only agent mount reads

* [codex] refresh plugin regression fixtures

* fix(agents): restore compaction gateway logs

* test: tighten gateway startup assertions

* Redact persisted secret-shaped payloads [AI] (#79006)

* test: tighten device pair notify assertions

* test: tighten hermes secret assertions

* test: assert matrix client error shapes

* test: assert config compat warnings

* fix(heartbeat): remap cron-run exec events to session keys (#80214)

* fix(codex): route btw through native side threads

* fix(auth): accept friendly OpenAI order for Codex profiles

* fix(codex): rotate auth profiles inside harness

* fix: keep browser status page probe within timeout

* test: assert agents add outputs

* test: pin cron read status

* fix(agents): avoid Pi resource discovery stalls

Co-authored-by: dataCenter430 <titan032000@gmail.com>

* fix: retire timed-out codex app-server clients

* test: tighten qa lab runtime assertions

* test: check security fix outputs

* test: verify extension runtime messages

* feat(wake): expose typed sessionKey on wake protocol + system event CLI

* fix(gateway): await session_end during shutdown drain and track channel + compaction lifecycle paths (#57790)

* test: guard talk consult call helper

* fix(codex): scale context engine projection (#80761)

* fix(codex): scale context engine projection

* fix: document Codex context projection scaling

* fix: document Codex context projection scaling

* fix: document Codex context projection scaling

* fix: document Codex context projection scaling

* chore: align Codex projection changelog

* chore: realign Codex projection changelog

* fix: isolate Codex projection patch

---------

Co-authored-by: Eva (agent) <eva+agent-78055@100yen.org>
Co-authored-by: Josh Lehman <josh@martian.engineering>

* refactor: move agent runtime state toward piless

* refactor: remove cron session reaper

* refactor: move session management to sqlite

* refactor: finish database-first state migration

* chore: refresh generated sqlite db types

* refactor: remove stale file-backed shims

* test: harden kysely type coverage

# Conflicts:
#	.agents/skills/kysely-database-access/SKILL.md
#	src/infra/kysely-sync.types.test.ts
#	src/proxy-capture/store.sqlite.test.ts
#	src/state/openclaw-agent-db.test.ts
#	src/state/openclaw-state-db.test.ts

* refactor: remove cron store path runtime

* refactor: keep compaction transcript handles out of session rows

* refactor: derive embedded transcripts from sqlite identity

* refactor: remove embedded transcript locator handoff

* refactor: remove runtime transcript file bridges

* refactor: remove transcript file handoffs

* refactor: remove MSTeams legacy learning key fallback

* refactor: store model catalog config in sqlite

* refactor: use sqlite model catalog at runtime

# Conflicts:
#	docs/cli/secrets.md
#	docs/gateway/authentication.md
#	docs/gateway/secrets.md

* fix: keep oauth sibling sync sqlite-local

# Conflicts:
#	src/commands/onboard-auth.test.ts

* refactor: remove task session store maintenance

# Conflicts:
#	src/commands/tasks.ts

* refactor: keep diagnostics in state sqlite

* refactor: enforce database-first runtime state

* refactor: consolidate sqlite runtime state

* Show Codex subscription reset times in channel errors (#80456)

* fix(codex): refresh subscription limit resets

* fix(codex): format reset times for channels

* Update CHANGELOG with latest changes and fixes

Updated CHANGELOG with recent fixes and improvements.

* fix(codex): keep command load failures on codex surface

* fix(codex): format account rate limits as rows

* fix(codex): summarize account limits as usage status

* fix(codex): simplify account limit status

* test: tighten subagent announce queue assertion

* test: tighten session delete lifecycle assertions

* test: tighten cron ops assertions

* fix: track cron execution milestones

* test: tighten hermes secret assertions

* test: assert matrix sync store payloads

* test: assert config compat warnings

* fix(codex): align btw side thread semantics

* fix(codex): honor codex fallback blocking

* fix(agents): avoid Pi resource discovery stalls

* test: tighten codex event assertions

* test: tighten cron assertions

* Fix Codex app-server OAuth harness auth

* refactor: move agent runtime state toward piless

* refactor: move device and push state to sqlite

* refactor: move runtime json state imports to doctor

* refactor: finish database-first state migration

* chore: refresh generated sqlite db types

* refactor: clarify cron sqlite store keys

* refactor: remove stale file-backed shims

* refactor: bind codex runtime state by session id

* test: expect sqlite trajectory branch export

* refactor: rename session row helpers

* fix: keep legacy device identity import in doctor

* refactor: enforce database-first runtime state

* refactor: consolidate sqlite runtime state

* build: align pi contract wrappers

* chore: repair database-first rebase

* refactor: remove session file test contracts

* test: update gateway session expectations

* refactor: stop routing from session compatibility shadows

* refactor: stop persisting session route shadows

* refactor: use typed delivery context in clients

* refactor: stop echoing session route shadows

* refactor: repair embedded runner rebase imports

# Conflicts:
#	src/agents/pi-embedded-runner/run/attempt.tool-call-argument-repair.ts

* refactor: align pi contract imports

* refactor: satisfy kysely sync helper guard

* refactor: remove file transcript bridge remnants

* refactor: remove session locator compatibility

* refactor: remove session file test contracts

* refactor: keep rebase database-first clean

* refactor: remove session file assumptions from e2e

* docs: clarify database-first goal state

* test: remove legacy store markers from sqlite runtime tests

* refactor: remove legacy store assumptions from runtime seams

* refactor: align sqlite runtime helper seams

* test: update memory recall sqlite audit mock

* refactor: align database-first runtime type seams

* test: clarify doctor cron legacy store names

* fix: preserve sqlite session route projections

* test: fix copilot token cache test syntax

* docs: update database-first proof status

* test: align database-first test fixtures

* docs: update database-first proof status

* refactor: clean extension database-first drift

* test: align agent session route proof

* test: clarify doctor legacy path fixtures

* chore: clean database-first changed checks

* chore: repair database-first rebase markers

* build: allow baileys git subdependency

* chore: repair exp-vfs rebase drift

* chore: finish exp-vfs rebase cleanup

* chore: satisfy rebase lint drift

* chore: fix qqbot rebase type seam

* chore: fix rebase drift leftovers

* fix: keep auth profile oauth secrets out of sqlite

* fix: repair rebase drift tests

* test: stabilize pairing request ordering

* test: use source manifests in plugin contract checks

* fix: restore gateway session metadata after rebase

* fix: repair database-first rebase drift

* fix: clean up database-first rebase fallout

* test: stabilize line quick reply receipt time

* fix: repair extension rebase drift

* test: keep transcript redaction tests sqlite-backed

* fix: carry injected transcript redaction through sqlite

* chore: clean database branch rebase residue

* fix: repair database branch CI drift

* fix: repair database branch CI guard drift

* fix: stabilize oauth tls preflight test

* test: align database branch fast guards

* test: repair build artifact boundary guards

* chore: clean changelog rebase markers

---------

Co-authored-by: pashpashpash <nik@vault77.ai>
Co-authored-by: Eva <eva@100yen.org>
Co-authored-by: stainlu <stainlu@newtype-ai.org>
Co-authored-by: Jason Zhou <jason.zhou.design@gmail.com>
Co-authored-by: Ruben Cuevas <hi@rubencu.com>
Co-authored-by: Pavan Kumar Gondhi <pavangondhi@gmail.com>
Co-authored-by: Shakker <shakkerdroid@gmail.com>
Co-authored-by: Kaspre <36520309+Kaspre@users.noreply.github.com>
Co-authored-by: dataCenter430 <titan032000@gmail.com>
Co-authored-by: Kaspre <kaspre@gmail.com>
Co-authored-by: pandadev66 <nova.full.stack@outlook.com>
Co-authored-by: Eva <admin@100yen.org>
Co-authored-by: Eva (agent) <eva+agent-78055@100yen.org>
Co-authored-by: Josh Lehman <josh@martian.engineering>
Co-authored-by: jeffjhunter <support@aipersonamethod.com>

2026-05-13 13:15:12 +01:00

16 KiB

Raw Blame History

summary, title, read_when

summary

title

read_when

Runtime boundaries, hooks, tools, permissions, and diagnostics for the Codex harness

Codex harness runtime

You need the Codex harness runtime support contract

You are debugging native Codex tools, hooks, compaction, or feedback upload

You are changing plugin behavior across PI and Codex harness turns

This page documents the runtime contract for Codex harness turns. For setup and routing, start with Codex harness. For config fields, see Codex harness reference.

Overview

Codex mode is not PI with a different model call underneath. Codex owns more of the native model loop, and OpenClaw adapts its plugin, tool, session, and diagnostic surfaces around that boundary.

OpenClaw still owns channel routing, SQLite session state, visible message delivery, OpenClaw dynamic tools, approvals, media delivery, and a transcript mirror. Codex owns the canonical native thread, native model loop, native tool continuation, and native compaction.

Thread bindings and model changes

When an OpenClaw session is attached to an existing Codex thread, the next turn sends the currently selected OpenAI model, approval policy, sandbox, and service tier to app-server again. Switching from openai/gpt-5.5 to openai/gpt-5.2 keeps the thread binding but asks Codex to continue with the newly selected model.

Visible replies and heartbeats

When a source chat turn runs through the Codex harness, visible replies default to the OpenClaw message tool if the deployment has not explicitly configured messages.visibleReplies. The agent can still finish its Codex turn privately; it only posts to the channel when it calls message(action="send"). Set messages.visibleReplies: "automatic" to keep direct-chat final replies on the legacy automatic delivery path.

Codex heartbeat turns also get heartbeat_respond in the searchable OpenClaw tool catalog by default, so the agent can record whether the wake should stay quiet or notify without encoding that control flow in final text.

Heartbeat-specific initiative guidance is sent as a Codex collaboration-mode developer instruction on the heartbeat turn itself. Ordinary chat turns restore Codex Default mode instead of carrying heartbeat philosophy in their normal runtime prompt.

Hook boundaries

The Codex harness has three hook layers:

Layer	Owner	Purpose
OpenClaw plugin hooks	OpenClaw	Product/plugin compatibility across PI and Codex harnesses.
Codex app-server extension middleware	OpenClaw bundled plugins	Per-turn adapter behavior around OpenClaw dynamic tools.
Codex native hooks	Codex	Low-level Codex lifecycle and native tool policy from Codex config.

OpenClaw does not use project or global Codex hooks.json files to route OpenClaw plugin behavior. For the supported native tool and permission bridge, OpenClaw injects per-thread Codex config for PreToolUse, PostToolUse, PermissionRequest, and Stop.

When Codex app-server approvals are enabled, meaning approvalPolicy is not "never", the default injected native hook config omits PermissionRequest so Codex's app-server reviewer and OpenClaw's approval bridge handle real escalations after review. Operators can explicitly add permission_request to nativeHookRelay.events when they need the compatibility relay.

Other Codex hooks such as SessionStart and UserPromptSubmit remain Codex-level controls. They are not exposed as OpenClaw plugin hooks in the v1 contract.

For OpenClaw dynamic tools, OpenClaw executes the tool after Codex asks for the call, so OpenClaw fires the plugin and middleware behavior it owns in the harness adapter. For Codex-native tools, Codex owns the canonical tool record. OpenClaw can mirror selected events, but it cannot rewrite the native Codex thread unless Codex exposes that operation through app-server or native hook callbacks.

Codex app-server item notifications also provide async after_tool_call observations for native tool completions that are not already covered by the native PostToolUse relay. These observations are for telemetry and plugin compatibility only; they cannot block, delay, or mutate the native tool call.

Compaction and LLM lifecycle projections come from Codex app-server notifications and OpenClaw adapter state, not native Codex hook commands. OpenClaw's before_compaction, after_compaction, llm_input, and llm_output events are adapter-level observations, not byte-for-byte captures of Codex's internal request or compaction payloads.

Codex native hook/started and hook/completed app-server notifications are projected as codex_app_server.hook agent events for trajectory and debugging. They do not invoke OpenClaw plugin hooks.

V1 support contract

Supported in Codex runtime v1:

Surface	Support	Why
OpenAI model loop through Codex	Supported	Codex app-server owns the OpenAI turn, native thread resume, and native tool continuation.
OpenClaw channel routing and delivery	Supported	Telegram, Discord, Slack, WhatsApp, iMessage, and other channels stay outside the model runtime.
OpenClaw dynamic tools	Supported	Codex asks OpenClaw to execute these tools, so OpenClaw stays in the execution path.
Prompt and context plugins	Supported	OpenClaw builds prompt overlays and projects context into the Codex turn before starting or resuming the thread.
Context engine lifecycle	Supported	Assemble, ingest, after-turn maintenance, and context-engine compaction coordination run for Codex turns.
Dynamic tool hooks	Supported	`before_tool_call`, `after_tool_call`, and tool-result middleware run around OpenClaw-owned dynamic tools.
Lifecycle hooks	Supported as adapter observations	`llm_input`, `llm_output`, `agent_end`, `before_compaction`, and `after_compaction` fire with honest Codex-mode payloads.
Final-answer revision gate	Supported through native hook relay	Codex `Stop` is relayed to `before_agent_finalize`; `revise` asks Codex for one more model pass before finalization.
Native shell, patch, and MCP block or observe	Supported through native hook relay	Codex `PreToolUse` and `PostToolUse` are relayed for committed native tool surfaces, including MCP payloads on Codex app-server `0.125.0` or newer. Blocking is supported; argument rewriting is not.
Native permission policy	Supported through Codex app-server approvals and compatibility native hook relay	Codex app-server approval requests route through OpenClaw after Codex review. The `PermissionRequest` native hook relay is opt-in for native approval modes because Codex emits it before guardian review.
App-server trajectory capture	Supported	OpenClaw records the request it sent to app-server and the app-server notifications it receives.

Not supported in Codex runtime v1:

Surface	V1 boundary	Future path
Native tool argument mutation	Codex native pre-tool hooks can block, but OpenClaw does not rewrite Codex-native tool arguments.	Requires Codex hook/schema support for replacement tool input.
Editable Codex-native transcript history	Codex owns canonical native thread history. OpenClaw owns a mirror and can project future context, but should not mutate unsupported internals.	Add explicit Codex app-server APIs if native thread surgery is needed.
`tool_result_persist` for Codex-native tool records	That hook transforms OpenClaw-owned transcript writes, not Codex-native tool records.	Could mirror transformed records, but canonical rewrite needs Codex support.
Rich native compaction metadata	OpenClaw observes compaction start and completion, but does not receive a stable kept/dropped list, token delta, or summary payload.	Needs richer Codex compaction events.
Compaction intervention	Current OpenClaw compaction hooks are notification-level in Codex mode.	Add Codex pre/post compaction hooks if plugins need to veto or rewrite native compaction.
Byte-for-byte model API request capture	OpenClaw can capture app-server requests and notifications, but Codex core builds the final OpenAI API request internally.	Needs a Codex model-request tracing event or debug API.

Native permissions and MCP elicitations

For PermissionRequest, OpenClaw only returns explicit allow or deny decisions when policy decides. A no-decision result is not an allow. Codex treats it as no hook decision and falls through to its own guardian or user approval path.

Codex app-server approval modes omit this native hook by default. This behavior applies when permission_request is explicitly included in nativeHookRelay.events or a compatibility runtime installs it.

When an operator chooses allow-always for a Codex native permission request, OpenClaw remembers that exact provider/session/tool input/cwd fingerprint for a bounded session window. The remembered decision is intentionally exact-match only: a changed command, arguments, tool payload, or cwd creates a fresh approval.

Codex MCP tool approval elicitations are routed through OpenClaw's plugin approval flow when Codex marks _meta.codex_approval_kind as "mcp_tool_call". Codex request_user_input prompts are sent back to the originating chat, and the next queued follow-up message answers that native server request instead of being steered as extra context. Other MCP elicitation requests fail closed.

Queue steering

Active-run queue steering maps onto Codex app-server turn/steer. With the default messages.queue.mode: "steer", OpenClaw batches queued chat messages for the configured quiet window and sends them as one turn/steer request in arrival order. Legacy queue mode sends separate turn/steer requests.

Codex review and manual compaction turns can reject same-turn steering. In that case, OpenClaw uses the follow-up queue when the selected mode allows fallback. See Steering queue.

Codex feedback upload

When /diagnostics [note] is approved for a session using the native Codex harness, OpenClaw also calls Codex app-server feedback/upload for relevant Codex threads. The upload asks app-server to include logs for each listed thread and spawned Codex subthreads when available.

The upload goes through Codex's normal feedback path to OpenAI servers. If Codex feedback is disabled in that app-server, the command returns the app-server error. The completed diagnostics reply lists the channels, OpenClaw session ids, Codex thread ids, and local codex resume <thread-id> commands for the threads that were sent.

If you deny or ignore the approval, OpenClaw does not print those Codex ids and does not send Codex feedback. The upload does not replace the local Gateway diagnostics export. See Diagnostics export for the approval, privacy, local bundle, and group-chat behavior.

Use /codex diagnostics [note] only when you specifically want the Codex feedback upload for the currently attached thread without the full Gateway diagnostics bundle.

Compaction and transcript mirror

When the selected model uses the Codex harness, native thread compaction is delegated to Codex app-server. OpenClaw keeps a transcript mirror for channel history, search, /new, /reset, and future model or harness switching.

The mirror includes the user prompt, final assistant text, and lightweight Codex reasoning or plan records when the app-server emits them. Today, OpenClaw only records native compaction start and completion signals. It does not yet expose a human-readable compaction summary or an auditable list of which entries Codex kept after compaction.

Because Codex owns the canonical native thread, tool_result_persist does not currently rewrite Codex-native tool result records. It only applies when OpenClaw is writing an OpenClaw-owned session transcript tool result.

Media and delivery

OpenClaw continues to own media delivery and media provider selection. Image, video, music, PDF, TTS, and media understanding use matching provider/model settings such as agents.defaults.imageGenerationModel, videoGenerationModel, pdfModel, and messages.tts.

Text, images, video, music, TTS, approvals, and messaging-tool output continue through the normal OpenClaw delivery path. Media generation does not require PI. When Codex emits a native image-generation item with a savedPath, OpenClaw forwards that exact file through the normal reply-media path even if the Codex turn has no assistant text.

16 KiB Raw Blame History