Document agent runtimes and the Codex v1 contract (#71270)

* Document agent runtimes and Codex v1 contract

* Clarify Codex runtime fallback docs

* Clarify runtime and harness terminology
pashpashpash
2026-04-24 18:46:24 -07:00
committed by GitHub
parent 0970507078
commit 42ec7a868f
8 changed files with 268 additions and 37 deletions

View File

@@ -11,6 +11,30 @@
"source": "Pi",
"target": "Pi"
},
{
"source": "Agent runtimes",
"target": "Agent Runtimes"
},
{
"source": "Agent Runtimes",
"target": "Agent Runtimes"
},
{
"source": "Codex harness",
"target": "Codex harness"
},
{
"source": "Agent harness plugins",
"target": "Agent harness plugins"
},
{
"source": "Agent loop",
"target": "Agent loop"
},
{
"source": "Models",
"target": "Models"
},
{
"source": "Skills",
"target": "Skills"

View File

@@ -21,7 +21,7 @@ Notes:
- `--deep` runs live probes (WhatsApp Web + Telegram + Discord + Slack + Signal).
- `--usage` prints normalized provider usage windows as `X% left`.
- Session status output separates `Execution:` from `Runtime:`. `Execution` is the sandbox path (`direct`, `docker/*`), while `Runtime` tells you whether the session is using `OpenClaw Pi Default`, `OpenAI Codex`, a CLI backend, or an ACP backend such as `codex (acp/acpx)`.
- Session status output separates `Execution:` from `Runtime:`. `Execution` is the sandbox path (`direct`, `docker/*`), while `Runtime` tells you whether the session is using `OpenClaw Pi Default`, `OpenAI Codex`, a CLI backend, or an ACP backend such as `codex (acp/acpx)`. See [Agent runtimes](/concepts/agent-runtimes) for the provider/model/runtime distinction.
- MiniMax's raw `usage_percent` / `usagePercent` fields are remaining quota, so OpenClaw inverts them before display; count-based fields win when present. `model_remains` responses prefer the chat-model entry, derive the window label from timestamps when needed, and include the model name in the plan label.
- When the current session snapshot is sparse, `/status` can backfill token and cache counters from the most recent transcript usage log. Existing nonzero live values still win over transcript fallback values.
- Transcript fallback can also recover the active runtime model label when the live session entry is missing it. If that transcript model differs from the selected model, status resolves the context window against the recovered runtime model instead of the selected one.

View File

@@ -0,0 +1,127 @@
---
summary: "How OpenClaw separates model providers, models, channels, and agent runtimes"
title: "Agent runtimes"
read_when:
- You are choosing between PI, Codex, ACP, or another native agent runtime
- You are confused by provider/model/runtime labels in status or config
- You are documenting support parity for a native harness
---
An **agent runtime** is the component that owns one prepared model loop: it
receives the prompt, drives model output, handles native tool calls, and returns
the finished turn to OpenClaw.
Runtimes are easy to confuse with providers because both show up near model
configuration. They are different layers:
| Layer | Examples | What It Means |
| ------------- | ------------------------------------- | ------------------------------------------------------------------- |
| Provider | `openai`, `anthropic`, `openai-codex` | How OpenClaw authenticates, discovers models, and names model refs. |
| Model | `gpt-5.5`, `claude-opus-4-6` | The model selected for the agent turn. |
| Agent runtime | `pi`, `codex`, ACP-backed runtimes    | The low-level loop that executes the prepared turn.                  |
| Channel | Telegram, Discord, Slack, WhatsApp | Where messages enter and leave OpenClaw. |
You will also see the word **harness** in code and config. A harness is the
implementation that provides an agent runtime. For example, the bundled Codex
harness implements the `codex` runtime. The config key is still named
`embeddedHarness` for compatibility, but user-facing docs and status output
should generally say "runtime".
The common Codex setup uses the `openai` provider with the `codex` runtime:
```json5
{
agents: {
defaults: {
model: "openai/gpt-5.5",
embeddedHarness: {
runtime: "codex",
},
},
},
}
```
That means OpenClaw selects an OpenAI model ref, then asks the Codex app-server
runtime to run the embedded agent turn. It does not mean the channel, model
provider catalog, or OpenClaw session store becomes Codex.
## Runtime ownership
Different runtimes own different amounts of the loop.
| Surface | OpenClaw PI embedded | Codex app-server |
| --------------------------- | --------------------------------------- | --------------------------------------------------------------------------- |
| Model loop owner | OpenClaw through the PI embedded runner | Codex app-server |
| Canonical thread state | OpenClaw transcript | Codex thread, plus OpenClaw transcript mirror |
| OpenClaw dynamic tools | Native OpenClaw tool loop | Bridged through the Codex adapter |
| Native shell and file tools | PI/OpenClaw path | Codex-native tools, bridged through native hooks where supported |
| Context engine | Native OpenClaw context assembly | OpenClaw projects assembled context into the Codex turn |
| Compaction | OpenClaw or selected context engine | Codex-native compaction, with OpenClaw notifications and mirror maintenance |
| Channel delivery | OpenClaw | OpenClaw |
This ownership split is the main design rule:
- If OpenClaw owns the surface, OpenClaw can provide normal plugin hook behavior.
- If the native runtime owns the surface, OpenClaw needs runtime events or native hooks.
- If the native runtime owns canonical thread state, OpenClaw should mirror and project context, not rewrite unsupported internals.
## Runtime selection
OpenClaw chooses an embedded runtime after provider and model resolution:
1. A session's recorded runtime wins. Config changes do not hot-switch an
existing transcript to a different native thread system.
2. `OPENCLAW_AGENT_RUNTIME=<id>` forces that runtime for new or reset sessions.
3. `agents.defaults.embeddedHarness.runtime` or
`agents.list[].embeddedHarness.runtime` can set `auto`, `pi`, or a registered
runtime id such as `codex`.
4. In `auto` mode, registered plugin runtimes can claim supported provider/model
pairs.
5. If no runtime claims a turn in `auto` mode and `fallback: "pi"` is set
(the default), OpenClaw uses PI as the compatibility fallback. Set
`fallback: "none"` to make unmatched `auto`-mode selection fail instead.
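For example, rule 2 can be exercised with an environment override. This is a sketch: the environment variable comes from the list above, but the `openclaw` command name is an assumption for illustration.

```shell
# Force the codex runtime for new or reset sessions (binary name assumed).
OPENCLAW_AGENT_RUNTIME=codex openclaw
```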
Explicit plugin runtimes fail closed by default. For example,
`runtime: "codex"` means Codex or a clear selection error unless you set
`fallback: "pi"` in the same override scope.
## Compatibility contract
When a runtime is not PI, it should document which OpenClaw surfaces it supports.
Use this shape for runtime docs:
| Question | Why It Matters |
| -------------------------------------- | ------------------------------------------------------------------------------------------------- |
| Who owns the model loop? | Determines where retries, tool continuation, and final answer decisions happen. |
| Who owns canonical thread history? | Determines whether OpenClaw can edit history or only mirror it. |
| Do OpenClaw dynamic tools work? | Messaging, sessions, cron, and OpenClaw-owned tools rely on this. |
| Do dynamic tool hooks work? | Plugins expect `before_tool_call`, `after_tool_call`, and middleware around OpenClaw-owned tools. |
| Do native tool hooks work? | Shell, patch, and runtime-owned tools need native hook support for policy and observation. |
| Does the context engine lifecycle run? | Memory and context plugins depend on the assemble, ingest, after-turn, and compaction lifecycle. |
| What compaction data is exposed? | Some plugins only need notifications, while others need kept/dropped metadata. |
| What is intentionally unsupported? | Users should not assume PI equivalence where the native runtime owns more state. |
The Codex runtime support contract is documented in
[Codex harness](/plugins/codex-harness#v1-support-contract).
## Status labels
Status output may show both `Execution` and `Runtime` labels. Read them as
diagnostics, not as provider names.
- A model ref such as `openai/gpt-5.5` tells you the selected provider/model.
- A runtime id such as `codex` tells you which loop is executing the turn.
- A channel label such as Telegram or Discord tells you where the conversation is happening.
If a session still shows PI after changing runtime config, start a new session
with `/new` or clear the current one with `/reset`. Existing sessions keep their
recorded runtime so a transcript is not replayed through two incompatible native
session systems.
## Related
- [Codex harness](/plugins/codex-harness)
- [Agent harness plugins](/plugins/sdk-agent-harness)
- [Agent loop](/concepts/agent-loop)
- [Models](/concepts/models)

View File

@@ -10,6 +10,11 @@ title: "Models CLI"
See [/concepts/model-failover](/concepts/model-failover) for auth profile
rotation, cooldowns, and how that interacts with fallbacks.
Quick provider overview + examples: [/concepts/model-providers](/concepts/model-providers).
Model refs choose a provider and model. They do not usually choose the
low-level agent runtime. For example, `openai/gpt-5.5` can run through the
normal OpenAI provider path or through the Codex app-server runtime, depending
on `agents.defaults.embeddedHarness.runtime`. See
[/concepts/agent-runtimes](/concepts/agent-runtimes).
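A minimal sketch of that distinction, assuming the `embeddedHarness` shape described in the agent-runtimes docs:

```json5
{
  agents: {
    defaults: {
      // The model ref picks provider + model only.
      model: "openai/gpt-5.5",
      // The runtime picks the low-level loop. Omit this block (or use "pi")
      // for the normal provider path; "codex" routes the same model ref
      // through the Codex app-server runtime.
      embeddedHarness: { runtime: "codex" },
    },
  },
}
```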
## How model selection works
@@ -278,6 +283,7 @@ This applies whenever OpenClaw regenerates `models.json`, including command-driv
## Related
- [Model Providers](/concepts/model-providers) — provider routing and auth
- [Agent Runtimes](/concepts/agent-runtimes) — PI, Codex, and other agent loop runtimes
- [Model Failover](/concepts/model-failover) — fallback chains
- [Image Generation](/tools/image-generation) — image model configuration
- [Music Generation](/tools/music-generation) — music model configuration

View File

@@ -1085,6 +1085,7 @@
"concepts/architecture",
"concepts/agent",
"concepts/agent-loop",
"concepts/agent-runtimes",
"concepts/system-prompt",
"concepts/context",
"concepts/context-engine",

View File

@@ -369,7 +369,7 @@ Time format in system prompt. Default: `auto` (OS preference).
- For direct OpenAI Responses models, server-side compaction is enabled automatically. Use `params.responsesServerCompaction: false` to stop injecting `context_management`, or `params.responsesCompactThreshold` to override the threshold. See [OpenAI server-side compaction](/providers/openai#server-side-compaction-responses-api).
- `params`: global default provider parameters applied to all models. Set at `agents.defaults.params` (e.g. `{ cacheRetention: "long" }`).
- `params` merge precedence (config): `agents.defaults.params` (global base) is overridden by `agents.defaults.models["provider/model"].params` (per-model), then `agents.list[].params` (matching agent id) overrides by key. See [Prompt Caching](/reference/prompt-caching) for details.
- `embeddedHarness`: default low-level embedded agent runtime policy. Omitted runtime defaults to OpenClaw Pi. Use `runtime: "pi"` to force the built-in PI harness, `runtime: "auto"` to let registered plugin harnesses claim supported models, or a registered harness id such as `runtime: "codex"`. Set `fallback: "none"` to disable automatic PI fallback. Keep model refs canonical as `provider/model`; select Codex, Claude CLI, Gemini CLI, and other execution backends through runtime config instead of legacy runtime provider prefixes.
- `embeddedHarness`: default low-level embedded agent runtime policy. Omitted runtime defaults to OpenClaw Pi. Use `runtime: "pi"` to force the built-in PI harness, `runtime: "auto"` to let registered plugin harnesses claim supported models, or a registered harness id such as `runtime: "codex"`. Set `fallback: "none"` to disable automatic PI fallback. Explicit plugin runtimes such as `codex` fail closed by default unless you set `fallback: "pi"` in the same override scope. Keep model refs canonical as `provider/model`; select Codex, Claude CLI, Gemini CLI, and other execution backends through runtime config instead of legacy runtime provider prefixes. See [Agent runtimes](/concepts/agent-runtimes) for how this differs from provider/model selection.
- Config writers that mutate these fields (for example `/models set`, `/models set-image`, and fallback add/remove commands) save canonical object form and preserve existing fallback lists when possible.
- `maxConcurrent`: max parallel agent runs across sessions (each session still serialized). Default: 4.
@@ -378,7 +378,8 @@ Time format in system prompt. Default: `auto` (OS preference).
`embeddedHarness` controls which low-level executor runs embedded agent turns.
Most deployments should keep the default OpenClaw Pi runtime.
Use it when a trusted plugin provides a native harness, such as the bundled
Codex app-server harness.
Codex app-server harness. For the mental model, see
[Agent runtimes](/concepts/agent-runtimes).
```json5
{

View File

@@ -15,6 +15,11 @@ discovery, native thread resume, native compaction, and app-server execution.
OpenClaw still owns chat channels, session files, model selection, tools,
approvals, media delivery, and the visible transcript mirror.
If you are trying to orient yourself, start with
[Agent runtimes](/concepts/agent-runtimes). The short version is:
`openai/gpt-5.5` is the model ref, `codex` is the runtime, and Telegram,
Discord, Slack, or another channel remains the communication surface.
Native Codex turns keep OpenClaw plugin hooks as the public compatibility layer.
These are in-process OpenClaw hooks, not Codex `hooks.json` command hooks:
@@ -124,7 +129,6 @@ Use `openai/gpt-5.5`, enable the bundled plugin, and force the `codex` harness:
model: "openai/gpt-5.5",
embeddedHarness: {
runtime: "codex",
fallback: "none",
},
},
},
@@ -150,11 +154,24 @@ Legacy configs that set `agents.defaults.model` or an agent model to
`codex/<model>` still auto-enable the bundled `codex` plugin. New configs should
prefer `openai/<model>` plus the explicit `embeddedHarness` entry above.
## Add Codex without replacing other models
## Add Codex alongside other models
Keep `runtime: "auto"` when you want legacy `codex/*` refs to select Codex,
with PI for everything else. For new configs, prefer explicit `runtime: "codex"`
on the agents that should use the harness.
Do not set `runtime: "codex"` globally if the same agent should freely switch
between Codex and non-Codex provider models. A forced runtime applies to every
embedded turn for that agent or session. If you select an Anthropic model while
that runtime is forced, OpenClaw still tries the Codex harness and fails closed
instead of silently routing that turn through PI.
Use one of these shapes instead:
- Put Codex on a dedicated agent with `embeddedHarness.runtime: "codex"`.
- Keep the default agent on `runtime: "auto"` and PI fallback for normal mixed
provider usage.
- Use legacy `codex/*` refs only for compatibility. New configs should prefer
`openai/*` plus an explicit Codex runtime policy.
For example, this keeps the default agent on normal automatic selection and
adds a separate Codex agent:
```json5
{
@@ -167,28 +184,36 @@ the agents that should use the harness.
},
agents: {
defaults: {
model: {
primary: "openai/gpt-5.5",
fallbacks: ["openai/gpt-5.5", "anthropic/claude-opus-4-6"],
},
models: {
"openai/gpt-5.5": { alias: "gpt" },
"anthropic/claude-opus-4-6": { alias: "opus" },
},
embeddedHarness: {
runtime: "codex",
runtime: "auto",
fallback: "pi",
},
},
list: [
{
id: "main",
default: true,
model: "anthropic/claude-opus-4-6",
},
{
id: "codex",
name: "Codex",
model: "openai/gpt-5.5",
embeddedHarness: {
runtime: "codex",
},
},
],
},
}
```
With this shape:
- `/model gpt` or `/model openai/gpt-5.5` uses the Codex app-server harness for this config.
- `/model opus` uses the Anthropic provider path.
- If a non-Codex model is selected, PI remains the compatibility harness.
- The default `main` agent uses the normal provider path and PI compatibility fallback.
- The `codex` agent uses the Codex app-server harness.
- If Codex is missing or unsupported for the `codex` agent, the turn fails
instead of quietly using PI.
## Codex-only deployments
@@ -418,12 +443,17 @@ Local Codex with default stdio transport:
}
```
Codex-only harness validation, with PI fallback disabled:
Codex-only harness validation:
```json5
{
embeddedHarness: {
fallback: "none",
agents: {
defaults: {
model: "openai/gpt-5.5",
embeddedHarness: {
runtime: "codex",
},
},
},
plugins: {
entries: {
@@ -545,6 +575,40 @@ Codex native `hook/started` and `hook/completed` app-server notifications are
projected as `codex_app_server.hook` agent events for trajectory and debugging.
They do not invoke OpenClaw plugin hooks.
## V1 support contract
Codex mode is not PI with a different model call underneath. Codex owns more of
the native model loop, and OpenClaw adapts its plugin and session surfaces
around that boundary.
Supported in Codex runtime v1:
| Surface | Support | Why |
| --------------------------------------- | --------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------ |
| OpenAI model loop through Codex | Supported | Codex app-server owns the OpenAI turn, native thread resume, and native tool continuation. |
| OpenClaw channel routing and delivery | Supported | Telegram, Discord, Slack, WhatsApp, iMessage, and other channels stay outside the model runtime. |
| OpenClaw dynamic tools | Supported | Codex asks OpenClaw to execute these tools, so OpenClaw stays in the execution path. |
| Prompt and context plugins | Supported | OpenClaw builds prompt overlays and projects context into the Codex turn before starting or resuming the thread. |
| Context engine lifecycle | Supported | Assemble, ingest or after-turn maintenance, and context-engine compaction coordination run for Codex turns. |
| Dynamic tool hooks | Supported | `before_tool_call`, `after_tool_call`, and tool-result middleware run around OpenClaw-owned dynamic tools. |
| Lifecycle hooks | Supported as adapter observations | `llm_input`, `llm_output`, `agent_end`, `before_compaction`, and `after_compaction` fire with honest Codex-mode payloads. |
| Native shell and patch block or observe | Supported through the native hook relay | Codex `PreToolUse` and `PostToolUse` are relayed for supported Codex-native tools. Blocking is supported; argument rewriting is not. |
| Native permission policy | Supported through the native hook relay | Codex `PermissionRequest` can be routed through OpenClaw policy where the runtime exposes it. |
| App-server trajectory capture | Supported | OpenClaw records the request it sent to app-server and the app-server notifications it receives. |
Not supported in Codex runtime v1:
| Surface | V1 Boundary | Future Path |
| --------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------- |
| Native tool argument mutation | Codex native pre-tool hooks can block, but OpenClaw does not rewrite Codex-native tool arguments. | Requires Codex hook/schema support for replacement tool input. |
| Editable Codex-native transcript history | Codex owns canonical native thread history. OpenClaw owns a mirror and can project future context, but should not mutate unsupported internals. | Add explicit Codex app-server APIs if native thread surgery is needed. |
| `tool_result_persist` for Codex-native tool records | That hook transforms OpenClaw-owned transcript writes, not Codex-native tool records. | Could mirror transformed records, but canonical rewrite needs Codex support. |
| Rich native compaction metadata | OpenClaw observes compaction start and completion, but does not receive a stable kept/dropped list, token delta, or summary payload. | Needs richer Codex compaction events. |
| Compaction intervention | Current OpenClaw compaction hooks are notification-level in Codex mode. | Add Codex pre/post compaction hooks if plugins need to veto or rewrite native compaction. |
| Stop or final-answer gating | Codex has native stop hooks, but OpenClaw does not expose final-answer gating as a v1 plugin contract. | Future opt-in primitive with loop and timeout safeguards. |
| Native MCP hook parity | Codex owns MCP execution, and full pre/post hook payload parity depends on Codex MCP handler support. | Add Codex MCP hook payloads, then relay them through the same native hook path. |
| Byte-for-byte model API request capture | OpenClaw can capture app-server requests and notifications, but Codex core builds the final OpenAI API request internally. | Needs a Codex model-request tracing event or debug API. |
## Tools, media, and compaction
The Codex harness changes the low-level embedded agent executor only.
@@ -601,12 +665,15 @@ or disable discovery.
and that the remote app-server speaks the same Codex app-server protocol version.
**A non-Codex model uses PI:** that is expected unless you forced
`embeddedHarness.runtime: "codex"` (or selected a legacy `codex/*` ref). Plain
`openai/gpt-*` and other provider refs stay on their normal provider path.
`embeddedHarness.runtime: "codex"` for that agent or selected a legacy
`codex/*` ref. Plain `openai/gpt-*` and other provider refs stay on their normal
provider path in `auto` mode. If you force `runtime: "codex"`, every embedded
turn for that agent must be a Codex-supported OpenAI model.
## Related
- [Agent Harness Plugins](/plugins/sdk-agent-harness)
- [Agent runtimes](/concepts/agent-runtimes)
- [Model Providers](/concepts/model-providers)
- [Configuration Reference](/gateway/configuration-reference)
- [Testing](/help/testing-live#live-codex-app-server-harness-smoke)

View File

@@ -10,6 +10,7 @@ read_when:
An **agent harness** is the low-level executor for one prepared OpenClaw agent
turn. It is not a model provider, not a channel, and not a tool registry.
For the user-facing mental model, see [Agent runtimes](/concepts/agent-runtimes).
Use this surface only for bundled or trusted native plugins. The contract is
still experimental because the parameter types intentionally mirror the current
@@ -123,9 +124,10 @@ OpenClaw. The harness then claims that provider in `supports(...)`.
The bundled Codex plugin follows this pattern:
- provider id: `codex`
- user model refs: `openai/gpt-5.5` plus `embeddedHarness.runtime: "codex"`;
legacy `codex/gpt-*` refs remain accepted for compatibility
- preferred user model refs: `openai/gpt-5.5` plus
`embeddedHarness.runtime: "codex"`
- compatibility refs: legacy `codex/gpt-*` refs remain accepted, but new
configs should not use them as normal provider/model refs
- harness id: `codex`
- auth: synthetic provider availability, because the Codex harness owns the
native Codex login/session
@@ -170,10 +172,11 @@ model refs remain compatibility aliases for the native harness.
When this mode runs, Codex owns the native thread id, resume behavior,
compaction, and app-server execution. OpenClaw still owns the chat channel,
visible transcript mirror, tool policy, approvals, media delivery, and session
selection. Use `embeddedHarness.runtime: "codex"` with
`embeddedHarness.fallback: "none"` when you need to prove that only the Codex
app-server path can claim the run. That config is only a selection guard:
Codex app-server failures already fail directly instead of retrying through PI.
selection. Use `embeddedHarness.runtime: "codex"` without a `fallback` override
when you need to prove that only the Codex app-server path can claim the run.
Explicit plugin runtimes already fail closed by default. Set `fallback: "pi"`
only when you intentionally want PI to handle missing harness selection. Codex
app-server failures already fail directly instead of retrying through PI.
## Disable PI fallback
@@ -182,9 +185,12 @@ set to `{ runtime: "auto", fallback: "pi" }`. In `auto` mode, registered plugin
harnesses can claim a provider/model pair. If none match, OpenClaw falls back
to PI.
Set `fallback: "none"` when you need missing plugin harness selection to fail
instead of using PI. Selected plugin harness failures already fail hard. This
does not block an explicit `runtime: "pi"` or `OPENCLAW_AGENT_RUNTIME=pi`.
In `auto` mode, set `fallback: "none"` when you need missing plugin harness
selection to fail instead of using PI. Explicit plugin runtimes such as
`runtime: "codex"` already fail closed by default, unless `fallback: "pi"` is
set in the same config or environment override scope. Selected plugin harness
failures always fail hard. This does not block an explicit `runtime: "pi"` or
`OPENCLAW_AGENT_RUNTIME=pi`.
For Codex-only embedded runs:
@@ -194,8 +200,7 @@ For Codex-only embedded runs:
"defaults": {
"model": "openai/gpt-5.5",
"embeddedHarness": {
"runtime": "codex",
"fallback": "none"
"runtime": "codex"
}
}
}