| summary | title | sidebarTitle | read_when |
|---|---|---|---|
| Experimental SDK surface for plugins that replace the low-level embedded agent executor | Agent harness plugins | Agent Harness | |
An agent harness is the low-level executor for one prepared OpenClaw agent turn. It is not a model provider, not a channel, and not a tool registry. For the user-facing mental model, see Agent runtimes.
Use this surface only for bundled or trusted native plugins. The contract is still experimental because the parameter types intentionally mirror the current embedded runner.
When to use a harness
Register an agent harness when a model family has its own native session runtime and the normal OpenClaw provider transport is the wrong abstraction.
Examples:
- a native coding-agent server that owns threads and compaction
- a local CLI or daemon that must stream native plan/reasoning/tool events
- a model runtime that needs its own resume id in addition to the OpenClaw session transcript
Do not register a harness just to add a new LLM API. For normal HTTP or WebSocket model APIs, build a provider plugin.
What core still owns
Before a harness is selected, OpenClaw has already resolved:
- provider and model
- runtime auth state
- thinking level and context budget
- the OpenClaw transcript/session file
- workspace, sandbox, and tool policy
- channel reply callbacks and streaming callbacks
- model fallback and live model switching policy
That split is intentional. A harness runs a prepared attempt; it does not pick providers, replace channel delivery, or silently switch models.
The prepared attempt also includes params.runtimePlan, an OpenClaw-owned
policy bundle for runtime decisions that must stay shared across OpenClaw and native
harnesses:
- runtimePlan.tools.normalize(...) and runtimePlan.tools.logDiagnostics(...) for provider-aware tool schema policy
- runtimePlan.transcript.resolvePolicy(...) for transcript sanitization and tool-call repair policy
- runtimePlan.delivery.isSilentPayload(...) for shared NO_REPLY and media delivery suppression
- runtimePlan.outcome.classifyRunResult(...) for model fallback classification
- runtimePlan.observability for resolved provider/model/harness metadata
Harnesses may use the plan for decisions that need to match OpenClaw behavior, but should still treat it as host-owned attempt state. Do not mutate it or use it to switch providers/models inside a turn.
Register a harness
Import: openclaw/plugin-sdk/agent-harness
import type { AgentHarness } from "openclaw/plugin-sdk/agent-harness";
import { definePluginEntry } from "openclaw/plugin-sdk/plugin-entry";

const myHarness: AgentHarness = {
  id: "my-harness",
  label: "My native agent harness",

  supports(ctx) {
    return ctx.provider === "my-provider"
      ? { supported: true, priority: 100 }
      : { supported: false };
  },

  async runAttempt(params) {
    // Start or resume your native thread.
    // Use params.prompt, params.tools, params.images, params.onPartialReply,
    // params.onAgentEvent, and the other prepared attempt fields.
    return await runMyNativeTurn(params);
  },
};

export default definePluginEntry({
  id: "my-native-agent",
  name: "My Native Agent",
  description: "Runs selected models through a native agent daemon.",
  register(api) {
    api.registerAgentHarness(myHarness);
  },
});
Selection policy
OpenClaw chooses a harness after provider/model resolution:
- Model-scoped runtime policy wins.
- Provider-scoped runtime policy comes next.
- auto asks registered harnesses if they support the resolved provider/model.
- If no registered harness matches, OpenClaw uses its embedded runtime.
Plugin harness failures surface as run failures. In auto mode, embedded fallback is
only used when no registered plugin harness supports the resolved
provider/model. Once a plugin harness has claimed a run, OpenClaw does not
replay that same turn through another runtime because that can change
auth/runtime semantics or duplicate side effects.
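The selection order above can be sketched as a small function. This is only an illustration under stated assumptions, not OpenClaw's selector: the `HarnessLike`, `RuntimePolicy`, and `selectHarnessId` names are hypothetical, the context shape is simplified, and the rule that the highest `priority` wins among `auto` candidates is an assumption this page does not confirm.

```typescript
// Hypothetical local types; they only mirror the shapes described on this page.
type SupportResult = { supported: true; priority?: number } | { supported: false };

interface HarnessLike {
  id: string;
  supports(ctx: { provider: string; modelId: string }): SupportResult;
}

interface RuntimePolicy {
  // Each value is "auto" or an explicit runtime id such as "codex" or "openclaw".
  modelScoped?: string;
  providerScoped?: string;
}

const EMBEDDED_ID = "openclaw";

function selectHarnessId(
  registered: HarnessLike[],
  ctx: { provider: string; modelId: string },
  policy: RuntimePolicy,
): string {
  // Model-scoped policy wins; provider-scoped policy comes next.
  const requested = policy.modelScoped ?? policy.providerScoped ?? "auto";

  if (requested === EMBEDDED_ID) return EMBEDDED_ID;

  if (requested !== "auto") {
    // Explicit plugin runtime: fail instead of falling back to the embedded runtime.
    const harness = registered.find((h) => h.id === requested);
    if (!harness || !harness.supports(ctx).supported) {
      throw new Error(
        `agent harness "${requested}" cannot claim ${ctx.provider}/${ctx.modelId}`,
      );
    }
    return harness.id;
  }

  // auto: ask every registered harness (assumption: highest priority wins).
  let best: { id: string; priority: number } | undefined;
  for (const harness of registered) {
    const result = harness.supports(ctx);
    if (!result.supported) continue;
    const priority = result.priority ?? 0;
    if (!best || priority > best.priority) best = { id: harness.id, priority };
  }

  // No registered harness matched: use the embedded runtime.
  return best ? best.id : EMBEDDED_ID;
}
```

Note that the embedded fallback is reached only from the `auto` branch when no candidate matches; once a harness id is returned, nothing in the sketch retries the turn elsewhere.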
Whole-session and whole-agent runtime pins are ignored by selection. That
includes stale session agentHarnessId values, agents.defaults.agentRuntime,
agents.list[].agentRuntime, and OPENCLAW_AGENT_RUNTIME. /status shows the
effective runtime selected from the provider/model route.
If the selected harness is surprising, enable agents/harness debug logging and
inspect the gateway's structured agent harness selected record. It includes
the selected harness id, selection reason, runtime/fallback policy, and, in
auto mode, each plugin candidate's support result.
The bundled Codex plugin registers codex as its harness id. Core treats that
as an ordinary plugin harness id; Codex-specific aliases belong in the plugin
or operator config, not in the shared runtime selector.
Provider plus harness pairing
Most harnesses should also register a provider. The provider makes model refs,
auth status, model metadata, and /model selection visible to the rest of
OpenClaw. The harness then claims that provider in supports(...).
The bundled Codex plugin follows this pattern:
- preferred user model refs: openai/gpt-5.5
- compatibility refs: legacy codex/gpt-* refs remain accepted, but new configs should not use them as normal provider/model refs
- harness id: codex
- auth: synthetic provider availability, because the Codex harness owns the native Codex login/session
- app-server request: OpenClaw sends the bare model id to Codex and lets the harness talk to the native app-server protocol
The Codex plugin is additive. Plain openai/gpt-* agent refs on the official
OpenAI provider select the Codex harness by default. Older codex/gpt-* refs
still select the Codex provider and harness for compatibility.
For operator setup, model prefix examples, and Codex-only configs, see Codex Harness.
OpenClaw requires Codex app-server 0.125.0 or newer. The Codex plugin checks
the app-server initialize handshake and blocks older or unversioned servers so
OpenClaw only runs against the protocol surface it has been tested with. The
0.125.0 floor includes the native MCP hook payload support that landed in
Codex 0.124.0, while pinning OpenClaw to the newer tested stable line.
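A version floor of this kind could be checked roughly as follows. This is a minimal sketch, not the Codex plugin's actual handshake code: `parseVersion` and `isSupportedAppServerVersion` are hypothetical names, the reported version is assumed to arrive as an optional string, and pre-release suffixes are ignored for simplicity.

```typescript
// The floor comes from this page; everything else in this block is illustrative.
const MIN_APP_SERVER_VERSION = "0.125.0";

interface Version {
  major: number;
  minor: number;
  patch: number;
}

// Parses a leading "major.minor.patch"; anything after it is ignored.
function parseVersion(raw: string): Version | undefined {
  const match = /^(\d+)\.(\d+)\.(\d+)/.exec(raw.trim());
  if (!match) return undefined;
  return {
    major: Number(match[1]),
    minor: Number(match[2]),
    patch: Number(match[3]),
  };
}

// Returns false for older, unversioned, or unparseable servers.
function isSupportedAppServerVersion(reported: string | undefined): boolean {
  if (reported === undefined) return false;
  const actual = parseVersion(reported);
  const floor = parseVersion(MIN_APP_SERVER_VERSION);
  if (!actual || !floor) return false;
  if (actual.major !== floor.major) return actual.major > floor.major;
  if (actual.minor !== floor.minor) return actual.minor > floor.minor;
  return actual.patch >= floor.patch;
}
```

Treating a missing or unparseable version as unsupported matches the "blocks older or unversioned servers" behavior described above.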
Tool-result middleware
Bundled plugins can attach runtime-neutral tool-result middleware through
api.registerAgentToolResultMiddleware(...) when their manifest declares the
targeted runtime ids in contracts.agentToolResultMiddleware. This trusted
seam is for async tool-result transforms that must run before OpenClaw or Codex feeds
tool output back into the model.
Legacy bundled plugins can still use
api.registerCodexAppServerExtensionFactory(...) for Codex app-server-only
middleware, but new result transforms should use the runtime-neutral API.
The embedded-runner-only api.registerEmbeddedExtensionFactory(...) hook has been removed;
embedded tool-result transforms must use runtime-neutral middleware.
Terminal outcome classification
Native harnesses that own their own protocol projection can use
classifyAgentHarnessTerminalOutcome(...) from
openclaw/plugin-sdk/agent-harness-runtime when a completed turn produced no
visible assistant text. The helper returns empty, reasoning-only, or
planning-only so OpenClaw's fallback policy can decide whether to retry on a
different model. It intentionally leaves prompt errors, in-flight turns, and
intentional silent replies such as NO_REPLY unclassified.
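To make the three outcomes concrete, here is a hedged sketch of the kind of decision such a classifier makes. It is not the SDK helper: the `TurnSnapshot` shape and the `classifyTerminalOutcomeSketch` name are hypothetical, and checking plan text before reasoning text is an assumption.

```typescript
type TerminalOutcome = "empty" | "reasoning-only" | "planning-only";

// Hypothetical projection of a native turn; the real SDK input type may differ.
interface TurnSnapshot {
  status: "completed" | "in-flight" | "prompt-error";
  assistantText: string;
  reasoningText: string;
  planText: string;
}

function classifyTerminalOutcomeSketch(turn: TurnSnapshot): TerminalOutcome | undefined {
  // Prompt errors and in-flight turns are left unclassified.
  if (turn.status !== "completed") return undefined;

  // Any assistant text, including an intentional NO_REPLY, is left unclassified:
  // the classification only applies when no visible assistant text was produced.
  if (turn.assistantText.trim().length > 0) return undefined;

  // Assumption: plan output takes precedence over reasoning output.
  if (turn.planText.trim().length > 0) return "planning-only";
  if (turn.reasoningText.trim().length > 0) return "reasoning-only";
  return "empty";
}
```

An undefined result means "do not feed this turn into fallback classification", which keeps silent replies from being retried on a different model.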
Native Codex harness mode
The bundled codex harness is the native Codex mode for embedded OpenClaw
agent turns. Enable the bundled codex plugin first, and include codex in
plugins.allow if your config uses a restrictive allowlist. Native app-server
configs should use openai/gpt-*; OpenAI agent turns select the Codex harness
by default. Legacy openai-codex/* routes should be repaired with
openclaw doctor --fix, and legacy codex/* model refs remain compatibility
aliases for the native harness.
When this mode runs, Codex owns the native thread id, resume behavior,
compaction, and app-server execution. OpenClaw still owns the chat channel,
visible transcript mirror, tool policy, approvals, media delivery, and session
selection. Use provider/model agentRuntime.id: "codex" when you need to prove
that only the Codex app-server path can claim the run. Explicit plugin runtimes
fail closed; Codex app-server selection failures and runtime failures are not
retried through another runtime.
Runtime strictness
By default, OpenClaw uses auto provider/model runtime policy: registered
plugin harnesses can claim a provider/model pair, and the embedded runtime
handles the turn when none match. OpenAI agent refs on the official OpenAI provider default to Codex.
Use an explicit provider/model plugin runtime such as
agentRuntime.id: "codex" when a missing harness selection should fail instead
of routing through the embedded runtime. A failure in the selected plugin harness
always fails the run. This does not block an explicit provider/model agentRuntime.id: "openclaw".
For Codex-only embedded runs:
{
  "models": {
    "providers": {
      "openai": {
        "agentRuntime": {
          "id": "codex"
        }
      }
    }
  },
  "agents": {
    "defaults": {
      "model": "openai/gpt-5.5"
    }
  }
}
If you want a CLI backend for one canonical model, put the runtime on that model entry:
{
  "agents": {
    "defaults": {
      "model": "anthropic/claude-opus-4-7",
      "models": {
        "anthropic/claude-opus-4-7": {
          "agentRuntime": {
            "id": "claude-cli"
          }
        }
      }
    }
  }
}
Per-agent overrides use the same model-scoped shape:
{
  "agents": {
    "list": [
      {
        "id": "codex-only",
        "model": "openai/gpt-5.5",
        "models": {
          "openai/gpt-5.5": {
            "agentRuntime": { "id": "codex" }
          }
        }
      }
    ]
  }
}
Legacy whole-agent runtime examples like this are ignored:
{
  "agents": {
    "defaults": {
      "agentRuntime": {
        "id": "codex"
      }
    }
  }
}
With an explicit plugin runtime, a session fails early when the requested harness is not registered, does not support the resolved provider/model, or fails before producing turn side effects. That is intentional for Codex-only deployments and for live tests that must prove the Codex app-server path is actually in use.
This setting only controls the embedded agent harness. It does not disable image, video, music, TTS, PDF, or other provider-specific model routing.
Native sessions and transcript mirror
A harness may keep a native session id, thread id, or daemon-side resume token. Keep that binding explicitly associated with the OpenClaw session, and keep mirroring user-visible assistant/tool output into the OpenClaw transcript.
The OpenClaw transcript remains the compatibility layer for:
- channel-visible session history
- transcript search and indexing
- switching back to the built-in OpenClaw harness on a later turn
- generic /new, /reset, and session deletion behavior
If your harness stores a sidecar binding, implement reset(...) so OpenClaw can
clear it when the owning OpenClaw session is reset.
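One way to keep such a binding explicit is a small store keyed by the OpenClaw session id. The sketch below is illustrative only: `NativeBindingStore` and its methods are hypothetical, the store is in-memory, and the argument shape that OpenClaw passes to a harness's reset(...) is not specified on this page, so the `{ sessionId }` parameter is an assumption.

```typescript
// Hypothetical in-memory sidecar: OpenClaw session id -> native thread id.
class NativeBindingStore {
  private readonly bindings = new Map<string, string>();

  // Record which native thread backs an OpenClaw session.
  bind(sessionId: string, nativeThreadId: string): void {
    this.bindings.set(sessionId, nativeThreadId);
  }

  // Look up the native thread to resume, if any.
  resolve(sessionId: string): string | undefined {
    return this.bindings.get(sessionId);
  }

  // Drop the binding; returns true when something was actually cleared.
  reset(sessionId: string): boolean {
    return this.bindings.delete(sessionId);
  }
}

const bindingStore = new NativeBindingStore();

// Assumed reset hook shape: the real params type comes from the SDK.
async function resetNativeSession(params: { sessionId: string }): Promise<void> {
  bindingStore.reset(params.sessionId);
}
```

After a reset, the next turn for that session finds no binding and starts a fresh native thread instead of resuming stale state.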
Tool and media results
Core constructs the OpenClaw tool list and passes it into the prepared attempt. When a harness executes a dynamic tool call, return the tool result through the harness result shape instead of sending channel media yourself.
This keeps text, image, video, music, TTS, approval, and messaging-tool outputs on the same delivery path as OpenClaw-backed runs.
Current limitations
- The public import path is generic, but some attempt/result type aliases still carry legacy names for compatibility.
- Third-party harness installation is experimental. Prefer provider plugins until you need a native session runtime.
- Harness switching is supported across turns. Do not switch harnesses in the middle of a turn after native tools, approvals, assistant text, or message sends have started.