mirror of
https://github.com/openclaw/openclaw.git
synced 2026-06-02 01:04:56 +00:00
---
summary: "Experimental SDK surface for plugins that replace the low level embedded agent executor"
title: "Agent harness plugins"
sidebarTitle: "Agent Harness"
read_when:
  - You are changing the embedded agent runtime or harness registry
  - You are registering an agent harness from a bundled or trusted plugin
  - You need to understand how the Codex plugin relates to model providers
---

An **agent harness** is the low-level executor for one prepared OpenClaw agent
turn. It is not a model provider, not a channel, and not a tool registry.
For the user-facing mental model, see [Agent runtimes](/concepts/agent-runtimes).

Use this surface only for bundled or trusted native plugins. The contract is
still experimental because the parameter types intentionally mirror the current
embedded runner.

## When to use a harness

Register an agent harness when a model family has its own native session
runtime and the normal OpenClaw provider transport is the wrong abstraction.

Examples:

- a native coding-agent server that owns threads and compaction
- a local CLI or daemon that must stream native plan/reasoning/tool events
- a model runtime that needs its own resume id in addition to the OpenClaw
  session transcript

Do **not** register a harness just to add a new LLM API. For normal HTTP or
WebSocket model APIs, build a [provider plugin](/plugins/sdk-provider-plugins).

## What core still owns

Before a harness is selected, OpenClaw has already resolved:

- provider and model
- runtime auth state
- thinking level and context budget
- the OpenClaw transcript/session file
- workspace, sandbox, and tool policy
- channel reply callbacks and streaming callbacks
- model fallback and live model switching policy

That split is intentional. A harness runs a prepared attempt; it does not pick
providers, replace channel delivery, or silently switch models.

The prepared attempt also includes `params.runtimePlan`, an OpenClaw-owned
policy bundle for runtime decisions that must stay shared across OpenClaw and native
harnesses:

- `runtimePlan.tools.normalize(...)` and
  `runtimePlan.tools.logDiagnostics(...)` for provider-aware tool schema policy
- `runtimePlan.transcript.resolvePolicy(...)` for transcript sanitization and
  tool-call repair policy
- `runtimePlan.delivery.isSilentPayload(...)` for shared `NO_REPLY` and media
  delivery suppression
- `runtimePlan.outcome.classifyRunResult(...)` for model fallback classification
- `runtimePlan.observability` for resolved provider/model/harness metadata

Harnesses may use the plan for decisions that need to match OpenClaw behavior, but
should still treat it as host-owned attempt state. Do not mutate it or use it to
switch providers/models inside a turn.

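As a rough illustration of treating the plan as read-only host state, the sketch below asks the plan whether a payload is silent instead of re-implementing `NO_REPLY` detection. The `RuntimePlanView` and `SilentCheckPayload` types, the argument shape of `isSilentPayload`, and the stand-in plan are all assumptions made for this example; the real types come from the SDK and may differ.

```typescript
// Hypothetical, narrowed view of `params.runtimePlan` for illustration only.
// Every member is readonly so the harness cannot mutate host-owned state.
interface SilentCheckPayload {
  text?: string;
}

interface RuntimePlanView {
  readonly delivery: {
    isSilentPayload(payload: SilentCheckPayload): boolean;
  };
  readonly observability: Readonly<{
    provider: string;
    model: string;
    harnessId: string;
  }>;
}

// Ask the host-owned plan whether a payload should be delivered, so the
// harness stays consistent with OpenClaw's own suppression rules.
function shouldDeliver(plan: RuntimePlanView, payload: SilentCheckPayload): boolean {
  return !plan.delivery.isSilentPayload(payload);
}

// Stand-in plan for the example. A real plan is built by OpenClaw, and its
// silent-payload rule is not necessarily this simple.
const examplePlan: RuntimePlanView = {
  delivery: {
    isSilentPayload: (payload) => (payload.text ?? "").trim() === "NO_REPLY",
  },
  observability: {
    provider: "my-provider",
    model: "my-model",
    harnessId: "my-harness",
  },
};
```

The point of the design is that the harness only reads from the plan: the decision logic stays in one host-owned place, and the harness never needs to know what counts as a silent reply.
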
## Register a harness

**Import:** `openclaw/plugin-sdk/agent-harness`

```typescript
import type { AgentHarness } from "openclaw/plugin-sdk/agent-harness";
import { definePluginEntry } from "openclaw/plugin-sdk/plugin-entry";

const myHarness: AgentHarness = {
  id: "my-harness",
  label: "My native agent harness",

  supports(ctx) {
    return ctx.provider === "my-provider"
      ? { supported: true, priority: 100 }
      : { supported: false };
  },

  async runAttempt(params) {
    // Start or resume your native thread.
    // Use params.prompt, params.tools, params.images, params.onPartialReply,
    // params.onAgentEvent, and the other prepared attempt fields.
    // runMyNativeTurn is a placeholder for your plugin's own implementation.
    return await runMyNativeTurn(params);
  },
};

export default definePluginEntry({
  id: "my-native-agent",
  name: "My Native Agent",
  description: "Runs selected models through a native agent daemon.",
  register(api) {
    api.registerAgentHarness(myHarness);
  },
});
```

## Selection policy

OpenClaw chooses a harness after provider/model resolution:

1. Model-scoped runtime policy wins.
2. Provider-scoped runtime policy comes next.
3. `auto` asks registered harnesses if they support the resolved
   provider/model.
4. If no registered harness matches, OpenClaw uses its embedded runtime.

Plugin harness failures surface as run failures. In `auto` mode, embedded fallback is
only used when no registered plugin harness supports the resolved
provider/model. Once a plugin harness has claimed a run, OpenClaw does not
replay that same turn through another runtime because that can change
auth/runtime semantics or duplicate side effects.

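The ordering above can be sketched as a small pure function. This is a minimal illustration under stated assumptions, not OpenClaw's selector: the `SelectionInput` shape, the reason strings, the rule that the highest `priority` wins in `auto` mode, and the error message are all invented for the example.

```typescript
type SupportResult =
  | { supported: true; priority?: number }
  | { supported: false };

interface HarnessCandidate {
  id: string;
  supports(ctx: { provider: string; model: string }): SupportResult;
}

// Hypothetical input: the runtime ids already read from model-scoped and
// provider-scoped config, if any.
interface SelectionInput {
  provider: string;
  model: string;
  modelRuntimeId?: string;
  providerRuntimeId?: string;
}

type SelectionReason =
  | "model-policy"
  | "provider-policy"
  | "auto-match"
  | "embedded-fallback";

interface Selection {
  harnessId: string;
  reason: SelectionReason;
}

const EMBEDDED_RUNTIME_ID = "openclaw";

function selectHarness(input: SelectionInput, registered: HarnessCandidate[]): Selection {
  const ctx = { provider: input.provider, model: input.model };

  // Steps 1 and 2: model-scoped policy wins over provider-scoped policy.
  const pinnedId = input.modelRuntimeId ?? input.providerRuntimeId;
  const pinnedReason: SelectionReason =
    input.modelRuntimeId !== undefined ? "model-policy" : "provider-policy";

  if (pinnedId !== undefined && pinnedId !== "auto") {
    if (pinnedId === EMBEDDED_RUNTIME_ID) {
      return { harnessId: EMBEDDED_RUNTIME_ID, reason: pinnedReason };
    }
    const pinned = registered.find((harness) => harness.id === pinnedId);
    // Explicit plugin runtimes fail closed: no silent embedded fallback.
    if (!pinned || !pinned.supports(ctx).supported) {
      throw new Error(
        `agent runtime "${pinnedId}" cannot claim ${input.provider}/${input.model}`,
      );
    }
    return { harnessId: pinned.id, reason: pinnedReason };
  }

  // Step 3: auto mode asks every registered harness.
  // Assumption: the highest priority wins, and the first one wins ties.
  let best: { id: string; priority: number } | undefined;
  for (const harness of registered) {
    const result = harness.supports(ctx);
    if (!result.supported) continue;
    const priority = result.priority ?? 0;
    if (best === undefined || priority > best.priority) {
      best = { id: harness.id, priority };
    }
  }
  if (best !== undefined) {
    return { harnessId: best.id, reason: "auto-match" };
  }

  // Step 4: nothing matched, so the embedded runtime handles the turn.
  return { harnessId: EMBEDDED_RUNTIME_ID, reason: "embedded-fallback" };
}

// Example candidate that claims one provider, similar in spirit to a
// provider-paired plugin harness.
const exampleCodexLike: HarnessCandidate = {
  id: "codex",
  supports: (ctx) =>
    ctx.provider === "openai" ? { supported: true, priority: 100 } : { supported: false },
};
```

With only `exampleCodexLike` registered, an `openai/...` route resolves to `codex` in `auto` mode, an `anthropic/...` route falls back to the embedded runtime, and pinning `codex` on a route it does not support throws instead of falling back.
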
Whole-session and whole-agent runtime pins are ignored by selection. That
includes stale session `agentHarnessId` values, `agents.defaults.agentRuntime`,
`agents.list[].agentRuntime`, and `OPENCLAW_AGENT_RUNTIME`. `/status` shows the
effective runtime selected from the provider/model route.
If the selected harness is surprising, enable `agents/harness` debug logging and
inspect the gateway's structured `agent harness selected` record. It includes
the selected harness id, selection reason, runtime/fallback policy, and, in
`auto` mode, each plugin candidate's support result.

The bundled Codex plugin registers `codex` as its harness id. Core treats that
as an ordinary plugin harness id; Codex-specific aliases belong in the plugin
or operator config, not in the shared runtime selector.

## Provider plus harness pairing

Most harnesses should also register a provider. The provider makes model refs,
auth status, model metadata, and `/model` selection visible to the rest of
OpenClaw. The harness then claims that provider in `supports(...)`.

The bundled Codex plugin follows this pattern:

- preferred user model refs: `openai/gpt-5.5`
- compatibility refs: legacy `codex/gpt-*` refs remain accepted, but new
  configs should not use them as normal provider/model refs
- harness id: `codex`
- auth: synthetic provider availability, because the Codex harness owns the
  native Codex login/session
- app-server request: OpenClaw sends the bare model id to Codex and lets the
  harness talk to the native app-server protocol

The Codex plugin is additive. Plain `openai/gpt-*` agent refs on the official
OpenAI provider select the Codex harness by default. Older `codex/gpt-*` refs
still select the Codex provider and harness for compatibility.

For operator setup, model prefix examples, and Codex-only configs, see
[Codex Harness](/plugins/codex-harness).

OpenClaw requires Codex app-server `0.125.0` or newer. The Codex plugin checks
the app-server initialize handshake and blocks older or unversioned servers so
OpenClaw only runs against the protocol surface it has been tested with. The
`0.125.0` floor includes the native MCP hook payload support that landed in
Codex `0.124.0`, while pinning OpenClaw to the newer tested stable line.

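A version floor like this comes down to a numeric comparison of the version components, with unparseable or missing versions rejected. The sketch below is illustrative only and is not the Codex plugin's code: the function names are hypothetical, and it assumes plain `major.minor.patch` strings and ignores any pre-release suffix.

```typescript
const MIN_APP_SERVER_VERSION = "0.125.0";

// Parse the leading "major.minor.patch" of a version string.
// Returns undefined for missing or unparseable input.
function parseVersion(raw: string | undefined): [number, number, number] | undefined {
  const match = /^(\d+)\.(\d+)\.(\d+)/.exec((raw ?? "").trim());
  if (!match) return undefined;
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// True only when the reported version parses and is at or above the floor.
// Unversioned servers are blocked, matching the behavior described above.
function isSupportedAppServer(
  reported: string | undefined,
  floor: string = MIN_APP_SERVER_VERSION,
): boolean {
  const actual = parseVersion(reported);
  const minimum = parseVersion(floor);
  if (!actual || !minimum) return false;
  for (let i = 0; i < 3; i++) {
    // Compare numerically so that 0.125.0 sorts above 0.99.9.
    if (actual[i] !== minimum[i]) return actual[i] > minimum[i];
  }
  return true; // exactly equal to the floor
}
```

Comparing the components as numbers rather than strings matters here: a lexicographic comparison would rank `0.99.9` above `0.125.0`.
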
### Tool-result middleware

Bundled plugins can attach runtime-neutral tool-result middleware through
`api.registerAgentToolResultMiddleware(...)` when their manifest declares the
targeted runtime ids in `contracts.agentToolResultMiddleware`. This trusted
seam is for async tool-result transforms that must run before OpenClaw or Codex feeds
tool output back into the model.

Legacy bundled plugins can still use
`api.registerCodexAppServerExtensionFactory(...)` for Codex app-server-only
middleware, but new result transforms should use the runtime-neutral API.
The embedded-runner-only `api.registerEmbeddedExtensionFactory(...)` hook has been removed;
embedded tool-result transforms must use runtime-neutral middleware.

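Conceptually, runtime-neutral middleware is an ordered chain of async transforms, each applied only for the runtime ids it declared. The sketch below models that idea with local types. It does not show the real `api.registerAgentToolResultMiddleware(...)` signature or manifest schema: `ToolResult`, `MiddlewareRegistration`, and `applyToolResultMiddleware` are hypothetical names for illustration.

```typescript
// Hypothetical shapes for illustration; the SDK's real types may differ.
interface ToolResult {
  toolName: string;
  content: string;
}

interface MiddlewareContext {
  runtimeId: string;
}

type ToolResultMiddleware = (
  result: ToolResult,
  ctx: MiddlewareContext,
) => Promise<ToolResult>;

interface MiddlewareRegistration {
  // Runtime ids this middleware targets, mirroring the manifest declaration.
  runtimeIds: string[];
  handler: ToolResultMiddleware;
}

// Run every middleware registered for this runtime, in order, before the
// result is fed back to the model.
async function applyToolResultMiddleware(
  registrations: MiddlewareRegistration[],
  runtimeId: string,
  result: ToolResult,
): Promise<ToolResult> {
  let current = result;
  for (const registration of registrations) {
    if (!registration.runtimeIds.includes(runtimeId)) continue;
    current = await registration.handler(current, { runtimeId });
  }
  return current;
}

// Example transform: redact token-like strings for both runtimes.
const redactSecrets: MiddlewareRegistration = {
  runtimeIds: ["openclaw", "codex"],
  handler: async (result) => ({
    ...result,
    content: result.content.replace(/sk-[A-Za-z0-9]+/g, "[redacted]"),
  }),
};
```

Because the transform is keyed by runtime id rather than written against one runner, the same redaction applies whether the embedded runtime or the Codex harness executed the tool.
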
### Terminal outcome classification

Native harnesses that own their own protocol projection can use
`classifyAgentHarnessTerminalOutcome(...)` from
`openclaw/plugin-sdk/agent-harness-runtime` when a completed turn produced no
visible assistant text. The helper returns `empty`, `reasoning-only`, or
`planning-only` so OpenClaw's fallback policy can decide whether to retry on a
different model. It intentionally leaves prompt errors, in-flight turns, and
intentional silent replies such as `NO_REPLY` unclassified.

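To make the three outcomes concrete, here is a stand-alone sketch of the decision described above. It is not the SDK helper and does not reproduce its signature: the `TurnSummary` input shape, the function name `sketchTerminalOutcome`, and the choice to check plan output before reasoning output are assumptions for illustration.

```typescript
type TerminalOutcome = "empty" | "reasoning-only" | "planning-only";

// Hypothetical summary of a finished native turn.
interface TurnSummary {
  status: "completed" | "in-flight" | "prompt-error";
  assistantText: string;
  reasoningText: string;
  planText: string;
  // True when the turn intentionally produced a silent reply such as NO_REPLY.
  silentReply: boolean;
}

// Returns undefined when the turn should stay unclassified.
function sketchTerminalOutcome(turn: TurnSummary): TerminalOutcome | undefined {
  // Prompt errors and in-flight turns are left unclassified.
  if (turn.status !== "completed") return undefined;
  // Intentional silence is not a failure to produce text.
  if (turn.silentReply) return undefined;
  // Visible assistant text means there is nothing to classify.
  if (turn.assistantText.trim() !== "") return undefined;

  // Assumption: plan output takes precedence over reasoning output.
  if (turn.planText.trim() !== "") return "planning-only";
  if (turn.reasoningText.trim() !== "") return "reasoning-only";
  return "empty";
}
```

Returning `undefined` for the unclassified cases keeps them out of the fallback decision, so a deliberate `NO_REPLY` is never mistaken for a model that produced nothing.
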
### Native Codex harness mode

The bundled `codex` harness is the native Codex mode for embedded OpenClaw
agent turns. Enable the bundled `codex` plugin first, and include `codex` in
`plugins.allow` if your config uses a restrictive allowlist. Native app-server
configs should use `openai/gpt-*`; OpenAI agent turns select the Codex harness
by default. Legacy `openai-codex/*` routes should be repaired with
`openclaw doctor --fix`, and legacy `codex/*` model refs remain compatibility
aliases for the native harness.

When this mode runs, Codex owns the native thread id, resume behavior,
compaction, and app-server execution. OpenClaw still owns the chat channel,
visible transcript mirror, tool policy, approvals, media delivery, and session
selection. Use provider/model `agentRuntime.id: "codex"` when you need to prove
that only the Codex app-server path can claim the run. Explicit plugin runtimes
fail closed; Codex app-server selection failures and runtime failures are not
retried through another runtime.

## Runtime strictness

By default, OpenClaw uses the `auto` provider/model runtime policy: registered
plugin harnesses can claim a provider/model pair, and the embedded runtime
handles the turn when none match. OpenAI agent refs on the official OpenAI
provider default to Codex.
Use an explicit provider/model plugin runtime such as
`agentRuntime.id: "codex"` when a run should fail if that harness cannot be
selected, instead of routing through the embedded runtime. Failures in a
selected plugin harness always fail the run. This does not block an explicit
provider/model `agentRuntime.id: "openclaw"`.

For Codex-only embedded runs:

```json
{
  "models": {
    "providers": {
      "openai": {
        "agentRuntime": {
          "id": "codex"
        }
      }
    }
  },
  "agents": {
    "defaults": {
      "model": "openai/gpt-5.5"
    }
  }
}
```

If you want a CLI backend for one canonical model, put the runtime on that
model entry:

```json
{
  "agents": {
    "defaults": {
      "model": "anthropic/claude-opus-4-7",
      "models": {
        "anthropic/claude-opus-4-7": {
          "agentRuntime": {
            "id": "claude-cli"
          }
        }
      }
    }
  }
}
```

Per-agent overrides use the same model-scoped shape:

```json
{
  "agents": {
    "list": [
      {
        "id": "codex-only",
        "model": "openai/gpt-5.5",
        "models": {
          "openai/gpt-5.5": {
            "agentRuntime": { "id": "codex" }
          }
        }
      }
    ]
  }
}
```

Legacy whole-agent runtime examples like this are ignored:

```json
{
  "agents": {
    "defaults": {
      "agentRuntime": {
        "id": "codex"
      }
    }
  }
}
```

With an explicit plugin runtime, a session fails early when the requested
harness is not registered, does not support the resolved provider/model, or
fails before producing turn side effects. That is intentional for Codex-only
deployments and for live tests that must prove the Codex app-server path is
actually in use.

This setting only controls the embedded agent harness. It does not disable
image, video, music, TTS, PDF, or other provider-specific model routing.

## Native sessions and transcript mirror

A harness may keep a native session id, thread id, or daemon-side resume token.
Keep that binding explicitly associated with the OpenClaw session, and keep
mirroring user-visible assistant/tool output into the OpenClaw transcript.

The OpenClaw transcript remains the compatibility layer for:

- channel-visible session history
- transcript search and indexing
- switching back to the built-in OpenClaw harness on a later turn
- generic `/new`, `/reset`, and session deletion behavior

If your harness stores a sidecar binding, implement `reset(...)` so OpenClaw can
clear it when the owning OpenClaw session is reset.

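One way to keep a sidecar binding explicit is a small store keyed by the OpenClaw session id, with a reset path that drops the native thread id. The sketch below is a minimal in-memory version under stated assumptions: the class name, method names, and the idea that `reset(...)` receives a session id are hypothetical, and a real harness would likely persist the binding.

```typescript
// Hypothetical in-memory store mapping OpenClaw session ids to native
// thread ids (or resume tokens). Names are illustrative, not SDK API.
class NativeSessionBindings {
  private readonly threadBySession = new Map<string, string>();

  // Record which native thread backs an OpenClaw session.
  bind(sessionId: string, nativeThreadId: string): void {
    this.threadBySession.set(sessionId, nativeThreadId);
  }

  // Look up the native thread to resume, if one is bound.
  resolve(sessionId: string): string | undefined {
    return this.threadBySession.get(sessionId);
  }

  // Drop the binding so the next turn starts a fresh native thread.
  // Returns true when a binding existed.
  reset(sessionId: string): boolean {
    return this.threadBySession.delete(sessionId);
  }
}

const bindings = new NativeSessionBindings();

// Sketch of what a harness reset hook might delegate to. The parameter
// shape is an assumption; check the SDK types for the real one.
function resetNativeSession(params: { sessionId: string }): void {
  bindings.reset(params.sessionId);
}
```

Keying the store by the OpenClaw session id keeps OpenClaw's session as the owner of the native thread's lifetime, so `/new`, `/reset`, and deletion cannot leave a stale native thread that a later turn would resume by accident.
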
## Tool and media results

Core constructs the OpenClaw tool list and passes it into the prepared attempt.
When a harness executes a dynamic tool call, return the tool result back through
the harness result shape instead of sending channel media yourself.

This keeps text, image, video, music, TTS, approval, and messaging-tool outputs
on the same delivery path as OpenClaw-backed runs.

## Current limitations

- The public import path is generic, but some attempt/result type aliases still
  carry legacy names for compatibility.
- Third-party harness installation is experimental. Prefer provider plugins
  until you need a native session runtime.
- Harness switching is supported across turns. Do not switch harnesses in the
  middle of a turn after native tools, approvals, assistant text, or message
  sends have started.

## Related

- [SDK Overview](/plugins/sdk-overview)
- [Runtime Helpers](/plugins/sdk-runtime)
- [Provider Plugins](/plugins/sdk-provider-plugins)
- [Codex Harness](/plugins/codex-harness)
- [Model Providers](/concepts/model-providers)