docs: document Codex harness plugin workflow

Peter Steinberger
2026-04-10 13:52:24 +01:00
parent dd26e8c44d
commit bfc0889776
13 changed files with 844 additions and 18 deletions

View File

@@ -0,0 +1,167 @@
---
title: "Agent Harness Plugins"
sidebarTitle: "Agent Harness"
summary: "Experimental SDK surface for plugins that replace the low-level embedded agent executor"
read_when:
- You are changing the embedded agent runtime or harness registry
- You are registering an agent harness from a bundled or trusted plugin
- You need to understand how the Codex plugin relates to model providers
---
# Agent Harness Plugins
An **agent harness** is the low-level executor for one prepared OpenClaw agent
turn. It is not a model provider, not a channel, and not a tool registry.
Use this surface only for bundled or trusted native plugins. The contract is
still experimental because the parameter types intentionally mirror the current
embedded runner.
## When to use a harness
Register an agent harness when a model family has its own native session
runtime and the normal OpenClaw provider transport is the wrong abstraction.
Examples:
- a native coding-agent server that owns threads and compaction
- a local CLI or daemon that must stream native plan/reasoning/tool events
- a model runtime that needs its own resume id in addition to the OpenClaw
session transcript
Do **not** register a harness just to add a new LLM API. For normal HTTP or
WebSocket model APIs, build a [provider plugin](/plugins/sdk-provider-plugins).
## What core still owns
Before a harness is selected, OpenClaw has already resolved:
- provider and model
- runtime auth state
- thinking level and context budget
- the OpenClaw transcript/session file
- workspace, sandbox, and tool policy
- channel reply callbacks and streaming callbacks
- model fallback and live model switching policy
That split is intentional. A harness runs a prepared attempt; it does not pick
providers, replace channel delivery, or silently switch models.
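A rough sketch of what a prepared attempt might carry, under stated assumptions: only `prompt`, `tools`, `images`, `onPartialReply`, and `onAgentEvent` appear in the `runAttempt(...)` docs below, so every other field name here is hypothetical, not an SDK export:
```typescript
// Hypothetical sketch, not an SDK type. Core resolves all of this
// before any harness sees the attempt.
interface PreparedAttempt {
  provider: string;                    // resolved provider id
  model: string;                       // resolved model id
  prompt: string;                      // prepared user turn
  tools: unknown[];                    // core-constructed OpenClaw tool list
  images?: unknown[];                  // attached media, if any
  sessionFile: string;                 // OpenClaw transcript/session file
  onPartialReply(chunk: string): void; // streaming reply callback
  onAgentEvent(event: unknown): void;  // plan/reasoning/tool events
}
```
The point of the sketch is the direction of dependency: a harness consumes these prepared fields; it never resolves them itself.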
## Register a harness
**Import:** `openclaw/plugin-sdk/agent-harness`
```typescript
import type { AgentHarness } from "openclaw/plugin-sdk/agent-harness";
import { definePluginEntry } from "openclaw/plugin-sdk/plugin-entry";
const myHarness: AgentHarness = {
id: "my-harness",
label: "My native agent harness",
supports(ctx) {
return ctx.provider === "my-provider"
? { supported: true, priority: 100 }
: { supported: false };
},
async runAttempt(params) {
// Start or resume your native thread.
// Use params.prompt, params.tools, params.images, params.onPartialReply,
// params.onAgentEvent, and the other prepared attempt fields.
return await runMyNativeTurn(params);
},
};
export default definePluginEntry({
id: "my-native-agent",
name: "My Native Agent",
description: "Runs selected models through a native agent daemon.",
register(api) {
api.registerAgentHarness(myHarness);
},
});
```
## Selection policy
OpenClaw chooses a harness after provider/model resolution:
1. `OPENCLAW_AGENT_RUNTIME=<id>` forces a registered harness with that id.
2. `OPENCLAW_AGENT_RUNTIME=pi` forces the built-in PI harness.
3. `OPENCLAW_AGENT_RUNTIME=auto` asks registered harnesses whether they support
the resolved provider/model.
4. If no registered harness matches, OpenClaw uses PI.
Forced plugin harness failures surface as run failures. In `auto` mode,
OpenClaw may fall back to PI when the selected plugin harness fails before a
turn has produced side effects.
The bundled Codex plugin registers `codex` as its harness id. For compatibility,
`codex-app-server` and `app-server` also resolve to that same harness when you
set `OPENCLAW_AGENT_RUNTIME` manually.
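The selection order above can be sketched as plain logic. The `resolveHarnessId` helper and its local types are illustrative only; the real selection lives in core, not in plugin code:
```typescript
// Illustrative sketch of the selection order; not an SDK export.
type SupportsResult = { supported: true; priority: number } | { supported: false };

interface Harness {
  id: string;
  supports(ctx: { provider: string; model: string }): SupportsResult;
}

const PI_HARNESS_ID = "pi"; // built-in fallback

function resolveHarnessId(
  registered: Harness[],
  ctx: { provider: string; model: string },
  forced?: string, // value of OPENCLAW_AGENT_RUNTIME, if set
): string {
  // 1-2. A forced id wins, including the built-in "pi".
  if (forced && forced !== "auto") return forced;
  // 3. In auto mode, ask each registered harness; highest priority wins.
  const candidates = registered
    .map((h) => ({ h, res: h.supports(ctx) }))
    .filter(
      (c): c is { h: Harness; res: { supported: true; priority: number } } =>
        c.res.supported,
    )
    .sort((a, b) => b.res.priority - a.res.priority);
  // 4. Fall back to PI when nothing matches.
  return candidates[0]?.h.id ?? PI_HARNESS_ID;
}
```
Note that the sketch omits the failure handling described above: forced harness failures become run failures, while auto-selected harnesses may still fall back to PI before side effects start.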
## Provider plus harness pairing
Most harnesses should also register a provider. The provider makes model refs,
auth status, model metadata, and `/model` selection visible to the rest of
OpenClaw. The harness then claims that provider in `supports(...)`.
The bundled Codex plugin follows this pattern:
- provider id: `codex`
- user model refs: `codex/gpt-5.4`, `codex/gpt-5.2`, or another model returned
by the Codex app server
- harness id: `codex`
- auth: synthetic provider availability, because the Codex harness owns the
native Codex login/session
- app-server request: OpenClaw sends the bare model id to Codex and lets the
harness talk to the native app-server protocol
The Codex plugin is additive. Plain `openai/gpt-*` refs remain OpenAI provider
refs and continue to use the normal OpenClaw provider path. Select `codex/gpt-*`
when you want Codex-managed auth, Codex model discovery, native threads, and
Codex app-server execution. `/model` can switch among the Codex models returned
by the Codex app server without requiring OpenAI provider credentials.
## Native sessions and transcript mirror
A harness may keep a native session id, thread id, or daemon-side resume token.
Keep that binding explicitly associated with the OpenClaw session, and continue
mirroring user-visible assistant/tool output into the OpenClaw transcript.
The OpenClaw transcript remains the compatibility layer for:
- channel-visible session history
- transcript search and indexing
- switching back to the built-in PI harness on a later turn
- generic `/new`, `/reset`, and session deletion behavior
If your harness stores a sidecar binding, implement `reset(...)` so OpenClaw can
clear it when the owning OpenClaw session is reset.
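A sidecar binding can be as simple as a map from the owning OpenClaw session id to a native resume handle. The `NativeBinding` shape and function names below are assumptions for illustration, not documented SDK types:
```typescript
// Illustrative sidecar store; NativeBinding and these helpers are
// assumptions, not SDK types.
interface NativeBinding {
  threadId: string;     // daemon-side native thread
  resumeToken?: string; // daemon resume handle, if any
}

// Keyed by the owning OpenClaw session id so /new, /reset, and session
// deletion can clear it deterministically.
const bindings = new Map<string, NativeBinding>();

function bind(sessionId: string, binding: NativeBinding): void {
  bindings.set(sessionId, binding);
}

function resume(sessionId: string): NativeBinding | undefined {
  return bindings.get(sessionId);
}

// What a harness reset(...) hook would do when the owning OpenClaw
// session is reset: drop the native binding, nothing else.
function reset(sessionId: string): void {
  bindings.delete(sessionId);
}
```
Keeping the binding keyed by OpenClaw session id is what makes the transcript the source of truth: if the sidecar entry is gone, the next turn simply starts a fresh native thread from the mirrored transcript.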
## Tool and media results
Core constructs the OpenClaw tool list and passes it into the prepared attempt.
When a harness executes a dynamic tool call, return the tool result through the
harness result shape instead of sending channel media yourself.
This keeps text, image, video, music, TTS, approval, and messaging-tool outputs
on the same delivery path as PI-backed runs.
## Current limitations
- The public import path is generic, but some attempt/result type aliases still
carry `Pi` names for compatibility.
- Third-party harness installation is experimental. Prefer provider plugins
until you need a native session runtime.
- Harness switching is supported across turns. Do not switch harnesses in the
middle of a turn after native tools, approvals, assistant text, or message
sends have started.
## Related
- [SDK Overview](/plugins/sdk-overview)
- [Runtime Helpers](/plugins/sdk-runtime)
- [Provider Plugins](/plugins/sdk-provider-plugins)
- [Model Providers](/concepts/model-providers)

View File

@@ -219,6 +219,7 @@ explicitly promotes one as public.
| `plugin-sdk/models-provider-runtime` | `/models` command/provider reply helpers |
| `plugin-sdk/skill-commands-runtime` | Skill command listing helpers |
| `plugin-sdk/native-command-registry` | Native command registry/build/serialize helpers |
| `plugin-sdk/agent-harness` | Experimental trusted-plugin surface for low-level agent harnesses: harness types, active-run steer/abort helpers, OpenClaw tool bridge helpers, and attempt result utilities |
| `plugin-sdk/provider-zai-endpoint` | Z.AI endpoint detection helpers |
| `plugin-sdk/infra-runtime` | System event/heartbeat helpers |
| `plugin-sdk/collection-runtime` | Small bounded cache helpers |
@@ -302,20 +303,21 @@ methods:
### Capability registration
| Method | What it registers |
| ------------------------------------------------ | -------------------------------- |
| `api.registerProvider(...)` | Text inference (LLM) |
| `api.registerCliBackend(...)` | Local CLI inference backend |
| `api.registerChannel(...)` | Messaging channel |
| `api.registerSpeechProvider(...)` | Text-to-speech / STT synthesis |
| `api.registerRealtimeTranscriptionProvider(...)` | Streaming realtime transcription |
| `api.registerRealtimeVoiceProvider(...)` | Duplex realtime voice sessions |
| `api.registerMediaUnderstandingProvider(...)` | Image/audio/video analysis |
| `api.registerImageGenerationProvider(...)` | Image generation |
| `api.registerMusicGenerationProvider(...)` | Music generation |
| `api.registerVideoGenerationProvider(...)` | Video generation |
| `api.registerWebFetchProvider(...)` | Web fetch / scrape provider |
| `api.registerWebSearchProvider(...)` | Web search |
| Method | What it registers |
| ------------------------------------------------ | ------------------------------------- |
| `api.registerProvider(...)` | Text inference (LLM) |
| `api.registerAgentHarness(...)` | Experimental low-level agent executor |
| `api.registerCliBackend(...)` | Local CLI inference backend |
| `api.registerChannel(...)` | Messaging channel |
| `api.registerSpeechProvider(...)` | Text-to-speech / STT synthesis |
| `api.registerRealtimeTranscriptionProvider(...)` | Streaming realtime transcription |
| `api.registerRealtimeVoiceProvider(...)` | Duplex realtime voice sessions |
| `api.registerMediaUnderstandingProvider(...)` | Image/audio/video analysis |
| `api.registerImageGenerationProvider(...)` | Image generation |
| `api.registerMusicGenerationProvider(...)` | Music generation |
| `api.registerVideoGenerationProvider(...)` | Video generation |
| `api.registerWebFetchProvider(...)` | Web fetch / scrape provider |
| `api.registerWebSearchProvider(...)` | Web search |
### Tools and commands

View File

@@ -20,6 +20,13 @@ API key auth, and dynamic model resolution.
structure and manifest setup.
</Info>
<Tip>
Provider plugins add models to OpenClaw's normal inference loop. If the model
must run through a native agent daemon that owns threads, compaction, or tool
events, pair the provider with an [agent harness](/plugins/sdk-agent-harness)
instead of putting daemon protocol details in core.
</Tip>
## Walkthrough
<Steps>

View File

@@ -50,9 +50,9 @@ const timeoutMs = api.runtime.agent.resolveAgentTimeoutMs(cfg);
// Ensure workspace exists
await api.runtime.agent.ensureAgentWorkspace(cfg);
// Run an embedded Pi agent
// Run an embedded agent turn
const agentDir = api.runtime.agent.resolveAgentDir(cfg);
const result = await api.runtime.agent.runEmbeddedPiAgent({
const result = await api.runtime.agent.runEmbeddedAgent({
sessionId: "my-plugin:task-1",
runId: crypto.randomUUID(),
sessionFile: path.join(agentDir, "sessions", "my-plugin-task-1.jsonl"),
@@ -62,6 +62,12 @@ const result = await api.runtime.agent.runEmbeddedPiAgent({
});
```
`runEmbeddedAgent(...)` is the neutral helper for starting a normal OpenClaw
agent turn from plugin code. It uses the same provider/model resolution and
agent-harness selection as channel-triggered replies.
`runEmbeddedPiAgent(...)` remains as a compatibility alias.
**Session store helpers** are under `api.runtime.agent.session`:
```typescript