fix(models): normalize provider runtime selection (#71259)

* fix(models): normalize provider runtime selection

* fix(models): reverse codex-only runtime migration

* fix(models): default runtime selection to pi

* fix(status): label model runtime clearly

* fix(status): align pi runtime label

* fix(plugins): align tool result middleware runtime naming

* fix(models): validate runtime overrides

This commit is contained in:
Vincent Koc
2026-04-24 16:56:49 -07:00
committed by GitHub
parent 60e7b692cc
commit aa27e27f36
75 changed files with 1422 additions and 414 deletions

View File

@@ -1,2 +1,2 @@
-eb5c790aaa54be7b1380eb5a162db50dd314e052aedb5e608290092c33d999f2 plugin-sdk-api-baseline.json
-0d2fd80f69e0c3488b6bdbbbb035b08ab108637790d1f30b8e4f84c71c5bc8e2 plugin-sdk-api-baseline.jsonl
+f74435d49aa0af2509264d8581e12ffc624b1d6542d250d608ee5c3b41a234f3 plugin-sdk-api-baseline.json
+df33bbe47bb092ed11814576b5386253140f7aa6f8479a5334aff9b988125afc plugin-sdk-api-baseline.jsonl

View File

@@ -305,7 +305,7 @@ By default, components are single use. Set `components.reusable=true` to allow b
To restrict who can click a button, set `allowedUsers` on that button (Discord user IDs, tags, or `*`). When configured, unmatched users receive an ephemeral denial.
-The `/model` and `/models` slash commands open an interactive model picker with provider and model dropdowns plus a Submit step. `/models add` is deprecated and now returns a deprecation message instead of registering models from chat. The picker reply is ephemeral and only the invoking user can use it.
+The `/model` and `/models` slash commands open an interactive model picker with provider, model, and compatible runtime dropdowns plus a Submit step. `/models add` is deprecated and now returns a deprecation message instead of registering models from chat. The picker reply is ephemeral and only the invoking user can use it.
File attachments:

View File

@@ -21,7 +21,7 @@ Notes:
- `--deep` runs live probes (WhatsApp Web + Telegram + Discord + Slack + Signal).
- `--usage` prints normalized provider usage windows as `X% left`.
-- Session status output now separates `Runtime:` from `Runner:`. `Runtime` is the execution path and sandbox state (`direct`, `docker/*`), while `Runner` tells you whether the session is using embedded Pi, a CLI-backed provider, or an ACP harness backend such as `codex (acp/acpx)`.
+- Session status output separates `Execution:` from `Runtime:`. `Execution` is the sandbox path (`direct`, `docker/*`), while `Runtime` tells you whether the session is using `OpenClaw Pi Default`, `OpenAI Codex`, a CLI backend, or an ACP backend such as `codex (acp/acpx)`.
- MiniMax's raw `usage_percent` / `usagePercent` fields are remaining quota, so OpenClaw inverts them before display; count-based fields win when present. `model_remains` responses prefer the chat-model entry, derive the window label from timestamps when needed, and include the model name in the plan label.
- When the current session snapshot is sparse, `/status` can backfill token and cache counters from the most recent transcript usage log. Existing nonzero live values still win over transcript fallback values.
- Transcript fallback can also recover the active runtime model label when the live session entry is missing it. If that transcript model differs from the selected model, status resolves the context window against the recovered runtime model instead of the selected one.

View File

@@ -24,6 +24,12 @@ For model selection rules, see [/concepts/models](/concepts/models).
- Plugin auto-enable follows that same boundary: `openai-codex/<model>` belongs
to the OpenAI plugin, while the Codex plugin is enabled by
`embeddedHarness.runtime: "codex"` or legacy `codex/<model>` refs.
+- CLI runtimes use the same split: choose canonical model refs such as
+  `anthropic/claude-*`, `google/gemini-*`, or `openai/gpt-*`, then set
+  `agents.defaults.embeddedHarness.runtime` to `claude-cli`,
+  `google-gemini-cli`, or `codex-cli` when you want a local CLI backend.
+  Legacy `claude-cli/*`, `google-gemini-cli/*`, and `codex-cli/*` refs migrate
+  back to canonical provider refs with the runtime recorded separately.
- GPT-5.5 is currently available through subscription/OAuth routes:
`openai-codex/gpt-5.5` in PI or `openai/gpt-5.5` with the Codex app-server
harness. The direct API-key route for `openai/gpt-5.5` is supported once
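
As an illustration of the canonical-ref-plus-runtime split in the bullets above, a minimal config sketch. The `agents.defaults.embeddedHarness.runtime` path is from this page; the `model` key name and the concrete model ref are assumptions for illustration:

```json5
{
  agents: {
    defaults: {
      // Canonical provider/model ref, not a legacy claude-cli/* prefix
      // (key name and model ref are illustrative assumptions):
      model: "anthropic/claude-sonnet-4-5",
      embeddedHarness: {
        runtime: "claude-cli", // local CLI backend, recorded separately
      },
    },
  },
}
```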

View File

@@ -316,7 +316,7 @@ Time format in system prompt. Default: `auto` (OS preference).
},
params: { cacheRetention: "long" }, // global default provider params
embeddedHarness: {
-runtime: "auto", // auto | pi | registered harness id, e.g. codex
+runtime: "pi", // pi | auto | registered harness id, e.g. codex
fallback: "pi", // pi | none
},
pdfMaxBytesMb: 10,
@@ -369,14 +369,14 @@ Time format in system prompt. Default: `auto` (OS preference).
- For direct OpenAI Responses models, server-side compaction is enabled automatically. Use `params.responsesServerCompaction: false` to stop injecting `context_management`, or `params.responsesCompactThreshold` to override the threshold. See [OpenAI server-side compaction](/providers/openai#server-side-compaction-responses-api).
- `params`: global default provider parameters applied to all models. Set at `agents.defaults.params` (e.g. `{ cacheRetention: "long" }`).
- `params` merge precedence (config): `agents.defaults.params` (global base) is overridden by `agents.defaults.models["provider/model"].params` (per-model), then `agents.list[].params` (matching agent id) overrides by key. See [Prompt Caching](/reference/prompt-caching) for details.
-- `embeddedHarness`: default low-level embedded agent runtime policy. Use `runtime: "auto"` to let registered plugin harnesses claim supported models, `runtime: "pi"` to force the built-in PI harness, or a registered harness id such as `runtime: "codex"`. Automatic PI fallback defaults to `"pi"` only in `auto` mode. Explicit plugin runtimes such as `codex` default to `"none"` unless you set `fallback: "pi"`. New Codex harness configs should keep model refs canonical as `openai/*` and select the harness here rather than using legacy `codex/*` model refs.
+- `embeddedHarness`: default low-level embedded agent runtime policy. Omitted runtime defaults to OpenClaw Pi. Use `runtime: "pi"` to force the built-in PI harness, `runtime: "auto"` to let registered plugin harnesses claim supported models, or a registered harness id such as `runtime: "codex"`. Set `fallback: "none"` to disable automatic PI fallback. Keep model refs canonical as `provider/model`; select Codex, Claude CLI, Gemini CLI, and other execution backends through runtime config instead of legacy runtime provider prefixes.
- Config writers that mutate these fields (for example `/models set`, `/models set-image`, and fallback add/remove commands) save canonical object form and preserve existing fallback lists when possible.
- `maxConcurrent`: max parallel agent runs across sessions (each session still serialized). Default: 4.
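
As a sketch of that `params` merge precedence (global base → per-model → agent entry), with illustrative values; the shape of an `agents.list[]` entry is an assumption here:

```json5
{
  agents: {
    defaults: {
      params: { cacheRetention: "long" }, // global base
      models: {
        "anthropic/claude-sonnet-4-5": {
          // per-model params override the global base by key
          params: { cacheRetention: "short" },
        },
      },
    },
    list: [
      // a matching agent id overrides again, key by key (entry shape assumed)
      { id: "main", params: { cacheRetention: "long" } },
    ],
  },
}
```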
### `agents.defaults.embeddedHarness`
`embeddedHarness` controls which low-level executor runs embedded agent turns.
-Most deployments should keep the default `{ runtime: "auto", fallback: "pi" }`.
+Most deployments should keep the default OpenClaw Pi runtime.
Use it when a trusted plugin provides a native harness, such as the bundled
Codex app-server harness.
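
For instance, explicitly selecting the bundled Codex harness with PI fallback disabled might look like this (a sketch using the keys documented above):

```json5
{
  agents: {
    defaults: {
      embeddedHarness: {
        runtime: "codex",  // registered harness id; "pi" is the default
        fallback: "none",  // explicit plugin runtimes default to "none"
      },
    },
  },
}
```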

View File

@@ -172,8 +172,8 @@ For the full registration API, see [SDK Overview](/plugins/sdk-overview#registra
Bundled plugins can use `api.registerAgentToolResultMiddleware(...)` when they
need async tool-result rewriting before the model sees the output. Declare the
-targeted harnesses in `contracts.agentToolResultMiddleware`, for example
-`["pi", "codex-app-server"]`. This is a trusted bundled-plugin seam; external
+targeted runtimes in `contracts.agentToolResultMiddleware`, for example
+`["pi", "codex"]`. This is a trusted bundled-plugin seam; external
plugins should prefer regular OpenClaw plugin hooks unless OpenClaw grows an
explicit trust policy for this capability.
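
The registration seam described above can be sketched end to end. Everything below except the `registerAgentToolResultMiddleware` name and its `runtimes` option is an assumption: the event shape, the truncation rule, and the local stand-in for the plugin `api` object are illustrative only.

```typescript
// Hypothetical event shape; the real OpenClaw types are not shown in these docs.
type ToolResultEvent = { toolName: string; result: { text: string } };
type ToolResultMiddleware = (event: ToolResultEvent) => Promise<ToolResultEvent>;

// Local stand-in for api.registerAgentToolResultMiddleware(...), so the
// sketch is runnable without the plugin SDK.
const registered: { fn: ToolResultMiddleware; runtimes: string[] }[] = [];
function registerAgentToolResultMiddleware(
  fn: ToolResultMiddleware,
  opts: { runtimes: string[] },
): void {
  registered.push({ fn, runtimes: opts.runtimes });
}

// Async output reducer: truncate oversized tool results before the model
// sees them, targeting both the PI and Codex runtimes.
registerAgentToolResultMiddleware(
  async (event) => ({
    ...event,
    result: { text: event.result.text.slice(0, 2000) },
  }),
  { runtimes: ["pi", "codex"] },
);
```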

View File

@@ -25,7 +25,7 @@ These are in-process OpenClaw hooks, not Codex `hooks.json` command hooks:
- `before_message_write` for mirrored transcript records
- `agent_end`
-Plugins can also register harness-neutral tool-result middleware to rewrite
+Plugins can also register runtime-neutral tool-result middleware to rewrite
OpenClaw dynamic tool results after OpenClaw executes the tool and before the
result is returned to Codex. This is separate from the public
`tool_result_persist` plugin hook, which transforms OpenClaw-owned transcript
@@ -35,8 +35,8 @@ The harness is off by default. New configs should keep OpenAI model refs
canonical as `openai/gpt-*` and explicitly force
`embeddedHarness.runtime: "codex"` or `OPENCLAW_AGENT_RUNTIME=codex` when they
want native app-server execution. Legacy `codex/*` model refs still auto-select
-the harness for compatibility, but they are not shown as normal model/provider
-choices.
+the harness for compatibility, but runtime-backed legacy provider prefixes are
+not shown as normal model/provider choices.
## Pick the right model prefix
@@ -56,10 +56,12 @@ app-server harness. Direct API-key access for `openai/gpt-5.5` is supported
once OpenAI enables GPT-5.5 on the public API.
Legacy `codex/gpt-*` refs remain accepted as compatibility aliases. Doctor
-compatibility migration rewrites legacy primary `codex/*` refs to `openai/*`
-and records the Codex harness policy separately. New PI Codex OAuth configs
-should use `openai-codex/gpt-*`; new native app-server harness configs should
-use `openai/gpt-*` plus `embeddedHarness.runtime: "codex"`.
+compatibility migration rewrites legacy primary runtime refs to canonical model
+refs and records the runtime policy separately, while fallback-only legacy refs
+are left unchanged because runtime is configured for the whole agent container.
+New PI Codex OAuth configs should use `openai-codex/gpt-*`; new native
+app-server harness configs should use `openai/gpt-*` plus
+`embeddedHarness.runtime: "codex"`.
`agents.defaults.imageModel` follows the same prefix split. Use
`openai-codex/gpt-*` when image understanding should run through the OpenAI
@@ -86,9 +88,9 @@ Legacy sessions created before harness pins are treated as PI-pinned once they
have transcript history. Use `/new` or `/reset` to opt that conversation into
Codex after changing config.
-`/status` shows the effective non-PI harness next to `Fast`, for example
-`Fast · codex`. The default PI harness remains `Runner: pi (embedded)` and does
-not add a separate harness badge.
+`/status` shows the effective model runtime. The default PI harness appears as
+`Runtime: OpenClaw Pi Default`, and the Codex app-server harness appears as
+`Runtime: OpenAI Codex`.
## Requirements

View File

@@ -406,7 +406,7 @@ read without importing the plugin runtime.
```json
{
"contracts": {
-"agentToolResultMiddleware": ["pi", "codex-app-server"],
+"agentToolResultMiddleware": ["pi", "codex"],
"externalAuthProviders": ["acme-ai"],
"speechProviders": ["openai"],
"realtimeTranscriptionProviders": ["openai"],
@@ -427,7 +427,7 @@ Each list is optional:
| Field | Type | What it means |
| -------------------------------- | ---------- | --------------------------------------------------------------------- |
| `embeddedExtensionFactories` | `string[]` | Deprecated embedded extension factory ids. |
-| `agentToolResultMiddleware` | `string[]` | Harness ids a bundled plugin may register tool-result middleware for. |
+| `agentToolResultMiddleware` | `string[]` | Runtime ids a bundled plugin may register tool-result middleware for. |
| `externalAuthProviders` | `string[]` | Provider ids whose external auth profile hook this plugin owns. |
| `speechProviders` | `string[]` | Speech provider ids this plugin owns. |
| `realtimeTranscriptionProviders` | `string[]` | Realtime-transcription provider ids this plugin owns. |

View File

@@ -146,15 +146,15 @@ OpenClaw only runs against the protocol surface it has been tested with.
### Tool-result middleware
-Bundled plugins can attach harness-neutral tool-result middleware through
+Bundled plugins can attach runtime-neutral tool-result middleware through
`api.registerAgentToolResultMiddleware(...)` when their manifest declares the
-targeted harness ids in `contracts.agentToolResultMiddleware`. This trusted
+targeted runtime ids in `contracts.agentToolResultMiddleware`. This trusted
seam is for async tool-result transforms that must run before PI or Codex feeds
tool output back into the model.
Legacy bundled plugins can still use
`api.registerCodexAppServerExtensionFactory(...)` for Codex app-server-only
-middleware, but new result transforms should use the harness-neutral API.
+middleware, but new result transforms should use the runtime-neutral API.
The Pi-only `api.registerEmbeddedExtensionFactory(...)` hook is deprecated for
tool-result transforms; keep it only for bundled compatibility code that still
needs direct Pi embedded-runner events.

View File

@@ -93,7 +93,7 @@ releases.
<Step title="Migrate Pi tool-result extensions to middleware">
Bundled plugins should replace Pi-only
`api.registerEmbeddedExtensionFactory(...)` tool-result handlers with
-harness-neutral middleware.
+runtime-neutral middleware.
```typescript
// Before: Pi-only compatibility hook
@@ -103,11 +103,11 @@ releases.
});
});
-// After: Pi and Codex app-server dynamic tools
+// After: Pi and Codex runtime dynamic tools
api.registerAgentToolResultMiddleware(async (event) => {
return compactToolResult(event);
}, {
-harnesses: ["pi", "codex-app-server"],
+runtimes: ["pi", "codex"],
});
```
@@ -116,7 +116,7 @@ releases.
```json
{
"contracts": {
-"agentToolResultMiddleware": ["pi", "codex-app-server"]
+"agentToolResultMiddleware": ["pi", "codex"]
}
}
```
@@ -626,7 +626,7 @@ canonical replacement.
Covered in "How to migrate → Migrate Pi tool-result extensions to
middleware" above. Included here for completeness: the Pi-only
`api.registerEmbeddedExtensionFactory(...)` path is deprecated in favor of
-`api.registerAgentToolResultMiddleware(...)` with an explicit harness
+`api.registerAgentToolResultMiddleware(...)` with an explicit runtime
list in `contracts.agentToolResultMiddleware`.
</Accordion>

View File

@@ -99,7 +99,7 @@ methods:
| `api.registerCli(registrar, opts?)` | CLI subcommand |
| `api.registerService(service)` | Background service |
| `api.registerInteractiveHandler(registration)` | Interactive handler |
-| `api.registerAgentToolResultMiddleware(...)` | Harness tool-result middleware |
+| `api.registerAgentToolResultMiddleware(...)` | Runtime tool-result middleware |
| `api.registerEmbeddedExtensionFactory(factory)` | Deprecated PI extension factory |
| `api.registerMemoryPromptSupplement(builder)` | Additive memory-adjacent prompt section |
| `api.registerMemoryCorpusSupplement(adapter)` | Additive memory search/read corpus |
@@ -113,12 +113,12 @@ methods:
<Accordion title="When to use tool-result middleware">
Bundled plugins can use `api.registerAgentToolResultMiddleware(...)` when
-they need to rewrite a tool result after execution and before the harness
-feeds that result back into the model. This is the trusted harness-neutral
+they need to rewrite a tool result after execution and before the runtime
+feeds that result back into the model. This is the trusted runtime-neutral
seam for async output reducers such as tokenjuice.
Bundled plugins must declare `contracts.agentToolResultMiddleware` for each
-targeted harness, for example `["pi", "codex-app-server"]`. External plugins
+targeted runtime, for example `["pi", "codex"]`. External plugins
cannot register this middleware; keep normal OpenClaw plugin hooks for work
that does not need pre-model tool-result timing.
</Accordion>

View File

@@ -13,7 +13,8 @@ Gemini Grounding.
- Provider: `google`
- Auth: `GEMINI_API_KEY` or `GOOGLE_API_KEY`
- API: Google Gemini API
-- Alternative provider: `google-gemini-cli` (OAuth)
+- Runtime option: `agents.defaults.embeddedHarness.runtime: "google-gemini-cli"`
+  reuses Gemini CLI OAuth while keeping model refs canonical as `google/*`.
## Getting started
@@ -92,12 +93,13 @@ Choose your preferred auth method and follow the setup steps.
</Step>
<Step title="Verify the model is available">
```bash
-openclaw models list --provider google-gemini-cli
+openclaw models list --provider google
```
</Step>
</Steps>
-- Default model: `google-gemini-cli/gemini-3-flash-preview`
+- Default model: `google/gemini-3.1-pro-preview`
+- Runtime: `google-gemini-cli`
- Alias: `gemini-cli`
**Environment variables:**
@@ -117,9 +119,9 @@ Choose your preferred auth method and follow the setup steps.
command is installed and on `PATH`.
</Note>
-The OAuth-only `google-gemini-cli` provider is a separate text-inference
-surface. Image generation, media understanding, and Gemini Grounding stay on
-the `google` provider id.
+`google-gemini-cli/*` model refs are legacy compatibility aliases. New
+configs should use `google/*` model refs plus the `google-gemini-cli`
+runtime when they want local Gemini CLI execution.
</Tab>
</Tabs>
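
Putting the pieces in this tab together, a hedged config sketch. The provider ref and runtime id are from the docs above; the `model` key name is an assumption:

```json5
{
  agents: {
    defaults: {
      model: "google/gemini-3.1-pro-preview", // canonical google/* ref (key name assumed)
      embeddedHarness: {
        runtime: "google-gemini-cli", // reuse Gemini CLI OAuth locally
      },
    },
  },
}
```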

View File

@@ -174,11 +174,10 @@ Choose your preferred auth method and follow the setup steps.
### Status indicator
-Chat `/status` shows which embedded harness is active for the current
-session. The default PI harness appears as `Runner: pi (embedded)` and does
-not add a separate badge. When the bundled Codex app-server harness is
-selected, `/status` appends the non-PI harness id next to `Fast`, for example
-`Fast · codex`. Existing sessions keep their recorded harness id, so use
+Chat `/status` shows which model runtime is active for the current session.
+The default PI harness appears as `Runtime: OpenClaw Pi Default`. When the
+bundled Codex app-server harness is selected, `/status` shows
+`Runtime: OpenAI Codex`. Existing sessions keep their recorded harness id, so use
`/new` or `/reset` after changing `embeddedHarness` if you want `/status` to
reflect a new PI/Codex choice.

View File

@@ -106,7 +106,7 @@ Built-in commands available today:
- `/help` shows the short help summary.
- `/commands` shows the generated command catalog.
- `/tools [compact|verbose]` shows what the current agent can use right now.
-- `/status` shows runtime status, including `Runtime`/`Runner` labels and provider usage/quota when available.
+- `/status` shows execution/runtime status, including `Execution`/`Runtime` labels and provider usage/quota when available.
- `/tasks` lists active/recent background tasks for the current session.
- `/context [list|detail|json]` explains how context is assembled.
- `/export-session [path]` exports the current session to HTML. Alias: `/export`.
@@ -227,7 +227,7 @@ of treating `/tools` as a static catalog.
- **Provider usage/quota** (example: “Claude 80% left”) shows up in `/status` for the current model provider when usage tracking is enabled. OpenClaw normalizes provider windows to `% left`; for MiniMax, remaining-only percent fields are inverted before display, and `model_remains` responses prefer the chat-model entry plus a model-tagged plan label.
- **Token/cache lines** in `/status` can fall back to the latest transcript usage entry when the live session snapshot is sparse. Existing nonzero live values still win, and transcript fallback can also recover the active runtime model label plus a larger prompt-oriented total when stored totals are missing or smaller.
-- **Runtime vs runner:** `/status` reports `Runtime` for the effective execution path and sandbox state, and `Runner` for who is actually running the session: embedded Pi, a CLI-backed provider, or an ACP harness/backend.
+- **Execution vs runtime:** `/status` reports `Execution` for the effective sandbox path and `Runtime` for who is actually running the session: `OpenClaw Pi Default`, `OpenAI Codex`, a CLI backend, or an ACP backend.
- **Per-response tokens/cost** is controlled by `/usage off|tokens|full` (appended to normal replies).
- `/model status` is about **models/auth/endpoints**, not usage.