mirror of https://github.com/openclaw/openclaw.git, synced 2026-05-06 07:30:43 +00:00
Make harness failures fail honestly (#69981)
* Agents: fail honestly on harness errors
* Docs: clarify Codex harness fallback
2
.github/workflows/ci-check-testbox.yml
vendored
@@ -8,7 +8,7 @@ on:
 required: true
 pull_request:
 paths:
-- '.github/workflows/**'
+- ".github/workflows/**"

 permissions:
 contents: read
@@ -33,6 +33,7 @@ Docs: https://docs.openclaw.ai
 - Doctor/channels: merge configured-channel doctor hooks across read-only, loaded, setup, and runtime plugin discovery so partial adapters no longer hide runtime-only compatibility repair or allowlist warnings, preserve disabled-channel opt-outs, and ignore malformed hook values before they can mask valid fallbacks. (#69919) Thanks @gumadeiras.
 - Models/CLI: show bundled provider-owned static catalog rows in `models list --all` before auth is configured, including Kimi K2.6 rows for Moonshot, OpenRouter, and Vercel AI Gateway, while keeping local-only and workspace plugin catalog paths isolated. (#69909) Thanks @shakkernerd.
 - Configure: skip generic CLI startup bootstrap for `openclaw configure` and bound hint-only gateway probes so the onboarding TUI reaches its first prompt faster when the Gateway is unavailable. (#69984) Thanks @obviyus.
+- Agents/harness: surface selected plugin harness failures directly instead of replaying the same turn through embedded PI, preventing misleading secondary PI auth errors and avoiding duplicate side effects.

 ## 2026.4.21

@@ -1,4 +1,4 @@
-1022a9497ff0481675c483742b8e92c6063e53c9bb3e5c5c3bd39300cf2e1f31 config-baseline.json
-7956c319e82d288d496a51cb2ff4485ab72ef4900cb089f99e1df8b9ef3bfb73 config-baseline.core.json
+7c1b8b34618f44d56817ff54b930701710087dc7e76beaf4a554b6a5a25ba87c config-baseline.json
+ed0c093e8acab2364608be3e65b98836600aea07df73ebb51d11919969c6c8fe config-baseline.core.json
 6c0069b971ae298ae68516ebcd3eae0e8c82820d2e8f42ecbd2f53a2f9077371 config-baseline.channel.json
-a7f297a3461e807fd15f8a7c8c68e41071dfc09af2118c24a26d5f534301a654 config-baseline.plugin.json
+e5b7756b5f45ba227aa1bfab990dcf8a2a8b409b9ca01ea8bb1d5cd7adc06c90 config-baseline.plugin.json
@@ -1266,7 +1266,7 @@ Codex app-server harness.
 ```

 - `runtime`: `"auto"`, `"pi"`, or a registered plugin harness id. The bundled Codex plugin registers `codex`.
-- `fallback`: `"pi"` or `"none"`. `"pi"` keeps the built-in PI harness as the compatibility fallback. `"none"` makes missing or unsupported plugin harness selection fail instead of silently using PI.
+- `fallback`: `"pi"` or `"none"`. `"pi"` keeps the built-in PI harness as the compatibility fallback when no plugin harness is selected. `"none"` makes missing or unsupported plugin harness selection fail instead of silently using PI. Selected plugin harness failures always surface directly.
 - Environment overrides: `OPENCLAW_AGENT_RUNTIME=<id|auto|pi>` overrides `runtime`; `OPENCLAW_AGENT_HARNESS_FALLBACK=none` disables PI fallback for that process.
 - For Codex-only deployments, set `model: "codex/gpt-5.4"`, `embeddedHarness.runtime: "codex"`, and `embeddedHarness.fallback: "none"`.
 - This only controls the embedded chat harness. Media generation, vision, PDF, music, video, and TTS still use their provider/model settings.
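Reviewer note: the precedence described in the bullets above (environment overrides first, then config, then the `auto`/`pi` defaults) can be sketched as below. This is only an illustration under stated assumptions: `resolveHarnessPolicySketch` and `EmbeddedHarnessConfigSketch` are hypothetical names, not the project's real helpers, and the sketch assumes that `none` is the only recognized value of `OPENCLAW_AGENT_HARNESS_FALLBACK`, since that is the only one the docs mention.

```typescript
type HarnessFallback = "pi" | "none";

// Hypothetical shape; the real config type lives in the repository.
interface EmbeddedHarnessConfigSketch {
  runtime?: string; // "auto", "pi", or a plugin harness id such as "codex"
  fallback?: HarnessFallback;
}

// Assumption: env wins over config, and config wins over the defaults.
function resolveHarnessPolicySketch(
  env: Record<string, string | undefined>,
  config: EmbeddedHarnessConfigSketch = {},
): { runtime: string; fallback: HarnessFallback } {
  const envRuntime = env["OPENCLAW_AGENT_RUNTIME"]?.trim();
  const envFallback = env["OPENCLAW_AGENT_HARNESS_FALLBACK"]?.trim();
  return {
    runtime: envRuntime || config.runtime || "auto",
    // Only "none" is treated as an override here (assumption, see note above).
    fallback: envFallback === "none" ? "none" : (config.fallback ?? "pi"),
  };
}
```

With this sketch, `resolveHarnessPolicySketch({ OPENCLAW_AGENT_HARNESS_FALLBACK: "none" }, { fallback: "pi" })` resolves `fallback` to `"none"`, which matches the "honors env fallback override over config fallback" test in this commit.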
@@ -469,8 +469,12 @@ understanding continue to use the matching provider/model settings such as
 **Codex does not appear in `/model`:** enable `plugins.entries.codex.enabled`,
 set a `codex/*` model ref, or check whether `plugins.allow` excludes `codex`.

-**OpenClaw falls back to PI:** set `embeddedHarness.fallback: "none"` or
-`OPENCLAW_AGENT_HARNESS_FALLBACK=none` while testing.
+**OpenClaw uses PI instead of Codex:** if no Codex harness claims the run,
+OpenClaw may use PI as the compatibility backend. Set
+`embeddedHarness.runtime: "codex"` to force Codex selection while testing, or
+`embeddedHarness.fallback: "none"` to fail when no plugin harness matches. Once
+Codex app-server is selected, its failures surface directly without extra
+fallback config.

 **The app-server is rejected:** upgrade Codex so the app-server handshake
 reports version `0.118.0` or newer.
@@ -94,10 +94,11 @@ OpenClaw chooses a harness after provider/model resolution:
 4. If no registered harness matches, OpenClaw uses PI unless PI fallback is
 disabled.

-Forced plugin harness failures surface as run failures. In `auto` mode,
-OpenClaw may fall back to PI when the selected plugin harness fails before a
-turn has produced side effects. Set `OPENCLAW_AGENT_HARNESS_FALLBACK=none` or
-`embeddedHarness.fallback: "none"` to make that fallback a hard failure instead.
+Plugin harness failures surface as run failures. In `auto` mode, PI fallback is
+only used when no registered plugin harness supports the resolved
+provider/model. Once a plugin harness has claimed a run, OpenClaw does not
+replay that same turn through PI because that can change auth/runtime semantics
+or duplicate side effects.

 The bundled Codex plugin registers `codex` as its harness id. Core treats that
 as an ordinary plugin harness id; Codex-specific aliases belong in the plugin
@@ -149,19 +150,20 @@ When this mode runs, Codex owns the native thread id, resume behavior,
 compaction, and app-server execution. OpenClaw still owns the chat channel,
 visible transcript mirror, tool policy, approvals, media delivery, and session
 selection. Use `embeddedHarness.runtime: "codex"` with
-`embeddedHarness.fallback: "none"` when you need to prove that the Codex
-app-server path is used and PI fallback is not hiding a broken native harness.
+`embeddedHarness.fallback: "none"` when you need to prove that only the Codex
+app-server path can claim the run. That config is only a selection guard:
+Codex app-server failures already fail directly instead of retrying through PI.

 ## Disable PI fallback

 By default, OpenClaw runs embedded agents with `agents.defaults.embeddedHarness`
 set to `{ runtime: "auto", fallback: "pi" }`. In `auto` mode, registered plugin
-harnesses can claim a provider/model pair. If none match, or if an auto-selected
-plugin harness fails before producing output, OpenClaw falls back to PI.
+harnesses can claim a provider/model pair. If none match, OpenClaw falls back
+to PI.

-Set `fallback: "none"` when you need to prove that a plugin harness is the only
-runtime being exercised. This disables automatic PI fallback; it does not block
-an explicit `runtime: "pi"` or `OPENCLAW_AGENT_RUNTIME=pi`.
+Set `fallback: "none"` when you need missing plugin harness selection to fail
+instead of using PI. Selected plugin harness failures already fail hard. This
+does not block an explicit `runtime: "pi"` or `OPENCLAW_AGENT_RUNTIME=pi`.

 For Codex-only embedded runs:

@@ -105,9 +105,8 @@ describe("runAgentHarnessAttemptWithFallback", () => {
 expect(piRunAttempt).toHaveBeenCalledTimes(1);
 });

-it("falls back to the PI harness in auto mode when the selected plugin harness fails", async () => {
+it("falls back to the PI harness in auto mode when no plugin harness matches", async () => {
 process.env.OPENCLAW_AGENT_RUNTIME = "auto";
-registerFailingCodexHarness();

 const result = await runAgentHarnessAttemptWithFallback(createAttemptParams());

@@ -115,6 +114,16 @@ describe("runAgentHarnessAttemptWithFallback", () => {
 expect(piRunAttempt).toHaveBeenCalledTimes(1);
 });

+it("surfaces an auto-selected plugin harness failure instead of replaying through PI", async () => {
+process.env.OPENCLAW_AGENT_RUNTIME = "auto";
+registerFailingCodexHarness();
+
+await expect(runAgentHarnessAttemptWithFallback(createAttemptParams())).rejects.toThrow(
+"codex startup failed",
+);
+expect(piRunAttempt).not.toHaveBeenCalled();
+});
+
 it("surfaces a forced plugin harness failure instead of replaying through PI", async () => {
 process.env.OPENCLAW_AGENT_RUNTIME = "codex";
 registerFailingCodexHarness();
@@ -125,26 +134,15 @@ describe("runAgentHarnessAttemptWithFallback", () => {
 expect(piRunAttempt).not.toHaveBeenCalled();
 });

-it("disables PI retry fallback when auto-selected harness fails and fallback is none", async () => {
-process.env.OPENCLAW_AGENT_RUNTIME = "auto";
-registerFailingCodexHarness();
-
-await expect(
-runAgentHarnessAttemptWithFallback(
-createAttemptParams({ agents: { defaults: { embeddedHarness: { fallback: "none" } } } }),
-),
-).rejects.toThrow("codex startup failed");
-expect(piRunAttempt).not.toHaveBeenCalled();
-});
-
 it("honors env fallback override over config fallback", async () => {
 process.env.OPENCLAW_AGENT_RUNTIME = "auto";
 process.env.OPENCLAW_AGENT_HARNESS_FALLBACK = "none";
-registerFailingCodexHarness();

-await expect(runAgentHarnessAttemptWithFallback(createAttemptParams())).rejects.toThrow(
-"codex startup failed",
-);
+await expect(
+runAgentHarnessAttemptWithFallback(
+createAttemptParams({ agents: { defaults: { embeddedHarness: { fallback: "pi" } } } }),
+),
+).rejects.toThrow("PI fallback is disabled");
 expect(piRunAttempt).not.toHaveBeenCalled();
 });
 });
@@ -1,5 +1,6 @@
 import type { AgentEmbeddedHarnessConfig } from "../../config/types.agents-shared.js";
 import type { OpenClawConfig } from "../../config/types.openclaw.js";
+import { formatErrorMessage } from "../../infra/errors.js";
 import { createSubsystemLogger } from "../../logging/subsystem.js";
 import { normalizeAgentId } from "../../routing/session-key.js";
 import { listAgentEntries, resolveSessionAgentIds } from "../agent-scope.js";
@@ -108,13 +109,6 @@ export function selectAgentHarness(params: {
 export async function runAgentHarnessAttemptWithFallback(
 params: EmbeddedRunAttemptParams,
 ): Promise<EmbeddedRunAttemptResult> {
-const policy = resolveAgentHarnessPolicy({
-provider: params.provider,
-modelId: params.modelId,
-config: params.config,
-agentId: params.agentId,
-sessionKey: params.sessionKey,
-});
 const harness = selectAgentHarness({
 provider: params.provider,
 modelId: params.modelId,
@@ -129,11 +123,13 @@ export async function runAgentHarnessAttemptWithFallback(
 try {
 return await harness.runAttempt(params);
 } catch (error) {
-if (policy.runtime !== "auto" || policy.fallback === "none") {
-throw error;
-}
-log.warn(`${harness.label} failed; falling back to embedded PI backend`, { error });
-return createPiAgentHarness().runAttempt(params);
+log.warn(`${harness.label} failed; not falling back to embedded PI backend`, {
+harnessId: harness.id,
+provider: params.provider,
+modelId: params.modelId,
+error: formatErrorMessage(error),
+});
+throw error;
 }
 }

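Reviewer note: the behavior after this change can be summarized in a small self-contained sketch. It is not the repository's implementation. `HarnessSketch`, `PolicySketch`, `piHarnessSketch`, `selectHarnessSketch`, and `runAttemptSketch` are hypothetical stand-ins, and the error text only loosely mirrors the "PI fallback is disabled" message asserted in the tests. The sketch also assumes that a forced plugin id that is not registered falls back to PI unless `fallback` is `"none"`, which is how the docs in this commit read.

```typescript
interface HarnessSketch {
  id: string;
  supports(provider: string, modelId: string): boolean;
  runAttempt(prompt: string): Promise<string>;
}

interface PolicySketch {
  runtime: string; // "auto", "pi", or a plugin harness id
  fallback: "pi" | "none";
}

// Hypothetical stand-in for the built-in PI harness.
const piHarnessSketch: HarnessSketch = {
  id: "pi",
  supports: () => true,
  runAttempt: async (prompt) => `pi:${prompt}`,
};

// PI fallback is a selection-time decision only.
function selectHarnessSketch(
  plugins: HarnessSketch[],
  policy: PolicySketch,
  provider: string,
  modelId: string,
): HarnessSketch {
  if (policy.runtime === "pi") return piHarnessSketch;
  const match =
    policy.runtime === "auto"
      ? plugins.find((h) => h.supports(provider, modelId))
      : plugins.find((h) => h.id === policy.runtime);
  if (match) return match;
  if (policy.fallback === "none") {
    throw new Error(`No plugin harness for ${provider}/${modelId}; PI fallback is disabled`);
  }
  return piHarnessSketch;
}

async function runAttemptSketch(
  plugins: HarnessSketch[],
  policy: PolicySketch,
  provider: string,
  modelId: string,
  prompt: string,
): Promise<string> {
  const harness = selectHarnessSketch(plugins, policy, provider, modelId);
  // No PI retry here: once a harness is selected, its failure propagates
  // to the caller instead of replaying the turn through PI.
  return harness.runAttempt(prompt);
}
```

The design point the sketch tries to capture is that `fallback` now only guards selection. Once a harness is selected, the turn is never replayed, so a failing plugin harness cannot produce duplicate side effects or a secondary PI auth error.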
@@ -3023,7 +3023,7 @@ export const GENERATED_BASE_CONFIG_SCHEMA: BaseConfigSchemaResponse = {
 enum: ["pi", "none"],
 title: "Default Embedded Harness Fallback",
 description:
-"Embedded harness fallback when no plugin harness matches or an auto-selected plugin harness fails before side effects. Set none to disable automatic PI fallback.",
+"Embedded harness fallback when no plugin harness matches. Selected plugin harness failures surface directly. Set none to disable automatic PI fallback.",
 },
 },
 additionalProperties: false,
@@ -5721,7 +5721,7 @@ export const GENERATED_BASE_CONFIG_SCHEMA: BaseConfigSchemaResponse = {
 additionalProperties: false,
 title: "Agent Embedded Harness",
 description:
-"Per-agent embedded harness policy override. Use fallback=none to make this agent fail instead of falling back to PI.",
+"Per-agent embedded harness policy override. Use fallback=none to make missing plugin harness selection fail instead of falling back to PI.",
 },
 model: {
 anyOf: [
@@ -23416,7 +23416,7 @@ export const GENERATED_BASE_CONFIG_SCHEMA: BaseConfigSchemaResponse = {
 },
 "agents.defaults.embeddedHarness.fallback": {
 label: "Default Embedded Harness Fallback",
-help: "Embedded harness fallback when no plugin harness matches or an auto-selected plugin harness fails before side effects. Set none to disable automatic PI fallback.",
+help: "Embedded harness fallback when no plugin harness matches. Selected plugin harness failures surface directly. Set none to disable automatic PI fallback.",
 tags: ["reliability"],
 },
 "agents.list": {
@@ -23461,7 +23461,7 @@ export const GENERATED_BASE_CONFIG_SCHEMA: BaseConfigSchemaResponse = {
 },
 "agents.list.*.embeddedHarness": {
 label: "Agent Embedded Harness",
-help: "Per-agent embedded harness policy override. Use fallback=none to make this agent fail instead of falling back to PI.",
+help: "Per-agent embedded harness policy override. Use fallback=none to make missing plugin harness selection fail instead of falling back to PI.",
 tags: ["advanced"],
 },
 "agents.list.*.embeddedHarness.runtime": {
@@ -1145,9 +1145,9 @@ export const FIELD_HELP: Record<string, string> = {
 "agents.defaults.embeddedHarness.runtime":
 "Embedded harness runtime: auto, pi, or a registered plugin harness id such as codex.",
 "agents.defaults.embeddedHarness.fallback":
-"Embedded harness fallback when no plugin harness matches or an auto-selected plugin harness fails before side effects. Set none to disable automatic PI fallback.",
+"Embedded harness fallback when no plugin harness matches. Selected plugin harness failures surface directly. Set none to disable automatic PI fallback.",
 "agents.list.*.embeddedHarness":
-"Per-agent embedded harness policy override. Use fallback=none to make this agent fail instead of falling back to PI.",
+"Per-agent embedded harness policy override. Use fallback=none to make missing plugin harness selection fail instead of falling back to PI.",
 "agents.list.*.embeddedHarness.runtime":
 "Per-agent embedded harness runtime: auto, pi, or a registered plugin harness id such as codex.",
 "agents.list.*.embeddedHarness.fallback":