openclaw/src/agents/model-fallback.run-embedded.e2e.test.ts
Peter Steinberger bb46b79d3c refactor: internalize OpenClaw agent runtime (#85341)
* refactor: extract agent core package

Introduce packages/agent-core as the OpenClaw-owned home for reusable agent loop, harness, session, prompt, and runtime dependency contracts.

* refactor: extract shared llm runtime

Move provider model registries, stream wrappers, OAuth helpers, and LLM utilities into src/llm with plugin-sdk barrels instead of depending on the old embedded runtime layout.

* refactor: remove pi runtime internals

Rename remaining Pi-shaped agent surfaces to OpenClaw agent runtime names, delete obsolete Pi docs and package graph checks, and add the third-party notice for incorporated code.

* refactor: tighten agent session runtime

Make agent-core/runtime dependencies explicit, consolidate compaction and session transcript helpers, and move model/session helpers behind OpenClaw-owned contracts.

* refactor: remove static model and pi auth paths

Drop static model catalogs and Pi auth bridges, move model/provider facts to manifest-owned runtime contracts, and harden internal embedded-agent utilities.

* refactor: remove legacy provider compat paths

* docs: remove agent parity notes

* fix: skip provider wildcard metadata parsing

* refactor: share session extension sdk loading

* refactor: inline acpx proxy error formatter

* refactor: fold edit recovery into edit tool

* fix: accept extension batch separator

* test: align startup provider plugin expectations

* fix: restore provider-scoped release discovery

* test: align static asset packaging expectations

* fix: run static provider catalogs during scoped discovery

* fix: add provider entry catalogs for scoped live discovery

* fix: load lightweight provider catalog entries

* fix: refresh provider-scoped plugin metadata

* fix: keep provider catalog entries on release live path

* fix: keep static manifest models in release live checks

* fix: harden release model discovery

* fix: reduce OpenAI live cache probe reasoning

* fix: disable OpenAI cache probe reasoning

* ci: extend OpenAI gateway live timeout

* fix: extend live gateway model budget

* fix: stabilize release validation regressions

* fix: honor provider aliases in model rows

* fix: stabilize release validation lanes

* fix: stabilize release memory qa

* ci: stabilize release validation lanes

* ci: prefer ipv4 for live docker node calls

* fix: restore shared tool-call stream wrapper

* ci: remove legacy pi test shard alias

* fix: clean up embedded agent test drift

* fix: stabilize runtime alias status

* fix: clean up embedded agent ci drift

* fix: restore release ci invariants

* fix: clean up post-rebase runtime drift

* fix: restore release ci checks

* fix: restore release ci after rebase

* fix: remove stale pi runtime path

* test: align compaction runtime expectations

* test: update plugin prerelease expectations

* fix: handle claude live tool approvals

* fix: stabilize release validation gates

* fix: finish agent runtime import

* test: finish post-rebase agent runtime mocks

* fix: keep codex compaction native

* fix: stabilize codex app-server hook tests

* test: isolate codex diagnostic active run

* test: remove codex diagnostic completion race

# Conflicts:
#	extensions/codex/src/app-server/run-attempt.test.ts

* ci: fix full release manifest performance run id

* refactor: narrow llm plugin sdk boundary

* chore: drop generated google boundary stamps

* fix: repair rebase fallout

* fix: clean up rebased runtime references

* fix: decode codex jwt payloads as base64url

* fix: preserve shipped pi runtime alias

* fix: add scoped sdk virtual modules

* fix: decode llm codex oauth jwt as base64url

* fix: avoid stale vertex adc negative cache

* fix: harden tool arg decoding and codeql path

* fix: keep vertex adc negative checks live

* refactor: consolidate codex jwt and edit helpers

* fix: await codex oauth node runtime imports

* fix: preserve sdk tool and notice contracts

* fix: preserve shipped compat config boundaries

* fix: align codex oauth callback host

* fix: terminate agent-core loop streams on failure

* fix: keep codex oauth callback alive during fallback

* ci: include session tools in critical codeql scans

* fix: keep Cloudflare Anthropic provider auth header

* docs: redirect legacy pi runtime pages

* fix: honor bundled web provider compat discovery

* fix: protect session output spill files

* fix: keep legacy agent dir env blocked

* fix: contain auto-discovered skill symlinks

* fix: harden agent core sdk proxy surfaces

* fix: restore approval reaction sdk compat

* fix: keep live docker runs bounded

* fix: keep codex oauth redirect host aligned

* fix: resolve post-rebase agent runtime drift

* fix: redact anthropic oauth parse failures

* fix: preserve responses strict tool shaping

* fix: repair agent runtime rebase cleanup

* docs: redirect retired parity pages

* fix: bound auto-discovered resources to roots

* fix: repair post-rebase agent test drift

* fix: preserve bundled provider allowlist migration

* fix: preserve manifest-owned provider aliases

* fix: declare photon image dependency

* fix: keep provider headers out of proxy body

* fix: preserve shipped env aliases

* fix: refresh control ui i18n generated state

* fix: quote read fallback paths

* fix: preview edits through configured backend

* test: satisfy core test typecheck

* fix: preserve ZAI usage auth fallback

* test: repair codex diagnostic test

* fix: repair agent runtime rebase drift

* test: finish embedded runner import rename

* fix: repair agent runtime rebase integrations

* test: align compaction oauth fallback expectations

* fix: allow sdk-auth session models

* fix: update doctor tool schema import

* fix: preserve bedrock plugin region

* fix: stream harmony-like prose immediately

* ci: include session runtime in codeql shards

* fix: repair latest rebase integrations

* fix: honor explicit codex websocket transport

* fix: keep openai-compatible credentials provider-scoped

* fix: refresh sdk api baseline after rebase

* fix: route cli runtime aliases through openclaw harness

* test: rename stale harness mock expectation

* test: rename embedded agent overflow calls

* test: clean embedded auth test wording

* test: use openclaw stream types in deepinfra cache test

* fix: refresh sdk api baseline on latest main

* fix: honor bundled discovery compat allowlists

* fix: refresh sdk api baseline after latest rebase

* fix: remove stale rebase imports

* test: rename stale model catalog mock

* test: mock renamed doctor runtime modules

* fix: map canonical kimi env auth

* fix: use internal model registry in bench script

* fix: migrate deepinfra provider catalog entry

* fix: enforce builtin tool suppression

* fix: route compaction auth and proxy payloads safely

* refactor: prune unused llm registry leftovers

* test: update codex hooks session import

* test: fix model picker ci coverage

* test: align model picker auth mock types
2026-05-27 19:24:04 +01:00

874 lines
29 KiB
TypeScript

import fs from "node:fs/promises";
import os from "node:os";
import path from "node:path";
import { beforeAll, beforeEach, describe, expect, it, vi } from "vitest";
import type { OpenClawConfig } from "../config/config.js";
import type { AuthProfileFailureReason } from "./auth-profiles.js";
import { classifyEmbeddedAgentRunResultForModelFallback } from "./embedded-agent-runner/result-fallback-classifier.js";
import type { EmbeddedRunAttemptResult } from "./embedded-agent-runner/run/types.js";
import { runWithModelFallback } from "./model-fallback.js";
import {
  buildEmbeddedRunnerAssistant,
  createResolvedEmbeddedRunnerModel,
  makeEmbeddedRunnerAttempt,
} from "./test-helpers/embedded-agent-runner-e2e-fixtures.js";
import {
  installEmbeddedRunnerBackoffE2eMocks,
  installEmbeddedRunnerBaseE2eMocks,
  installEmbeddedRunnerFastRunE2eMocks,
} from "./test-helpers/embedded-agent-runner-e2e-mocks.js";

const runEmbeddedAttemptMock = vi.fn<(params: unknown) => Promise<EmbeddedRunAttemptResult>>();
const { computeBackoffMock, sleepWithAbortMock } = vi.hoisted(() => ({
  computeBackoffMock: vi.fn(
    (
      _policy: { initialMs: number; maxMs: number; factor: number; jitter: number },
      _attempt: number,
    ) => 321,
  ),
  sleepWithAbortMock: vi.fn(async (_ms: number, _abortSignal?: AbortSignal) => undefined),
}));

vi.mock("./models-config.js", async () => {
  const mod = await vi.importActual<typeof import("./models-config.js")>("./models-config.js");
  return {
    ...mod,
    ensureOpenClawModelsJson: vi.fn(async () => ({ wrote: false })),
  };
});

const installRunEmbeddedMocks = () => {
  installEmbeddedRunnerBaseE2eMocks();
  installEmbeddedRunnerFastRunE2eMocks({
    runEmbeddedAttempt: (params) => runEmbeddedAttemptMock(params),
  });
  installEmbeddedRunnerBackoffE2eMocks({
    computeBackoff: (policy, attempt) => computeBackoffMock(policy, attempt),
    sleepWithAbort: (ms, abortSignal) => sleepWithAbortMock(ms, abortSignal),
  });
  vi.doMock("./embedded-agent-runner/model.js", () => ({
    resolveModelAsync: async (provider: string, modelId: string) =>
      createResolvedEmbeddedRunnerModel(provider, modelId),
  }));
};

let runEmbeddedAgent: typeof import("./embedded-agent-runner/run.js").runEmbeddedAgent;

beforeAll(async () => {
  vi.resetModules();
  installRunEmbeddedMocks();
  ({ runEmbeddedAgent } = await import("./embedded-agent-runner/run.js"));
});

beforeEach(() => {
  runEmbeddedAttemptMock.mockReset();
  computeBackoffMock.mockClear();
  sleepWithAbortMock.mockClear();
});

const OVERLOADED_ERROR_PAYLOAD =
  '{"type":"error","error":{"type":"overloaded_error","message":"Overloaded"}}';
const RATE_LIMIT_ERROR_MESSAGE = "rate limit exceeded";
const NO_ENDPOINTS_FOUND_ERROR_MESSAGE = "404 No endpoints found for deepseek/deepseek-r1:free.";

type EmbeddedAttemptParams = {
  provider: string;
  modelId?: string;
  authProfileId?: string;
};

function makeConfig(): OpenClawConfig {
  const apiKeyField = ["api", "Key"].join("");
  return {
    agents: {
      defaults: {
        model: {
          primary: "openai/mock-1",
          fallbacks: ["groq/mock-2"],
        },
      },
    },
    models: {
      providers: {
        openai: {
          api: "openai-responses",
          [apiKeyField]: "openai-test-key", // pragma: allowlist secret
          baseUrl: "https://example.com/openai",
          models: [
            {
              id: "mock-1",
              name: "Mock 1",
              reasoning: false,
              input: ["text"],
              cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
              contextWindow: 16_000,
              maxTokens: 2048,
            },
          ],
        },
        groq: {
          api: "openai-responses",
          [apiKeyField]: "groq-test-key", // pragma: allowlist secret
          baseUrl: "https://example.com/groq",
          models: [
            {
              id: "mock-2",
              name: "Mock 2",
              reasoning: false,
              input: ["text"],
              cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
              contextWindow: 16_000,
              maxTokens: 2048,
            },
          ],
        },
      },
    },
  } satisfies OpenClawConfig;
}

async function withAgentWorkspace<T>(
  fn: (ctx: { agentDir: string; workspaceDir: string }) => Promise<T>,
): Promise<T> {
  const root = await fs.mkdtemp(path.join(os.tmpdir(), "openclaw-model-fallback-"));
  const agentDir = path.join(root, "agent");
  const workspaceDir = path.join(root, "workspace");
  await fs.mkdir(agentDir, { recursive: true });
  await fs.mkdir(workspaceDir, { recursive: true });
  try {
    return await fn({ agentDir, workspaceDir });
  } finally {
    await fs.rm(root, { recursive: true, force: true });
  }
}

async function writeAuthStore(
  agentDir: string,
  usageStats?: Record<
    string,
    {
      lastUsed?: number;
      cooldownUntil?: number;
      disabledUntil?: number;
      disabledReason?: AuthProfileFailureReason;
      failureCounts?: Partial<Record<AuthProfileFailureReason, number>>;
    }
  >,
) {
  await fs.writeFile(
    path.join(agentDir, "auth-profiles.json"),
    JSON.stringify({
      version: 1,
      profiles: {
        "openai:p1": { type: "api_key", provider: "openai", key: "sk-openai" },
        "groq:p1": { type: "api_key", provider: "groq", key: "sk-groq" },
      },
    }),
  );
  await fs.writeFile(
    path.join(agentDir, "auth-state.json"),
    JSON.stringify({
      version: 1,
      usageStats:
        usageStats ??
        ({
          "openai:p1": { lastUsed: 1 },
          "groq:p1": { lastUsed: 2 },
        } as const),
    }),
  );
}

async function readUsageStats(agentDir: string) {
  const raw = await fs.readFile(path.join(agentDir, "auth-state.json"), "utf-8");
  return JSON.parse(raw).usageStats as Record<string, Record<string, unknown> | undefined>;
}

function expectFailureCount(
  usageStats: Record<string, Record<string, unknown> | undefined>,
  profileId: string,
  reason: AuthProfileFailureReason,
  expected: number,
) {
  const failureCounts = usageStats[profileId]?.failureCounts as Record<string, unknown> | undefined;
  expect(failureCounts?.[reason]).toBe(expected);
}

async function writeMultiProfileAuthStore(agentDir: string) {
  await fs.writeFile(
    path.join(agentDir, "auth-profiles.json"),
    JSON.stringify({
      version: 1,
      profiles: {
        "openai:p1": { type: "api_key", provider: "openai", key: "sk-openai-1" },
        "openai:p2": { type: "api_key", provider: "openai", key: "sk-openai-2" },
        "openai:p3": { type: "api_key", provider: "openai", key: "sk-openai-3" },
        "groq:p1": { type: "api_key", provider: "groq", key: "sk-groq" },
      },
    }),
  );
  await fs.writeFile(
    path.join(agentDir, "auth-state.json"),
    JSON.stringify({
      version: 1,
      usageStats: {
        "openai:p1": { lastUsed: 1 },
        "openai:p2": { lastUsed: 2 },
        "openai:p3": { lastUsed: 3 },
        "groq:p1": { lastUsed: 4 },
      },
    }),
  );
}

async function runEmbeddedFallback(params: {
  agentDir: string;
  workspaceDir: string;
  sessionKey: string;
  runId: string;
  abortSignal?: AbortSignal;
  config?: OpenClawConfig;
}) {
  const cfg = params.config ?? makeConfig();
  return await runWithModelFallback({
    cfg,
    provider: "openai",
    model: "mock-1",
    runId: params.runId,
    agentDir: params.agentDir,
    run: (provider, model, options) =>
      runEmbeddedAgent({
        sessionId: `session:${params.runId}`,
        sessionKey: params.sessionKey,
        sessionFile: path.join(params.workspaceDir, `${params.runId}.jsonl`),
        workspaceDir: params.workspaceDir,
        agentDir: params.agentDir,
        config: cfg,
        prompt: "hello",
        provider,
        model,
        authProfileIdSource: "auto",
        allowTransientCooldownProbe: options?.allowTransientCooldownProbe,
        timeoutMs: 5_000,
        runId: params.runId,
        abortSignal: params.abortSignal,
        enqueue: async (task) => await task(),
      }),
  });
}

function mockPrimaryOverloadedThenFallbackSuccess() {
  mockPrimaryErrorThenFallbackSuccess(OVERLOADED_ERROR_PAYLOAD);
}

function makeFallbackSuccessAttempt(): EmbeddedRunAttemptResult {
  return makeEmbeddedRunnerAttempt({
    assistantTexts: ["fallback ok"],
    lastAssistant: buildEmbeddedRunnerAssistant({
      provider: "groq",
      model: "mock-2",
      stopReason: "stop",
      content: [{ type: "text", text: "fallback ok" }],
    }),
  });
}

function mockPrimaryFailureThenFallbackSuccess(
  makePrimaryAttempt: (
    attemptParams: EmbeddedAttemptParams,
  ) => EmbeddedRunAttemptResult | Promise<EmbeddedRunAttemptResult>,
) {
  runEmbeddedAttemptMock.mockImplementation(async (params: unknown) => {
    const attemptParams = params as EmbeddedAttemptParams;
    if (attemptParams.provider === "openai") {
      return await makePrimaryAttempt(attemptParams);
    }
    if (attemptParams.provider === "groq") {
      return makeFallbackSuccessAttempt();
    }
    throw new Error(`Unexpected provider ${attemptParams.provider}`);
  });
}

function mockPrimaryPromptErrorThenFallbackSuccess(errorMessage: string) {
  mockPrimaryFailureThenFallbackSuccess(() =>
    makeEmbeddedRunnerAttempt({
      promptError: new Error(errorMessage),
    }),
  );
}

function mockPrimaryErrorThenFallbackSuccess(errorMessage: string) {
  mockPrimaryFailureThenFallbackSuccess(() =>
    makeEmbeddedRunnerAttempt({
      assistantTexts: [],
      lastAssistant: buildEmbeddedRunnerAssistant({
        provider: "openai",
        model: "mock-1",
        stopReason: "error",
        errorMessage,
      }),
    }),
  );
}

function mockPrimaryStaleRateLimitTextSuccess(errorMessage: string) {
  mockPrimaryFailureThenFallbackSuccess(() =>
    makeEmbeddedRunnerAttempt({
      assistantTexts: ["primary ok"],
      lastAssistant: buildEmbeddedRunnerAssistant({
        provider: "openai",
        model: "mock-1",
        stopReason: "stop",
        content: [{ type: "text", text: "primary ok" }],
        errorMessage,
      }),
    }),
  );
}

function expectOpenAiThenGroqAttemptOrder(params?: { expectOpenAiAuthProfileId?: string }) {
  expect(runEmbeddedAttemptMock).toHaveBeenCalledTimes(2);
  const firstCall = runEmbeddedAttemptMock.mock.calls[0]?.[0] as
    | { provider?: string; authProfileId?: string }
    | undefined;
  const secondCall = runEmbeddedAttemptMock.mock.calls[1]?.[0] as { provider?: string } | undefined;
  if (!firstCall || !secondCall) {
    throw new Error("expected primary and fallback embedded run attempts");
  }
  expect(firstCall.provider).toBe("openai");
  if (params?.expectOpenAiAuthProfileId) {
    expect(firstCall.authProfileId).toBe(params.expectOpenAiAuthProfileId);
  }
  expect(secondCall.provider).toBe("groq");
}

function mockAllProvidersOverloaded() {
  runEmbeddedAttemptMock.mockImplementation(async (params: unknown) => {
    const attemptParams = params as { provider: string; modelId: string; authProfileId?: string };
    if (attemptParams.provider === "openai" || attemptParams.provider === "groq") {
      return makeEmbeddedRunnerAttempt({
        assistantTexts: [],
        lastAssistant: buildEmbeddedRunnerAssistant({
          provider: attemptParams.provider,
          model: attemptParams.provider === "openai" ? "mock-1" : "mock-2",
          stopReason: "error",
          errorMessage: OVERLOADED_ERROR_PAYLOAD,
        }),
      });
    }
    throw new Error(`Unexpected provider ${attemptParams.provider}`);
  });
}

function countProviderAttempts(provider: string) {
  return runEmbeddedAttemptMock.mock.calls.filter(
    (call) => (call[0] as { provider?: string })?.provider === provider,
  ).length;
}

function expectProviderAttemptCounts(expected: { openai: number; groq: number }) {
  expect(countProviderAttempts("openai")).toBe(expected.openai);
  expect(countProviderAttempts("groq")).toBe(expected.groq);
}

describe("runWithModelFallback + runEmbeddedAgent failover behavior", () => {
  it("keeps tool summary on incomplete side-effect terminal results", async () => {
    await withAgentWorkspace(async ({ agentDir, workspaceDir }) => {
      await writeAuthStore(agentDir);
      runEmbeddedAttemptMock.mockResolvedValueOnce(
        makeEmbeddedRunnerAttempt({
          toolMetas: [{ toolName: "write", meta: "path=out.txt" }],
          lastAssistant: buildEmbeddedRunnerAssistant({
            provider: "openai",
            model: "mock-1",
            stopReason: "stop",
            content: [],
          }),
        }),
      );
      const result = await runEmbeddedAgent({
        sessionId: "session:tool-side-effect-terminal",
        sessionKey: "agent:test:tool-side-effect-terminal",
        sessionFile: path.join(workspaceDir, "tool-side-effect-terminal.jsonl"),
        workspaceDir,
        agentDir,
        config: makeConfig(),
        prompt: "write the file",
        provider: "openai",
        model: "mock-1",
        authProfileIdSource: "auto",
        timeoutMs: 5_000,
        runId: "run:tool-side-effect-terminal",
        enqueue: async (task) => await task(),
      });
      expect(result.meta.toolSummary?.calls).toBe(1);
      expect(result.meta.toolSummary?.tools).toEqual(["write"]);
      expect(
        classifyEmbeddedAgentRunResultForModelFallback({
          provider: "openai-codex",
          model: "gpt-5.4",
          result,
        }),
      ).toBeNull();
    });
  });

  it("falls back on OpenRouter-style no-endpoints assistant errors", async () => {
    await withAgentWorkspace(async ({ agentDir, workspaceDir }) => {
      await writeAuthStore(agentDir);
      mockPrimaryErrorThenFallbackSuccess(NO_ENDPOINTS_FOUND_ERROR_MESSAGE);
      const result = await runEmbeddedFallback({
        agentDir,
        workspaceDir,
        sessionKey: "agent:test:model-not-found-no-endpoints",
        runId: "run:model-not-found-no-endpoints",
      });
      expect(result.provider).toBe("groq");
      expect(result.model).toBe("mock-2");
      expect(result.attempts[0]?.reason).toBe("model_not_found");
      expect(result.result.payloads?.[0]?.text ?? "").toContain("fallback ok");
      expectOpenAiThenGroqAttemptOrder();
    });
  });

  it("falls back on timeout errors using defaults-only model fallbacks", async () => {
    await withAgentWorkspace(async ({ agentDir, workspaceDir }) => {
      await writeAuthStore(agentDir);
      mockPrimaryErrorThenFallbackSuccess("LLM request timed out.");
      const result = await runEmbeddedFallback({
        agentDir,
        workspaceDir,
        sessionKey: "agent:test:timeout-defaults-fallback",
        runId: "run:timeout-defaults-fallback",
      });
      expect(result.provider).toBe("groq");
      expect(result.model).toBe("mock-2");
      expect(result.attempts[0]?.reason).toBe("timeout");
      expect(result.result.payloads?.[0]?.text ?? "").toContain("fallback ok");
      expectOpenAiThenGroqAttemptOrder();
    });
  });

  it("falls back across providers after overloaded primary failure and persists transient cooldown", async () => {
    await withAgentWorkspace(async ({ agentDir, workspaceDir }) => {
      await writeAuthStore(agentDir);
      mockPrimaryOverloadedThenFallbackSuccess();
      const result = await runEmbeddedFallback({
        agentDir,
        workspaceDir,
        sessionKey: "agent:test:overloaded-cross-provider",
        runId: "run:overloaded-cross-provider",
      });
      expect(result.provider).toBe("groq");
      expect(result.model).toBe("mock-2");
      expect(result.attempts[0]?.reason).toBe("overloaded");
      expect(result.result.payloads?.[0]?.text ?? "").toContain("fallback ok");
      const usageStats = await readUsageStats(agentDir);
      expect(typeof usageStats["openai:p1"]?.cooldownUntil).toBe("number");
      expectFailureCount(usageStats, "openai:p1", "overloaded", 1);
      expect(typeof usageStats["groq:p1"]?.lastUsed).toBe("number");
      expectOpenAiThenGroqAttemptOrder();
      expect(computeBackoffMock).not.toHaveBeenCalled();
      expect(sleepWithAbortMock).not.toHaveBeenCalled();
    });
  });

  it("falls back across providers after bare Codex/Undici transport failures", async () => {
    const cases = [
      {
        name: "undici-terminated",
        message: "terminated",
      },
      {
        name: "stream-read-error",
        message: "stream_read_error",
      },
      {
        name: "codex-empty-transport-response",
        message: "Request failed",
      },
    ] as const;
    for (const { name, message } of cases) {
      await withAgentWorkspace(async ({ agentDir, workspaceDir }) => {
        await writeAuthStore(agentDir);
        runEmbeddedAttemptMock.mockClear();
        computeBackoffMock.mockClear();
        sleepWithAbortMock.mockClear();
        mockPrimaryErrorThenFallbackSuccess(message);
        const result = await runEmbeddedFallback({
          agentDir,
          workspaceDir,
          sessionKey: `agent:test:transport-fallback:${name}`,
          runId: `run:transport-fallback:${name}`,
        });
        expect(result.provider).toBe("groq");
        expect(result.model).toBe("mock-2");
        expect(result.attempts[0]?.reason).toBe("timeout");
        expect(result.result.payloads?.[0]?.text ?? "").toContain("fallback ok");
        const usageStats = await readUsageStats(agentDir);
        expect(usageStats["openai:p1"]?.cooldownUntil).toBeUndefined();
        expect(usageStats["openai:p1"]?.failureCounts).toBeUndefined();
        expect(typeof usageStats["groq:p1"]?.lastUsed).toBe("number");
        expectOpenAiThenGroqAttemptOrder();
        expect(computeBackoffMock).not.toHaveBeenCalled();
        expect(sleepWithAbortMock).not.toHaveBeenCalled();
      });
    }
  });

  it("falls back across providers after a bare leading 402 quota-refresh assistant error", async () => {
    await withAgentWorkspace(async ({ agentDir, workspaceDir }) => {
      await writeAuthStore(agentDir);
      mockPrimaryErrorThenFallbackSuccess(
        "402 You have reached your subscription quota limit. Please wait for automatic quota refresh in the rolling time window, upgrade to a higher plan, or use a Pay-As-You-Go API Key for unlimited access.",
      );
      const result = await runEmbeddedFallback({
        agentDir,
        workspaceDir,
        sessionKey: "agent:test:bare-402-cross-provider",
        runId: "run:bare-402-cross-provider",
      });
      expect(result.provider).toBe("groq");
      expect(result.model).toBe("mock-2");
      expect(result.attempts[0]?.reason).toBe("rate_limit");
      expect(result.result.payloads?.[0]?.text ?? "").toContain("fallback ok");
      expectOpenAiThenGroqAttemptOrder();
    });
  });

  it("surfaces a bounded overloaded summary when every fallback candidate is overloaded", async () => {
    await withAgentWorkspace(async ({ agentDir, workspaceDir }) => {
      await writeAuthStore(agentDir);
      mockAllProvidersOverloaded();
      let thrown: unknown;
      try {
        await runEmbeddedFallback({
          agentDir,
          workspaceDir,
          sessionKey: "agent:test:all-overloaded",
          runId: "run:all-overloaded",
        });
      } catch (err) {
        thrown = err;
      }
      expect(thrown).toBeInstanceOf(Error);
      expect((thrown as Error).message).toMatch(/^All models failed \(2\): /);
      expect((thrown as Error).message).toMatch(
        /openai\/mock-1: .* \(overloaded\) \| groq\/mock-2: .* \(overloaded\)/,
      );
      expect(runEmbeddedAttemptMock).toHaveBeenCalledTimes(2);
      expect(computeBackoffMock).not.toHaveBeenCalled();
      expect(sleepWithAbortMock).not.toHaveBeenCalled();
    });
  });

  it("probes a provider already in overloaded cooldown before falling back", async () => {
    await withAgentWorkspace(async ({ agentDir, workspaceDir }) => {
      const now = Date.now();
      await writeAuthStore(agentDir, {
        "openai:p1": {
          lastUsed: 1,
          cooldownUntil: now + 60_000,
          failureCounts: { overloaded: 2 },
        },
        "groq:p1": { lastUsed: 2 },
      });
      mockPrimaryOverloadedThenFallbackSuccess();
      const result = await runEmbeddedFallback({
        agentDir,
        workspaceDir,
        sessionKey: "agent:test:overloaded-probe-fallback",
        runId: "run:overloaded-probe-fallback",
      });
      expect(result.provider).toBe("groq");
      expectOpenAiThenGroqAttemptOrder({ expectOpenAiAuthProfileId: "openai:p1" });
    });
  });

  it("persists overloaded cooldown across turns while still allowing one probe and fallback", async () => {
    await withAgentWorkspace(async ({ agentDir, workspaceDir }) => {
      await writeAuthStore(agentDir);
      mockPrimaryOverloadedThenFallbackSuccess();
      const firstResult = await runEmbeddedFallback({
        agentDir,
        workspaceDir,
        sessionKey: "agent:test:overloaded-two-turns:first",
        runId: "run:overloaded-two-turns:first",
      });
      expect(firstResult.provider).toBe("groq");
      runEmbeddedAttemptMock.mockClear();
      computeBackoffMock.mockClear();
      sleepWithAbortMock.mockClear();
      mockPrimaryOverloadedThenFallbackSuccess();
      const secondResult = await runEmbeddedFallback({
        agentDir,
        workspaceDir,
        sessionKey: "agent:test:overloaded-two-turns:second",
        runId: "run:overloaded-two-turns:second",
      });
      expect(secondResult.provider).toBe("groq");
      expectOpenAiThenGroqAttemptOrder({ expectOpenAiAuthProfileId: "openai:p1" });
      const usageStats = await readUsageStats(agentDir);
      expect(typeof usageStats["openai:p1"]?.cooldownUntil).toBe("number");
      expectFailureCount(usageStats, "openai:p1", "overloaded", 2);
      expect(computeBackoffMock).not.toHaveBeenCalled();
      expect(sleepWithAbortMock).not.toHaveBeenCalled();
    });
  });

  it("keeps bare service-unavailable failures in the timeout lane without persisting cooldown", async () => {
    await withAgentWorkspace(async ({ agentDir, workspaceDir }) => {
      await writeAuthStore(agentDir);
      mockPrimaryErrorThenFallbackSuccess("LLM error: service unavailable");
      const result = await runEmbeddedFallback({
        agentDir,
        workspaceDir,
        sessionKey: "agent:test:timeout-cross-provider",
        runId: "run:timeout-cross-provider",
      });
      expect(result.provider).toBe("groq");
      expect(result.attempts[0]?.reason).toBe("timeout");
      const usageStats = await readUsageStats(agentDir);
      expect(usageStats["openai:p1"]?.cooldownUntil).toBeUndefined();
      expect(usageStats["openai:p1"]?.failureCounts).toBeUndefined();
      expect(computeBackoffMock).not.toHaveBeenCalled();
      expect(sleepWithAbortMock).not.toHaveBeenCalled();
    });
  });

  it("rethrows AbortError during overload backoff instead of falling through fallback", async () => {
    await withAgentWorkspace(async ({ agentDir, workspaceDir }) => {
      await writeAuthStore(agentDir);
      const controller = new AbortController();
      mockPrimaryOverloadedThenFallbackSuccess();
      sleepWithAbortMock.mockImplementationOnce(async () => {
        controller.abort();
        throw new Error("aborted");
      });
      let thrown: unknown;
      try {
        await runEmbeddedFallback({
          agentDir,
          workspaceDir,
          sessionKey: "agent:test:overloaded-backoff-abort",
          runId: "run:overloaded-backoff-abort",
          abortSignal: controller.signal,
          config: {
            ...makeConfig(),
            auth: { cooldowns: { overloadedBackoffMs: 321 } },
          },
        });
      } catch (error) {
        thrown = error;
      }
      expect(thrown).toBeInstanceOf(Error);
      expect((thrown as Error).name).toBe("AbortError");
      expect((thrown as Error).message).toBe("Operation aborted");
      expect(runEmbeddedAttemptMock).toHaveBeenCalledTimes(1);
      const firstCall = runEmbeddedAttemptMock.mock.calls[0]?.[0] as
        | { provider?: string }
        | undefined;
      expect(firstCall?.provider).toBe("openai");
    });
  });

  it("caps overloaded profile rotations and escalates to cross-provider fallback (#58348)", async () => {
    // When a provider has multiple auth profiles and all of them return
    // overloaded_error, the runner should not exhaust every profile before
    // falling back. It should cap same-provider rotations at
    // overloadedProfileRotations=1 and then escalate to cross-provider fallback.
    await withAgentWorkspace(async ({ agentDir, workspaceDir }) => {
      await writeMultiProfileAuthStore(agentDir);
      mockPrimaryOverloadedThenFallbackSuccess();
      const result = await runEmbeddedFallback({
        agentDir,
        workspaceDir,
        sessionKey: "agent:test:overloaded-multi-profile-cap",
        runId: "run:overloaded-multi-profile-cap",
      });
      // Should fall back to groq instead of exhausting all 3 openai profiles.
      expect(result.provider).toBe("groq");
      expect(result.model).toBe("mock-2");
      expect(result.result.payloads?.[0]?.text ?? "").toContain("fallback ok");
      // With overloadedProfileRotations=1, we expect:
      // - 1 initial openai attempt (p1)
      // - 1 rotation to p2 (capped)
      // - escalation to groq (1 attempt)
      // Total: 3 attempts (2 openai + 1 groq), not 4 (3 openai + 1 groq,
      // which would mean every openai profile was tried).
      expectProviderAttemptCounts({ openai: 2, groq: 1 });
    });
  });

  it("respects overloadedProfileRotations=0 and falls back immediately", async () => {
    await withAgentWorkspace(async ({ agentDir, workspaceDir }) => {
      await writeMultiProfileAuthStore(agentDir);
      mockPrimaryOverloadedThenFallbackSuccess();
      const result = await runEmbeddedFallback({
        agentDir,
        workspaceDir,
        sessionKey: "agent:test:overloaded-no-rotation",
        runId: "run:overloaded-no-rotation",
        config: {
          ...makeConfig(),
          auth: { cooldowns: { overloadedProfileRotations: 0 } },
        },
      });
      expect(result.provider).toBe("groq");
      expectProviderAttemptCounts({ openai: 1, groq: 1 });
    });
  });

  it("caps rate-limit profile rotations and escalates to cross-provider fallback (#58572)", async () => {
    await withAgentWorkspace(async ({ agentDir, workspaceDir }) => {
      await writeMultiProfileAuthStore(agentDir);
      mockPrimaryErrorThenFallbackSuccess(RATE_LIMIT_ERROR_MESSAGE);
      const result = await runEmbeddedFallback({
        agentDir,
        workspaceDir,
        sessionKey: "agent:test:rate-limit-multi-profile-cap",
        runId: "run:rate-limit-multi-profile-cap",
      });
      expect(result.provider).toBe("groq");
      expect(result.model).toBe("mock-2");
      expect(result.result.payloads?.[0]?.text ?? "").toContain("fallback ok");
      expectProviderAttemptCounts({ openai: 2, groq: 1 });
    });
  });

  it("ignores stale classified rate-limit text when stopReason is not error", async () => {
    await withAgentWorkspace(async ({ agentDir, workspaceDir }) => {
      await writeMultiProfileAuthStore(agentDir);
      mockPrimaryStaleRateLimitTextSuccess(RATE_LIMIT_ERROR_MESSAGE);
      const result = await runEmbeddedFallback({
        agentDir,
        workspaceDir,
        sessionKey: "agent:test:rate-limit-retry-limit-fallback",
        runId: "run:rate-limit-retry-limit-fallback",
        config: {
          ...makeConfig(),
          auth: { cooldowns: { rateLimitedProfileRotations: 999 } },
        },
      });
      expect(result.provider).toBe("openai");
      expect(result.model).toBe("mock-1");
      expect(result.attempts).toEqual([]);
      expectProviderAttemptCounts({ openai: 1, groq: 0 });
    });
  });

  it("respects rateLimitedProfileRotations=0 and falls back immediately", async () => {
    await withAgentWorkspace(async ({ agentDir, workspaceDir }) => {
      await writeMultiProfileAuthStore(agentDir);
      mockPrimaryErrorThenFallbackSuccess(RATE_LIMIT_ERROR_MESSAGE);
      const result = await runEmbeddedFallback({
        agentDir,
        workspaceDir,
        sessionKey: "agent:test:rate-limit-no-rotation",
        runId: "run:rate-limit-no-rotation",
        config: {
          ...makeConfig(),
          auth: { cooldowns: { rateLimitedProfileRotations: 0 } },
        },
      });
      expect(result.provider).toBe("groq");
      expectProviderAttemptCounts({ openai: 1, groq: 1 });
    });
  });

  it("caps prompt-side rate-limit profile rotations before cross-provider fallback", async () => {
    await withAgentWorkspace(async ({ agentDir, workspaceDir }) => {
      await writeMultiProfileAuthStore(agentDir);
      mockPrimaryPromptErrorThenFallbackSuccess(RATE_LIMIT_ERROR_MESSAGE);
      const result = await runEmbeddedFallback({
        agentDir,
        workspaceDir,
        sessionKey: "agent:test:prompt-rate-limit-multi-profile-cap",
        runId: "run:prompt-rate-limit-multi-profile-cap",
      });
      expect(result.provider).toBe("groq");
      expect(result.model).toBe("mock-2");
      expectProviderAttemptCounts({ openai: 2, groq: 1 });
    });
  });

  it("respects prompt-side rateLimitedProfileRotations=0 and falls back immediately", async () => {
    await withAgentWorkspace(async ({ agentDir, workspaceDir }) => {
      await writeMultiProfileAuthStore(agentDir);
      mockPrimaryPromptErrorThenFallbackSuccess(RATE_LIMIT_ERROR_MESSAGE);
      const result = await runEmbeddedFallback({
        agentDir,
        workspaceDir,
        sessionKey: "agent:test:prompt-rate-limit-no-rotation",
        runId: "run:prompt-rate-limit-no-rotation",
        config: {
          ...makeConfig(),
          auth: { cooldowns: { rateLimitedProfileRotations: 0 } },
        },
      });
      expect(result.provider).toBe("groq");
      expectProviderAttemptCounts({ openai: 1, groq: 1 });
    });
  });
});
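
The rotation-cap tests in this file assert attempt counts (for example `{ openai: 2, groq: 1 }` with three openai profiles) without spelling out the rule behind them. The block below is a minimal sketch of that arithmetic, assuming every same-provider attempt fails with the same transient reason. `sameProviderAttempts` is a hypothetical helper introduced here for illustration only; it is not part of OpenClaw and does not reflect how the runner is actually implemented.

```typescript
// Hypothetical model of the attempt counts asserted by the rotation-cap tests.
// Assumption: every attempt on the provider fails with the same transient
// reason (overloaded or rate limit), so the only limit is the rotation cap.
function sameProviderAttempts(profileCount: number, rotationCap: number): number {
  if (profileCount <= 0) {
    return 0; // no auth profiles, nothing to try on this provider
  }
  // Treat negative or fractional caps as a non-negative integer.
  const cap = Math.max(0, Math.floor(rotationCap));
  // One initial attempt plus at most `cap` rotations, bounded by the
  // number of profiles that actually exist.
  return Math.min(profileCount, 1 + cap);
}

// Three openai profiles with a cap of 1: p1, then one rotation to p2.
const capped = sameProviderAttempts(3, 1);
// Three openai profiles with a cap of 0: p1 only, then cross-provider fallback.
const noRotation = sameProviderAttempts(3, 0);
// A single profile cannot rotate regardless of the cap.
const singleProfile = sameProviderAttempts(1, 1);

console.log({ capped, noRotation, singleProfile });
```

Under these assumptions, `capped` corresponds to the `openai: 2` expectation and `noRotation` to the `openai: 1` expectation in the `...ProfileRotations: 0` tests; the groq attempt is the separate cross-provider step.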