fix(agents): classify generic provider errors for failover (#59325)

* fix(agents): classify generic provider errors for failover

Anthropic returns a bare 'An unknown error occurred' during API instability,
and OpenRouter wraps upstream failures as 'Provider returned error'. Neither
message was recognized by the failover classifier, so the error surfaced
directly to users instead of triggering the configured fallback chain.

Add both patterns to the serverError classifier so they are classified as
transient server errors (timeout) and trigger model failover.

Closes #49706
Closes #45834

* fix(agents): scope unknown-error failover by provider

* docs(changelog): note provider-scoped unknown-error failover
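
A minimal sketch of the provider-scoped matching described above, for illustration only. The names `ProviderError`, `PROVIDER_SCOPED_PATTERNS`, and `classifyGenericProviderError` are hypothetical and do not come from this commit; the real classifier is reached through `resolveFailoverReasonFromError`. Mapping OpenRouter's 'Provider returned error' to timeout when the provider is known is an assumption, since the tests in this diff only show the unscoped case returning null.

```typescript
// Hypothetical sketch: these names are illustrative, not the project's API.
type FailoverReason = "timeout" | null;

interface ProviderError {
  provider?: string;
  message: string;
}

// A generic message is trusted only when it comes from the provider known
// to emit it. The openrouter entry is an assumption (see lead-in).
const PROVIDER_SCOPED_PATTERNS: Record<string, RegExp[]> = {
  anthropic: [/^an unknown error occurred\.?$/i],
  openrouter: [/^provider returned error\.?$/i],
};

function classifyGenericProviderError(err: ProviderError): FailoverReason {
  const provider = err.provider?.trim().toLowerCase();
  // Without a provider, the text is too generic to justify failover.
  if (!provider) return null;
  const patterns = PROVIDER_SCOPED_PATTERNS[provider] ?? [];
  const message = err.message.trim();
  return patterns.some((pattern) => pattern.test(message)) ? "timeout" : null;
}
```

Keying the patterns by provider keeps internal messages such as "LLM request failed with an unknown error." from being mistaken for a transient upstream failure.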

---------

Co-authored-by: Aaron Zhu <aaron@Aarons-MacBook-Air.local>
Co-authored-by: Altay <altay@uinaf.dev>
Aaron Zhu
2026-04-04 23:11:46 +08:00
committed by GitHub
parent 8a6da9d488
commit 983909f826
8 changed files with 103 additions and 11 deletions

@@ -196,6 +196,38 @@ describe("failover-error", () => {
    ).toBe("overloaded");
  });
  it("classifies Anthropic bare 'unknown error' as timeout for failover (#49706)", () => {
    expect(
      resolveFailoverReasonFromError({
        provider: "anthropic",
        message: "An unknown error occurred",
      }),
    ).toBe("timeout");
  });
  it("does not classify generic internal unknown-error text as failover timeout", () => {
    expect(
      resolveFailoverReasonFromError({
        message: "LLM request failed with an unknown error.",
      }),
    ).toBeNull();
    expect(
      resolveFailoverReasonFromError({
        message: "An unknown error occurred",
      }),
    ).toBeNull();
    expect(
      resolveFailoverReasonFromError({
        provider: "openrouter",
        message: "An unknown error occurred",
      }),
    ).toBeNull();
    expect(
      resolveFailoverReasonFromError({
        message: "Provider returned error",
      }),
    ).toBeNull();
  });
  it("treats 400 insufficient_quota payloads as billing instead of format", () => {
    expect(
      resolveFailoverReasonFromError({