Commit Graph

81 Commits

Author SHA1 Message Date
Peter Steinberger
ffafa9008d test(agents): avoid provider runtime in fallback tests 2026-05-06 10:21:34 +01:00
Peter Steinberger
be1c99b76a test: pass env to fallback metadata snapshot 2026-05-06 05:33:38 +01:00
Peter Steinberger
82c4fd8f56 test: cache fallback metadata snapshot 2026-05-06 05:20:55 +01:00
Peter Steinberger
f35fb7288a test: mock manifest normalization in fallback tests 2026-05-06 04:58:33 +01:00
Peter Steinberger
cc3eb0b53e test: use candidate seam for fallback ordering cases 2026-05-06 01:48:48 +01:00
Peter Steinberger
d111605453 test: streamline model fallback probe coverage 2026-05-06 01:12:16 +01:00
Peter Steinberger
cb42efb6e6 test: trim slow agent fallback coverage 2026-05-06 00:53:27 +01:00
wenxu007
9df0ae6767 fix(agents,failover): propagate sessionId/lane/provider attribution through FailoverError (#73506)
* fix(agents,failover): propagate sessionId/lane/provider attribution through FailoverError

Adds optional `sessionId` and `lane` fields to `FailoverError` and threads
them — together with the existing `provider`, `model`, `profileId` — through
`describeFailoverError` and `coerceToFailoverError` context, so structured
error log ingestion can attribute exhausted-fallback wrapper errors back
to the originating request instead of dropping the per-profile metadata
when the final wrapper is built.

Fixes #42713.

* fix: preserve failover error attribution

---------

Co-authored-by: Altay <altay@uinaf.dev>
2026-05-01 11:26:56 +03:00
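
Sketch for 9df0ae6767 above: one way the attribution described in that commit message could be carried onto the final exhausted-fallback wrapper. This is only an illustration under stated assumptions. `FailoverErrorSketch`, `FailoverAttribution`, `wrapExhausted`, and `describeSketch` are hypothetical names, not the repository's `FailoverError`, `describeFailoverError`, or `coerceToFailoverError`, whose real signatures are not shown in this log.

```typescript
// Hypothetical shape: field names follow the commit message, everything else is assumed.
interface FailoverAttribution {
  provider?: string;
  model?: string;
  profileId?: string;
  sessionId?: string; // described as newly added and optional
  lane?: string; // described as newly added and optional
}

class FailoverErrorSketch extends Error {
  readonly attribution: FailoverAttribution;

  constructor(message: string, attribution: FailoverAttribution = {}) {
    super(message);
    this.name = "FailoverErrorSketch";
    this.attribution = { ...attribution };
  }
}

// Build the final wrapper without dropping per-profile metadata:
// request-level fields come from `request`, per-attempt fields from the last attempt.
function wrapExhausted(
  attempts: FailoverErrorSketch[],
  request: Pick<FailoverAttribution, "sessionId" | "lane">,
): FailoverErrorSketch {
  const last = attempts[attempts.length - 1];
  return new FailoverErrorSketch(
    `all ${attempts.length} fallback candidates failed`,
    { ...last?.attribution, ...request },
  );
}

// Flatten an error into a record that a structured log pipeline could ingest.
function describeSketch(err: FailoverErrorSketch): Record<string, string> {
  const out: Record<string, string> = { message: err.message };
  for (const [key, value] of Object.entries(err.attribution)) {
    if (typeof value === "string") out[key] = value;
  }
  return out;
}
```

With this shape, a log consumer reading `describeSketch(wrapExhausted(attempts, { sessionId, lane }))` would see both the originating request and the last provider/model/profile that was tried.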
Peter Steinberger
aec5efed8d fix(agents): resolve model aliases before fallback 2026-04-28 20:39:58 +01:00
Peter Steinberger
3da4b28d1b fix(agents): avoid overload classification for live model switches 2026-04-27 12:28:33 +01:00
Vincent Koc
43a003b8a0 fix: short-circuit live model switch fallback redirects (#72375) 2026-04-26 14:45:02 -07:00
Vincent Koc
480a3f66c9 fix: shortcut live session model redirects during fallback 2026-04-26 11:14:05 -07:00
Peter Steinberger
90cd9fce85 fix(agents): handle empty Claude stop turns 2026-04-26 03:23:16 +01:00
Vincent Koc
3d554aefdf fix(logging): keep log transport internals private (#71322)
* fix(logging): share transports across module instances

* fix(logging): remove global log transport hooks

* test(agents): capture diagnostic logs after module reset
2026-04-24 23:36:57 -07:00
Peter Steinberger
a16f8dff15 test: fold tiny media fallback specs 2026-04-24 19:01:18 +01:00
EVA
c138368040 feat: add Codex harness extension seams
Co-authored-by: Eva <100yenadmin@users.noreply.github.com>
2026-04-24 09:32:27 +01:00
EVA
40be5ad581 fix: harden GPT-5 runtime paths
Co-authored-by: EVA <100yenadmin@users.noreply.github.com>
2026-04-24 08:55:52 +01:00
Peter Steinberger
3ae15cd746 test: cover codex transport fallback path 2026-04-23 17:58:19 +01:00
Peter Steinberger
5b39be3653 fix(agents): preserve raw fallback schema errors 2026-04-23 07:44:39 +01:00
Peter Steinberger
69c78fbef0 perf(test): dedupe model config fixtures 2026-04-20 12:26:47 +01:00
Peter Steinberger
53239102f8 test: speed up agent model auth tests 2026-04-18 17:42:02 +01:00
Peter Steinberger
aa73df571d perf: narrow auth test mocks 2026-04-18 16:23:00 +01:00
bwjoke
f7422e1fbc fix(failover): detect bare leading 402 assistant errors (#47579)
Merged via squash.

Prepared head SHA: ff336a0d97
Co-authored-by: bwjoke <1284814+bwjoke@users.noreply.github.com>
Co-authored-by: altaywtf <9790196+altaywtf@users.noreply.github.com>
Reviewed-by: @altaywtf
2026-04-17 22:06:55 +03:00
xiwuqi
7fbd31818b fix: classify invalid-model fallback errors (#50028)
Merged via squash.

Prepared head SHA: 04b13e09e1
Co-authored-by: xiwuqi <64734786+xiwuqi@users.noreply.github.com>
Co-authored-by: altaywtf <9790196+altaywtf@users.noreply.github.com>
Reviewed-by: @altaywtf
2026-04-14 20:32:29 +03:00
屈定
95ee120a91 fix: classify openrouter json 404 model errors
Rewrites the stale branch on top of current `main` and preserves the original issue as regression coverage for the exact OpenRouter JSON 404 payload from #51571.

No production behavior changes are introduced here; current `main` already classifies this payload as `model_not_found`, and this merge locks that in across the shared matcher, failover classifier, and fallback loop.

Co-authored-by: 屈定 <mrdear@users.noreply.github.com>
Co-authored-by: Altay <altay@uinaf.dev>
2026-04-13 19:53:55 +01:00
Vincent Koc
95517edaeb perf(agents): keep model fallback auth runtime cold 2026-04-13 16:50:30 +01:00
Vincent Koc
bfc77b0f45 perf(agents): keep fallback auth store cold without sources 2026-04-13 15:58:35 +01:00
Neerav Makwana
75deed54f3 Agents: allow cooldown probe for timeout failover reason 2026-04-10 13:52:37 +05:30
Peter Steinberger
fa8723c7e4 test: keep cli reliability and fallback coverage off plugin scans 2026-04-09 04:07:50 +01:00
Peter Steinberger
4b4825b875 test: stabilize model warning sanitizer checks 2026-04-08 11:41:07 +01:00
Shakker
11dbcdc46d refactor: narrow model fallback auth imports 2026-04-03 16:03:10 +01:00
Shakker
fc8ab82aab refactor: trim cron session startup imports 2026-04-03 16:03:10 +01:00
Han Yang
547154865b Fix: live session model switch no longer blocks failover (Resolves #58466) (#58589)
* fix: prevent infinite retry loop when live session model switch blocks failover (#58466)

* fix: remove unused resolveOllamaBaseUrlForRun import after rebase
2026-03-31 21:09:41 -04:00
kiranvk2011
84401223c7 fix: per-model cooldown scope, stepped backoff, and user-facing rate-limit message (#49834)
Merged via squash.

Prepared head SHA: 7c488c070c
Co-authored-by: kiranvk-2011 <91108465+kiranvk-2011@users.noreply.github.com>
Co-authored-by: altaywtf <9790196+altaywtf@users.noreply.github.com>
Reviewed-by: @altaywtf
2026-03-25 22:03:49 +03:00
Peter Steinberger
2a06097184 test: update codex test fixtures to gpt-5.4 2026-03-23 02:14:00 -07:00
VibhorGautam
4473242b4f fix: use unknown instead of rate_limit as default cooldown reason (#42911)
Merged via squash.

Prepared head SHA: bebf6704d7
Co-authored-by: VibhorGautam <55019395+VibhorGautam@users.noreply.github.com>
Co-authored-by: altaywtf <9790196+altaywtf@users.noreply.github.com>
Reviewed-by: @altaywtf
2026-03-11 21:34:14 +03:00
Charles Dusek
048e25c2b2 fix(agents): avoid duplicate same-provider cooldown probes in fallback runs (#41711)
Merged via squash.

Prepared head SHA: 8be8967bcb
Co-authored-by: cgdusek <38732970+cgdusek@users.noreply.github.com>
Co-authored-by: altaywtf <9790196+altaywtf@users.noreply.github.com>
Reviewed-by: @altaywtf
2026-03-10 15:26:47 +03:00
Altay
531e8362b1 Agents: add fallback error observations (#41337)
Merged via squash.

Prepared head SHA: 852469c82f
Co-authored-by: altaywtf <9790196+altaywtf@users.noreply.github.com>
Reviewed-by: @altaywtf
2026-03-10 01:12:10 +03:00
Peter Steinberger
7ab49a7fb7 test(regression): cover recent landed fix paths 2026-03-07 23:07:16 +00:00
Peter Steinberger
e83094e63f fix(agents): warn clearly on unresolved model ids (#39215, thanks @ademczuk)
Co-authored-by: ademczuk <andrew.demczuk@gmail.com>
2026-03-07 22:50:27 +00:00
Altay
6e962d8b9e fix(agents): handle overloaded failover separately (#38301)
* fix(agents): skip auth-profile failure on overload

* fix(agents): note overload auth-profile fallback fix

* fix(agents): classify overloaded failures separately

* fix(agents): back off before overload failover

* fix(agents): tighten overload probe and backoff state

* fix(agents): persist overloaded cooldown across runs

* fix(agents): tighten overloaded status handling

* test(agents): add overload regression coverage

* fix(agents): restore runner imports after rebase

* test(agents): add overload fallback integration coverage

* fix(agents): harden overloaded failover abort handling

* test(agents): tighten overload classifier coverage

* test(agents): cover all-overloaded fallback exhaustion

* fix(cron): retry overloaded fallback summaries

* fix(cron): treat HTTP 529 as overloaded retry
2026-03-07 01:42:11 +03:00
Vignesh Natarajan
d45353f95b fix(agents): honor explicit rate-limit cooldown probes in fallback runs 2026-03-05 20:03:06 -08:00
Altay
49acb07f9f fix(agents): classify insufficient_quota 400s as billing (#36783) 2026-03-06 01:17:48 +03:00
Altay
6859619e98 test(agents): add provider-backed failover regressions (#36735)
* test(agents): add provider-backed failover fixtures

* test(agents): cover more provider error docs

* test(agents): tighten provider doc fixtures
2026-03-06 00:42:59 +03:00
Peter Steinberger
1bd20dbdb6 fix(failover): treat stop reason error as timeout 2026-03-03 01:05:24 +00:00
Peter Steinberger
a2fdc3415f fix(failover): handle unhandled stop reason error 2026-03-03 01:05:24 +00:00
Charles Dusek
92199ac129 fix(agents): unblock gpt-5.3-codex API-key routing and replay (#31083)
* fix(agents): unblock gpt-5.3-codex API-key replay path

* fix(agents): scope OpenAI replay ID rewrites per turn

* test: fix nodes-tool mock typing and reformat telegram accounts
2026-03-02 03:45:12 +00:00
Ayane
5b562e96cb test: add missing ENETRESET test case 2026-03-02 02:08:27 +00:00
Ayane
76ed274aad fix(agents): trigger model failover on connection-refused and network-unreachable errors
Previously, only ETIMEDOUT / ESOCKETTIMEDOUT / ECONNRESET / ECONNABORTED
were recognised as failover-worthy network errors. Connection-level
failures such as ECONNREFUSED (server down), ENETUNREACH / EHOSTUNREACH
(network disconnected), ENETRESET, and EAI_AGAIN (DNS failure) were
treated as unknown errors and did not advance the fallback chain.

This is particularly impactful when a local fallback model (e.g. Ollama)
is configured: if the remote provider is unreachable due to a network
outage, the gateway should fall back to the local model instead of
returning an error to the user.

Add the missing error codes to resolveFailoverReasonFromError() and
corresponding e2e tests.

Closes #18868
2026-03-02 02:08:27 +00:00
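
Sketch for 76ed274aad above: a minimal illustration of treating the listed network error codes as failover-worthy so that a fallback chain advances instead of surfacing the error. The code names come from the commit message. `isFailoverWorthyNetworkError` and `runWithFallback` are hypothetical helpers, not the repository's `resolveFailoverReasonFromError()`, and the failover reason the real function assigns to these codes is not stated in this log.

```typescript
// Codes named in the commit message. The first four were already recognised;
// the last five are the ones the commit describes adding.
const FAILOVER_NETWORK_CODES: ReadonlySet<string> = new Set([
  "ETIMEDOUT",
  "ESOCKETTIMEDOUT",
  "ECONNRESET",
  "ECONNABORTED",
  "ECONNREFUSED", // server down
  "ENETUNREACH", // network disconnected
  "EHOSTUNREACH", // network disconnected
  "ENETRESET",
  "EAI_AGAIN", // DNS failure
]);

// Hypothetical helper: true when the error carries one of the codes above.
function isFailoverWorthyNetworkError(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const code = (err as { code?: unknown }).code;
  return typeof code === "string" && FAILOVER_NETWORK_CODES.has(code);
}

// Hypothetical synchronous fallback loop: try each candidate in order and
// advance only on failover-worthy errors; anything else is rethrown as-is.
function runWithFallback<T>(candidates: Array<() => T>): T {
  let lastError: unknown = new Error("no candidates configured");
  for (const run of candidates) {
    try {
      return run();
    } catch (err) {
      if (!isFailoverWorthyNetworkError(err)) throw err;
      lastError = err;
    }
  }
  throw lastError;
}
```

In the scenario the commit describes, the first candidate would be the unreachable remote provider and the second a local model such as Ollama; the loop moves on to the second candidate once the first fails with, for example, `ECONNREFUSED`.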
Ramez
acbb93be48 fix(agents): comprehensive quota fallback fixes - session overrides + surgical cooldown logic (#23816)
Merged via /review-pr -> /prepare-pr -> /merge-pr.

Prepared head SHA: e6f2b4742b
Co-authored-by: ramezgaberiel <844893+ramezgaberiel@users.noreply.github.com>
Co-authored-by: gumadeiras <5599352+gumadeiras@users.noreply.github.com>
Reviewed-by: @gumadeiras
2026-02-25 20:35:40 -05:00