fix: escalate to model fallback after rate-limit profile rotation cap (#58707)

* fix: escalate to model fallback after rate-limit profile rotation cap

Per-model rate limits (e.g. Anthropic Sonnet-only quotas) are not
relieved by rotating auth profiles. If all profiles share the same
model quota, cycling between them loops forever and never falls back
to the next model in the configured fallbacks chain.

Apply the same rotation-cap pattern introduced for overloaded_error
(#58348) to rate_limit errors:

- Add `rateLimitedProfileRotations` to auth.cooldowns config (default: 1)
- After N profile rotations on a rate_limit error, throw FailoverError
  to trigger cross-provider model fallback
- Add `resolveRateLimitProfileRotationLimit` helper following the same
  pattern as `resolveOverloadProfileRotationLimit`
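
The rotation cap described above could look roughly like the sketch below. Only `rateLimitedProfileRotations`, its default of 1, `resolveRateLimitProfileRotationLimit`, and `FailoverError` are named in this commit message. The config shape, the `FailoverError` fields, and the `runWithRotationCap` loop are assumptions made for illustration, not the code in this change.

```typescript
// Assumed config shape: only the key name and its default (1) come from the commit message.
interface AuthCooldownsConfig {
  rateLimitedProfileRotations?: number;
}

// Stand-in for the project's FailoverError; the real class may carry different fields.
class FailoverError extends Error {
  readonly reason: string;
  constructor(message: string, reason: string) {
    super(message);
    this.name = "FailoverError";
    this.reason = reason;
  }
}

const DEFAULT_RATE_LIMIT_PROFILE_ROTATIONS = 1;

// Returns the configured cap, or the default when the value is missing or invalid.
function resolveRateLimitProfileRotationLimit(cooldowns?: AuthCooldownsConfig): number {
  const configured = cooldowns?.rateLimitedProfileRotations;
  if (typeof configured !== "number" || !Number.isFinite(configured) || configured < 0) {
    return DEFAULT_RATE_LIMIT_PROFILE_ROTATIONS;
  }
  return Math.floor(configured);
}

// Hypothetical driver: tries profiles in order and escalates once the cap is reached.
function runWithRotationCap<T>(
  profiles: string[],
  attempt: (profile: string) => T,
  isRateLimit: (err: unknown) => boolean,
  cooldowns?: AuthCooldownsConfig,
): T {
  const limit = resolveRateLimitProfileRotationLimit(cooldowns);
  let rotations = 0;
  for (const profile of profiles) {
    try {
      return attempt(profile);
    } catch (err) {
      if (!isRateLimit(err)) throw err; // unrelated errors are not handled here
      if (rotations >= limit) {
        // Cap reached: stop cycling profiles and let the caller try the next model.
        throw new FailoverError(
          `rate limited after ${rotations} profile rotation(s)`,
          "rate_limit",
        );
      }
      rotations += 1; // rotate to the next profile
    }
  }
  throw new FailoverError("all profiles rate limited", "rate_limit");
}
```

With the default cap of 1, this sketch tries the first profile, rotates once, and throws `FailoverError` when the second profile is also rate limited, instead of continuing through the remaining profiles.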

Fixes #58572

* fix: cap prompt-side rate-limit failover (#58707) (thanks @Forgely3D)

* fix: restore latest-main gates for #58707

---------

Co-authored-by: Ember (Forgely3D) <ember@forgely.co>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
Forgely3D
2026-04-01 02:54:10 -06:00
committed by GitHub
parent 8fce663861
commit 4fa11632b4
22 changed files with 357 additions and 45 deletions


@@ -90,9 +90,12 @@ export function resolveNpmDistTagMirrorAuth(params?: {
   nodeAuthToken?: string | null;
   npmToken?: string | null;
 }): NpmDistTagMirrorAuth {
+  const nodeAuthToken =
+    params && "nodeAuthToken" in params ? params.nodeAuthToken : process.env.NODE_AUTH_TOKEN;
+  const npmToken = params && "npmToken" in params ? params.npmToken : process.env.NPM_TOKEN;
   return resolveNpmDistTagMirrorAuthBase({
-    nodeAuthToken: params?.nodeAuthToken ?? process.env.NODE_AUTH_TOKEN,
-    npmToken: params?.npmToken ?? process.env.NPM_TOKEN,
+    nodeAuthToken,
+    npmToken,
   }) as NpmDistTagMirrorAuth;
 }
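
As far as this hunk shows, the change replaces a `??` fallback with a key-presence check, so an explicitly passed `null` is no longer overridden by the environment variable. The sketch below illustrates that difference in isolation. `Params`, `withNullish`, and `withInCheck` are hypothetical names used only for this example, and the environment value is passed in as an argument to keep it self-contained.

```typescript
// Hypothetical parameter shape mirroring the optional, nullable fields in the hunk.
type Params = { token?: string | null };

// Old behaviour: `??` treats both null and undefined as "missing" and falls back.
function withNullish(
  params: Params | undefined,
  envValue: string | undefined,
): string | null | undefined {
  return params?.token ?? envValue;
}

// New behaviour: fall back only when the key is absent from the params object.
function withInCheck(
  params: Params | undefined,
  envValue: string | undefined,
): string | null | undefined {
  return params && "token" in params ? params.token : envValue;
}

// An explicit null is replaced by the env value with `??`,
// but preserved by the key-presence check.
const explicitNullOld = withNullish({ token: null }, "from-env");
const explicitNullNew = withInCheck({ token: null }, "from-env");

// When the key is absent, both variants fall back to the env value.
const absentOld = withNullish({}, "from-env");
const absentNew = withInCheck({}, "from-env");
```

One consequence of the key-presence form is that a caller can pass `null` to say "no token" even when the environment variable is set, which the `??` form cannot express.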