fix: escalate to model fallback after rate-limit profile rotation cap (#58707)

* fix: escalate to model fallback after rate-limit profile rotation cap

Per-model rate limits (e.g. Anthropic Sonnet-only quotas) are not
relieved by rotating auth profiles — if all profiles share the same
model quota, cycling between them loops forever without falling back
to the next model in the configured fallbacks chain.

Apply the same rotation-cap pattern introduced for overloaded_error
(#58348) to rate_limit errors:

- Add `rateLimitedProfileRotations` to auth.cooldowns config (default: 1)
- After N profile rotations on a rate_limit error, throw FailoverError
  to trigger cross-provider model fallback
- Add `resolveRateLimitProfileRotationLimit` helper following the same
  pattern as `resolveOverloadProfileRotationLimit`

Fixes #58572

* fix: cap prompt-side rate-limit failover (#58707) (thanks @Forgely3D)

* fix: restore latest-main gates for #58707

---------

Co-authored-by: Ember (Forgely3D) <ember@forgely.co>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
This commit is contained in:
Forgely3D
2026-04-01 02:54:10 -06:00
committed by GitHub
parent 8fce663861
commit 4fa11632b4
22 changed files with 357 additions and 45 deletions

View File

@@ -7907,6 +7907,23 @@
"help": "Maximum same-provider auth-profile rotations allowed for overloaded errors before switching to model fallback (default: 1).",
"hasChildren": false
},
{
"path": "auth.cooldowns.rateLimitedProfileRotations",
"kind": "core",
"type": "integer",
"required": false,
"deprecated": false,
"sensitive": false,
"tags": [
"access",
"auth",
"performance",
"storage"
],
"label": "Rate-Limited Profile Rotations",
"help": "Maximum same-provider auth-profile rotations allowed for rate-limit errors before switching to model fallback (default: 1).",
"hasChildren": false
},
{
"path": "auth.order",
"kind": "core",

View File

@@ -1,4 +1,4 @@
{"generatedBy":"scripts/generate-config-doc-baseline.ts","recordType":"meta","totalPaths":5729}
{"generatedBy":"scripts/generate-config-doc-baseline.ts","recordType":"meta","totalPaths":5730}
{"recordType":"path","path":"acp","kind":"core","type":"object","required":false,"deprecated":false,"sensitive":false,"tags":["advanced"],"label":"ACP","help":"ACP runtime controls for enabling dispatch, selecting backends, constraining allowed agent targets, and tuning streamed turn projection behavior.","hasChildren":true}
{"recordType":"path","path":"acp.allowedAgents","kind":"core","type":"array","required":false,"deprecated":false,"sensitive":false,"tags":["access"],"label":"ACP Allowed Agents","help":"Allowlist of ACP target agent ids permitted for ACP runtime sessions. Empty means no additional allowlist restriction.","hasChildren":true}
{"recordType":"path","path":"acp.allowedAgents.*","kind":"core","type":"string","required":false,"deprecated":false,"sensitive":false,"tags":[],"hasChildren":false}
@@ -701,6 +701,7 @@
{"recordType":"path","path":"auth.cooldowns.failureWindowHours","kind":"core","type":"number","required":false,"deprecated":false,"sensitive":false,"tags":["access","auth"],"label":"Failover Window (hours)","help":"Failure window (hours) for backoff counters (default: 24).","hasChildren":false}
{"recordType":"path","path":"auth.cooldowns.overloadedBackoffMs","kind":"core","type":"integer","required":false,"deprecated":false,"sensitive":false,"tags":["access","auth","reliability","storage"],"label":"Overloaded Backoff (ms)","help":"Fixed delay in milliseconds before retrying an overloaded provider/profile rotation (default: 0).","hasChildren":false}
{"recordType":"path","path":"auth.cooldowns.overloadedProfileRotations","kind":"core","type":"integer","required":false,"deprecated":false,"sensitive":false,"tags":["access","auth","storage"],"label":"Overloaded Profile Rotations","help":"Maximum same-provider auth-profile rotations allowed for overloaded errors before switching to model fallback (default: 1).","hasChildren":false}
{"recordType":"path","path":"auth.cooldowns.rateLimitedProfileRotations","kind":"core","type":"integer","required":false,"deprecated":false,"sensitive":false,"tags":["access","auth","performance","storage"],"label":"Rate-Limited Profile Rotations","help":"Maximum same-provider auth-profile rotations allowed for rate-limit errors before switching to model fallback (default: 1).","hasChildren":false}
{"recordType":"path","path":"auth.order","kind":"core","type":"object","required":false,"deprecated":false,"sensitive":false,"tags":["access","auth"],"label":"Auth Profile Order","help":"Ordered auth profile IDs per provider (used for automatic failover).","hasChildren":true}
{"recordType":"path","path":"auth.order.*","kind":"core","type":"array","required":false,"deprecated":false,"sensitive":false,"tags":[],"hasChildren":true}
{"recordType":"path","path":"auth.order.*.*","kind":"core","type":"string","required":false,"deprecated":false,"sensitive":false,"tags":[],"hasChildren":false}

View File

@@ -138,10 +138,12 @@ If all profiles for a provider fail, OpenClaw moves to the next model in
`agents.defaults.model.fallbacks`. This applies to auth failures, rate limits, and
timeouts that exhausted profile rotation (other errors do not advance fallback).
Overloaded errors are handled more aggressively than billing cooldowns. By default,
OpenClaw allows one same-provider auth-profile retry, then switches to the next
configured model fallback without waiting. Tune this with
`auth.cooldowns.overloadedProfileRotations` and `auth.cooldowns.overloadedBackoffMs`.
Overloaded and rate-limit errors are handled more aggressively than billing
cooldowns. By default, OpenClaw allows one same-provider auth-profile retry,
then switches to the next configured model fallback without waiting. Tune this
with `auth.cooldowns.overloadedProfileRotations`,
`auth.cooldowns.overloadedBackoffMs`, and
`auth.cooldowns.rateLimitedProfileRotations`.
When a run starts with a model override (hooks or CLI), fallbacks still end at
`agents.defaults.model.primary` after trying any configured fallbacks.
@@ -154,6 +156,7 @@ See [Gateway configuration](/gateway/configuration) for:
- `auth.cooldowns.billingBackoffHours` / `auth.cooldowns.billingBackoffHoursByProvider`
- `auth.cooldowns.billingMaxHours` / `auth.cooldowns.failureWindowHours`
- `auth.cooldowns.overloadedProfileRotations` / `auth.cooldowns.overloadedBackoffMs`
- `auth.cooldowns.rateLimitedProfileRotations`
- `agents.defaults.model.primary` / `agents.defaults.model.fallbacks`
- `agents.defaults.imageModel` routing

View File

@@ -3031,6 +3031,7 @@ Notes:
failureWindowHours: 24,
overloadedProfileRotations: 1,
overloadedBackoffMs: 0,
rateLimitedProfileRotations: 1,
},
},
}
@@ -3042,6 +3043,7 @@ Notes:
- `failureWindowHours`: rolling window in hours used for backoff counters (default: `24`).
- `overloadedProfileRotations`: maximum same-provider auth-profile rotations for overloaded errors before switching to model fallback (default: `1`).
- `overloadedBackoffMs`: fixed delay before retrying an overloaded provider/profile rotation (default: `0`).
- `rateLimitedProfileRotations`: maximum same-provider auth-profile rotations for rate-limit errors before switching to model fallback (default: `1`).
---