diff --git a/docs/concepts/model-failover.md b/docs/concepts/model-failover.md
index 18cbcb5120b..3ce2cf1cc18 100644
--- a/docs/concepts/model-failover.md
+++ b/docs/concepts/model-failover.md
@@ -155,12 +155,15 @@ State is stored in `auth-profiles.json` under `usageStats`:
 
 Billing/credit failures (for example “insufficient credits” / “credit balance too low”) are treated as failover‑worthy, but they’re usually not transient. Instead of a short cooldown, OpenClaw marks the profile as **disabled** (with a longer backoff) and rotates to the next profile/provider.
 
-Not every HTTP `402` lands here. OpenClaw classifies temporary `402` usage-window
-and organization/workspace spend-limit errors as `rate_limit` when the message
-looks retryable (for example `weekly usage limit exhausted`, `daily limit
-reached, resets tomorrow`, or `organization spending limit exceeded`). Those
-stay on the short cooldown/failover path instead of the long billing-disable
-path.
+Not every billing-shaped response is `402`, and not every HTTP `402` lands
+here. OpenClaw keeps explicit billing text in the billing lane even when a
+provider returns `401` or `403` instead (for example OpenRouter `403 Key limit
+exceeded`). Meanwhile temporary `402` usage-window and
+organization/workspace spend-limit errors are classified as `rate_limit` when
+the message looks retryable (for example `weekly usage limit exhausted`, `daily
+limit reached, resets tomorrow`, or `organization spending limit exceeded`).
+Those stay on the short cooldown/failover path instead of the long
+billing-disable path.
 
 State is stored in `auth-profiles.json`:
 
diff --git a/docs/gateway/configuration-reference.md b/docs/gateway/configuration-reference.md
index 6a374a7426a..46f472750b4 100644
--- a/docs/gateway/configuration-reference.md
+++ b/docs/gateway/configuration-reference.md
@@ -3130,9 +3130,11 @@ Notes:
 ```
 
 - `billingBackoffHours`: base backoff in hours when a profile fails due to true
- billing/insufficient-credit errors (default: `5`). Retryable HTTP `402`
- usage-window or organization/workspace spend-limit messages stay in the
- `rate_limit` path instead.
+ billing/insufficient-credit errors (default: `5`). Explicit billing text can
+ still land here even on `401`/`403` responses (for example OpenRouter
+ `Key limit exceeded`). Retryable HTTP `402` usage-window or
+ organization/workspace spend-limit messages stay in the `rate_limit` path
+ instead.
 - `billingBackoffHoursByProvider`: optional per-provider overrides for billing backoff hours.
 - `billingMaxHours`: cap in hours for billing backoff exponential growth (default: `24`).
 - `authPermanentBackoffMinutes`: base backoff in minutes for high-confidence `auth_permanent` failures (default: `10`).
diff --git a/docs/help/faq.md b/docs/help/faq.md
index f21b5fa83d8..cbbba0d62d2 100644
--- a/docs/help/faq.md
+++ b/docs/help/faq.md
@@ -2419,10 +2419,14 @@ for usage/billing and raise limits as needed.
 `ThrottlingException`, `resource exhausted`, and periodic usage-window limits
 (`weekly/monthly limit reached`) as failover-worthy rate limits.
 
- Some HTTP `402` responses also stay in that transient bucket. If the
- message looks like a retryable usage-window or organization/workspace spend
- limit (`daily limit reached, resets tomorrow`, `organization spending limit
- exceeded`), OpenClaw treats it as `rate_limit`, not a long billing disable.
+ Some billing-looking responses are not `402`, and some HTTP `402`
+ responses also stay in that transient bucket. If a provider returns
+ explicit billing text on `401` or `403` (for example OpenRouter
+ `Key limit exceeded`), OpenClaw keeps that in the billing lane. If a `402`
+ message instead looks like a retryable usage-window or
+ organization/workspace spend limit (`daily limit reached, resets tomorrow`,
+ `organization spending limit exceeded`), OpenClaw treats it as
+ `rate_limit`, not a long billing disable.
 
 Context-overflow errors are different: signatures such as
 `request_too_large`, `input exceeds the maximum number of tokens`, or
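
The classification rule these doc changes describe can be sketched roughly as follows. This is a minimal illustration only, not OpenClaw's actual code: the names `classifyFailure`, `FailoverReason`, and both pattern lists are hypothetical, the patterns are taken solely from the example messages quoted in the docs, and treating an unrecognized `402` as billing is an assumption inferred from "not every HTTP `402` lands here".

```typescript
// Hypothetical sketch; none of these names come from the OpenClaw codebase.
type FailoverReason = "billing" | "rate_limit" | "unknown";

// Assumed patterns: retryable usage-window / spend-limit messages from the docs.
const RETRYABLE_LIMIT_PATTERNS: RegExp[] = [
  /weekly usage limit exhausted/i,
  /daily limit reached/i,
  /organization spending limit exceeded/i,
];

// Assumed patterns: explicit billing text from the docs.
const EXPLICIT_BILLING_PATTERNS: RegExp[] = [
  /insufficient credits/i,
  /credit balance too low/i,
  /key limit exceeded/i,
];

function classifyFailure(status: number, message: string): FailoverReason {
  const matches = (patterns: RegExp[]): boolean =>
    patterns.some((pattern) => pattern.test(message));

  // A 402 whose message looks retryable stays on the short cooldown path.
  if (status === 402 && matches(RETRYABLE_LIMIT_PATTERNS)) {
    return "rate_limit";
  }
  // Explicit billing text stays in the billing lane, even on 401 or 403.
  if ([401, 402, 403].includes(status) && matches(EXPLICIT_BILLING_PATTERNS)) {
    return "billing";
  }
  // Assumption: any other 402 is treated as a true billing failure.
  if (status === 402) {
    return "billing";
  }
  // Everything else (auth errors, overloads, ...) is out of scope for this sketch.
  return "unknown";
}
```

Checking the retryable `402` patterns first reflects the ordering the docs imply: a spend-limit message on `402` should not fall through to the long billing-disable path.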