mirror of
https://github.com/openclaw/openclaw.git
synced 2026-05-06 08:20:43 +00:00
docs(model-failover): rewrite with Steps for runtime flow and rotation, AccordionGroup for cooldown buckets and chain rules, Tabs for which errors advance fallback
This commit is contained in:
@@ -5,6 +5,7 @@ read_when:
|
||||
- Updating failover rules for auth profiles or models
|
||||
- Understanding how session model overrides interact with fallback retries
|
||||
title: "Model failover"
|
||||
sidebarTitle: "Model failover"
|
||||
---
|
||||
|
||||
OpenClaw handles failures in two stages:
|
||||
@@ -18,29 +19,31 @@ This doc explains the runtime rules and the data that backs them.
|
||||
|
||||
For a normal text run, OpenClaw evaluates candidates in this order:
|
||||
|
||||
1. The currently selected session model.
|
||||
2. Configured `agents.defaults.model.fallbacks` in order.
|
||||
3. The configured primary model at the end when the run started from an override.
|
||||
<Steps>
|
||||
<Step title="Resolve session state">
|
||||
Resolve the active session model and auth-profile preference.
|
||||
</Step>
|
||||
<Step title="Build candidate chain">
|
||||
Build the model candidate chain from the currently selected session model, then `agents.defaults.model.fallbacks` in order, ending with the configured primary when the run started from an override.
|
||||
</Step>
|
||||
<Step title="Try the current provider">
|
||||
Try the current provider with auth-profile rotation/cooldown rules.
|
||||
</Step>
|
||||
<Step title="Advance on failover-worthy errors">
|
||||
If that provider is exhausted with a failover-worthy error, move to the next model candidate.
|
||||
</Step>
|
||||
<Step title="Persist fallback override">
|
||||
Persist the selected fallback override before the retry starts so other session readers see the same provider/model the runner is about to use.
|
||||
</Step>
|
||||
<Step title="Roll back narrowly on failure">
|
||||
If the fallback candidate fails, roll back only the fallback-owned session override fields when they still match that failed candidate.
|
||||
</Step>
|
||||
<Step title="Throw FallbackSummaryError if exhausted">
|
||||
If every candidate fails, throw a `FallbackSummaryError` with per-attempt detail and the soonest cooldown expiry when one is known.
|
||||
</Step>
|
||||
</Steps>
|
||||
|
||||
Inside each candidate, OpenClaw tries auth-profile failover before advancing to
|
||||
the next model candidate.
|
||||
|
||||
High-level sequence:
|
||||
|
||||
1. Resolve the active session model and auth-profile preference.
|
||||
2. Build the model candidate chain.
|
||||
3. Try the current provider with auth-profile rotation/cooldown rules.
|
||||
4. If that provider is exhausted with a failover-worthy error, move to the next
|
||||
model candidate.
|
||||
5. Persist the selected fallback override before the retry starts so other
|
||||
session readers see the same provider/model the runner is about to use.
|
||||
6. If the fallback candidate fails, roll back only the fallback-owned session
|
||||
override fields when they still match that failed candidate.
|
||||
7. If every candidate fails, throw a `FallbackSummaryError` with per-attempt
|
||||
detail and the soonest cooldown expiry when one is known.
|
||||
|
||||
This is intentionally narrower than "save and restore the whole session". The
|
||||
reply runner only persists the model-selection fields it owns for fallback:
|
||||
This is intentionally narrower than "save and restore the whole session". The reply runner only persists the model-selection fields it owns for fallback:
|
||||
|
||||
- `providerOverride`
|
||||
- `modelOverride`
|
||||
@@ -48,9 +51,7 @@ reply runner only persists the model-selection fields it owns for fallback:
|
||||
- `authProfileOverrideSource`
|
||||
- `authProfileOverrideCompactionCount`
|
||||
|
||||
That prevents a failed fallback retry from overwriting newer unrelated session
|
||||
mutations such as manual `/model` changes or session rotation updates that
|
||||
happened while the attempt was running.
|
||||
That prevents a failed fallback retry from overwriting newer unrelated session mutations such as manual `/model` changes or session rotation updates that happened while the attempt was running.
|
||||
|
||||
## Auth storage (keys + OAuth)
|
||||
|
||||
@@ -61,7 +62,7 @@ OpenClaw uses **auth profiles** for both API keys and OAuth tokens.
|
||||
- Config `auth.profiles` / `auth.order` are **metadata + routing only** (no secrets).
|
||||
- Legacy import-only OAuth file: `~/.openclaw/credentials/oauth.json` (imported into `auth-profiles.json` on first use).
|
||||
|
||||
More detail: [/concepts/oauth](/concepts/oauth)
|
||||
More detail: [OAuth](/concepts/oauth)
|
||||
|
||||
Credential types:
|
||||
|
||||
@@ -81,9 +82,17 @@ Profiles live in `~/.openclaw/agents/<agentId>/agent/auth-profiles.json` under `
|
||||
|
||||
When a provider has multiple profiles, OpenClaw chooses an order like this:
|
||||
|
||||
1. **Explicit config**: `auth.order[provider]` (if set).
|
||||
2. **Configured profiles**: `auth.profiles` filtered by provider.
|
||||
3. **Stored profiles**: entries in `auth-profiles.json` for the provider.
|
||||
<Steps>
|
||||
<Step title="Explicit config">
|
||||
`auth.order[provider]` (if set).
|
||||
</Step>
|
||||
<Step title="Configured profiles">
|
||||
`auth.profiles` filtered by provider.
|
||||
</Step>
|
||||
<Step title="Stored profiles">
|
||||
Entries in `auth-profiles.json` for the provider.
|
||||
</Step>
|
||||
</Steps>
|
||||
|
||||
If no explicit order is configured, OpenClaw uses a round‑robin order:
|
||||
|
||||
@@ -93,20 +102,17 @@ If no explicit order is configured, OpenClaw uses a round‑robin order:
|
||||
|
||||
### Session stickiness (cache-friendly)
|
||||
|
||||
OpenClaw **pins the chosen auth profile per session** to keep provider caches warm.
|
||||
It does **not** rotate on every request. The pinned profile is reused until:
|
||||
OpenClaw **pins the chosen auth profile per session** to keep provider caches warm. It does **not** rotate on every request. The pinned profile is reused until:
|
||||
|
||||
- the session is reset (`/new` / `/reset`)
|
||||
- a compaction completes (compaction count increments)
|
||||
- the profile is in cooldown/disabled
|
||||
|
||||
Manual selection via `/model …@<profileId>` sets a **user override** for that session
|
||||
and is not auto‑rotated until a new session starts.
|
||||
Manual selection via `/model …@<profileId>` sets a **user override** for that session and is not auto-rotated until a new session starts.
|
||||
|
||||
Auto‑pinned profiles (selected by the session router) are treated as a **preference**:
|
||||
they are tried first, but OpenClaw may rotate to another profile on rate limits/timeouts.
|
||||
User‑pinned profiles stay locked to that profile; if it fails and model fallbacks
|
||||
are configured, OpenClaw moves to the next model instead of switching profiles.
|
||||
<Note>
|
||||
Auto-pinned profiles (selected by the session router) are treated as a **preference**: they are tried first, but OpenClaw may rotate to another profile on rate limits/timeouts. User-pinned profiles stay locked to that profile; if it fails and model fallbacks are configured, OpenClaw moves to the next model instead of switching profiles.
|
||||
</Note>
|
||||
|
||||
### Why OAuth can "look lost"
|
||||
|
||||
@@ -117,45 +123,31 @@ If you have both an OAuth profile and an API key profile for the same provider,
|
||||
|
||||
## Cooldowns
|
||||
|
||||
When a profile fails due to auth/rate‑limit errors (or a timeout that looks
|
||||
like rate limiting), OpenClaw marks it in cooldown and moves to the next profile.
|
||||
That rate-limit bucket is broader than plain `429`: it also includes provider
|
||||
messages such as `Too many concurrent requests`, `ThrottlingException`,
|
||||
`concurrency limit reached`, `workers_ai ... quota limit exceeded`,
|
||||
`throttled`, `resource exhausted`, and periodic usage-window limits such as
|
||||
`weekly/monthly limit reached`.
|
||||
Format/invalid‑request errors (for example Cloud Code Assist tool call ID
|
||||
validation failures) are treated as failover‑worthy and use the same cooldowns.
|
||||
OpenAI-compatible stop-reason errors such as `Unhandled stop reason: error`,
|
||||
`stop reason: error`, and `reason: error` are classified as timeout/failover
|
||||
signals.
|
||||
Generic server text can also land in that timeout bucket when the source matches
|
||||
a known transient pattern. For example, the bare pi-ai stream-wrapper message
|
||||
`An unknown error occurred` is treated as failover-worthy for every provider
|
||||
because pi-ai emits it when provider streams end with `stopReason: "aborted"` or
|
||||
`stopReason: "error"` without specific details. JSON `api_error` payloads with
|
||||
transient server text such as `internal server error`, `unknown error, 520`,
|
||||
`upstream error`, or `backend error` are also treated as failover-worthy
|
||||
timeouts.
|
||||
OpenRouter-specific generic upstream text such as bare `Provider returned error`
|
||||
is treated as timeout only when the provider context is actually OpenRouter.
|
||||
Generic internal fallback text such as `LLM request failed with an unknown
|
||||
error.` stays conservative and does not trigger failover by itself.
|
||||
When a profile fails due to auth/rate-limit errors (or a timeout that looks like rate limiting), OpenClaw marks it in cooldown and moves to the next profile.
|
||||
|
||||
Some provider SDKs may otherwise sleep for a long `Retry-After` window before
|
||||
returning control to OpenClaw. For Stainless-based SDKs such as Anthropic and
|
||||
OpenAI, OpenClaw caps SDK-internal `retry-after-ms` / `retry-after` waits at 60
|
||||
seconds by default and surfaces longer retryable responses immediately so this
|
||||
failover path can run. Tune or disable the cap with
|
||||
`OPENCLAW_SDK_RETRY_MAX_WAIT_SECONDS`; see [/concepts/retry](/concepts/retry).
|
||||
<AccordionGroup>
|
||||
<Accordion title="What lands in the rate-limit / timeout bucket">
|
||||
That rate-limit bucket is broader than plain `429`: it also includes provider messages such as `Too many concurrent requests`, `ThrottlingException`, `concurrency limit reached`, `workers_ai ... quota limit exceeded`, `throttled`, `resource exhausted`, and periodic usage-window limits such as `weekly/monthly limit reached`.
|
||||
|
||||
Rate-limit cooldowns can also be model-scoped:
|
||||
Format/invalid-request errors (for example Cloud Code Assist tool call ID validation failures) are treated as failover-worthy and use the same cooldowns. OpenAI-compatible stop-reason errors such as `Unhandled stop reason: error`, `stop reason: error`, and `reason: error` are classified as timeout/failover signals.
|
||||
|
||||
- OpenClaw records `cooldownModel` for rate-limit failures when the failing
|
||||
model id is known.
|
||||
- A sibling model on the same provider can still be tried when the cooldown is
|
||||
scoped to a different model.
|
||||
- Billing/disabled windows still block the whole profile across models.
|
||||
Generic server text can also land in that timeout bucket when the source matches a known transient pattern. For example, the bare pi-ai stream-wrapper message `An unknown error occurred` is treated as failover-worthy for every provider because pi-ai emits it when provider streams end with `stopReason: "aborted"` or `stopReason: "error"` without specific details. JSON `api_error` payloads with transient server text such as `internal server error`, `unknown error, 520`, `upstream error`, or `backend error` are also treated as failover-worthy timeouts.
|
||||
|
||||
OpenRouter-specific generic upstream text such as bare `Provider returned error` is treated as timeout only when the provider context is actually OpenRouter. Generic internal fallback text such as `LLM request failed with an unknown error.` stays conservative and does not trigger failover by itself.
|
||||
|
||||
</Accordion>
|
||||
<Accordion title="SDK retry-after caps">
|
||||
Some provider SDKs may otherwise sleep for a long `Retry-After` window before returning control to OpenClaw. For Stainless-based SDKs such as Anthropic and OpenAI, OpenClaw caps SDK-internal `retry-after-ms` / `retry-after` waits at 60 seconds by default and surfaces longer retryable responses immediately so this failover path can run. Tune or disable the cap with `OPENCLAW_SDK_RETRY_MAX_WAIT_SECONDS`; see [Retry behavior](/concepts/retry).
|
||||
</Accordion>
|
||||
<Accordion title="Model-scoped cooldowns">
|
||||
Rate-limit cooldowns can also be model-scoped:
|
||||
|
||||
- OpenClaw records `cooldownModel` for rate-limit failures when the failing model id is known.
|
||||
- A sibling model on the same provider can still be tried when the cooldown is scoped to a different model.
|
||||
- Billing/disabled windows still block the whole profile across models.
|
||||
|
||||
</Accordion>
|
||||
</AccordionGroup>
|
||||
|
||||
Cooldowns use exponential backoff:
|
||||
|
||||
@@ -180,18 +172,13 @@ State is stored in `auth-state.json` under `usageStats`:
|
||||
|
||||
## Billing disables
|
||||
|
||||
Billing/credit failures (for example “insufficient credits” / “credit balance too low”) are treated as failover‑worthy, but they’re usually not transient. Instead of a short cooldown, OpenClaw marks the profile as **disabled** (with a longer backoff) and rotates to the next profile/provider.
|
||||
Billing/credit failures (for example "insufficient credits" / "credit balance too low") are treated as failover-worthy, but they're usually not transient. Instead of a short cooldown, OpenClaw marks the profile as **disabled** (with a longer backoff) and rotates to the next profile/provider.
|
||||
|
||||
Not every billing-shaped response is `402`, and not every HTTP `402` lands
|
||||
here. OpenClaw keeps explicit billing text in the billing lane even when a
|
||||
provider returns `401` or `403` instead, but provider-specific matchers stay
|
||||
scoped to the provider that owns them (for example OpenRouter `403 Key limit
|
||||
exceeded`). Meanwhile temporary `402` usage-window and
|
||||
organization/workspace spend-limit errors are classified as `rate_limit` when
|
||||
the message looks retryable (for example `weekly usage limit exhausted`, `daily
|
||||
limit reached, resets tomorrow`, or `organization spending limit exceeded`).
|
||||
Those stay on the short cooldown/failover path instead of the long
|
||||
billing-disable path.
|
||||
<Note>
|
||||
Not every billing-shaped response is `402`, and not every HTTP `402` lands here. OpenClaw keeps explicit billing text in the billing lane even when a provider returns `401` or `403` instead, but provider-specific matchers stay scoped to the provider that owns them (for example OpenRouter `403 Key limit exceeded`).
|
||||
|
||||
Meanwhile temporary `402` usage-window and organization/workspace spend-limit errors are classified as `rate_limit` when the message looks retryable (for example `weekly usage limit exhausted`, `daily limit reached, resets tomorrow`, or `organization spending limit exceeded`). Those stay on the short cooldown/failover path instead of the long billing-disable path.
|
||||
</Note>
|
||||
|
||||
State is stored in `auth-state.json`:
|
||||
|
||||
@@ -209,139 +196,114 @@ State is stored in `auth-state.json`:
|
||||
Defaults:
|
||||
|
||||
- Billing backoff starts at **5 hours**, doubles per billing failure, and caps at **24 hours**.
|
||||
- Backoff counters reset if the profile hasn’t failed for **24 hours** (configurable).
|
||||
- Backoff counters reset if the profile hasn't failed for **24 hours** (configurable).
|
||||
- Overloaded retries allow **1 same-provider profile rotation** before model fallback.
|
||||
- Overloaded retries use **0 ms backoff** by default.
|
||||
|
||||
## Model fallback
|
||||
|
||||
If all profiles for a provider fail, OpenClaw moves to the next model in
|
||||
`agents.defaults.model.fallbacks`. This applies to auth failures, rate limits, and
|
||||
timeouts that exhausted profile rotation (other errors do not advance fallback).
|
||||
If all profiles for a provider fail, OpenClaw moves to the next model in `agents.defaults.model.fallbacks`. This applies to auth failures, rate limits, and timeouts that exhausted profile rotation (other errors do not advance fallback).
|
||||
|
||||
Overloaded and rate-limit errors are handled more aggressively than billing
|
||||
cooldowns. By default, OpenClaw allows one same-provider auth-profile retry,
|
||||
then switches to the next configured model fallback without waiting.
|
||||
Provider-busy signals such as `ModelNotReadyException` land in that overloaded
|
||||
bucket. Tune this with `auth.cooldowns.overloadedProfileRotations`,
|
||||
`auth.cooldowns.overloadedBackoffMs`, and
|
||||
`auth.cooldowns.rateLimitedProfileRotations`.
|
||||
Overloaded and rate-limit errors are handled more aggressively than billing cooldowns. By default, OpenClaw allows one same-provider auth-profile retry, then switches to the next configured model fallback without waiting. Provider-busy signals such as `ModelNotReadyException` land in that overloaded bucket. Tune this with `auth.cooldowns.overloadedProfileRotations`, `auth.cooldowns.overloadedBackoffMs`, and `auth.cooldowns.rateLimitedProfileRotations`.
|
||||
|
||||
When a run starts with a model override (hooks or CLI), fallbacks still end at
|
||||
`agents.defaults.model.primary` after trying any configured fallbacks.
|
||||
When a run starts with a model override (hooks or CLI), fallbacks still end at `agents.defaults.model.primary` after trying any configured fallbacks.
|
||||
|
||||
### Candidate chain rules
|
||||
|
||||
OpenClaw builds the candidate list from the currently requested `provider/model`
|
||||
plus configured fallbacks.
|
||||
OpenClaw builds the candidate list from the currently requested `provider/model` plus configured fallbacks.
|
||||
|
||||
Rules:
|
||||
|
||||
- The requested model is always first.
|
||||
- Explicit configured fallbacks are deduplicated but not filtered by the model
|
||||
allowlist. They are treated as explicit operator intent.
|
||||
- If the current run is already on a configured fallback in the same provider
|
||||
family, OpenClaw keeps using the full configured chain.
|
||||
- If the current run is on a different provider than config and that current
|
||||
model is not already part of the configured fallback chain, OpenClaw does not
|
||||
append unrelated configured fallbacks from another provider.
|
||||
- When the run started from an override, the configured primary is appended at
|
||||
the end so the chain can settle back onto the normal default once earlier
|
||||
candidates are exhausted.
|
||||
<AccordionGroup>
|
||||
<Accordion title="Rules">
|
||||
- The requested model is always first.
|
||||
- Explicit configured fallbacks are deduplicated but not filtered by the model allowlist. They are treated as explicit operator intent.
|
||||
- If the current run is already on a configured fallback in the same provider family, OpenClaw keeps using the full configured chain.
|
||||
- If the current run is on a different provider than config and that current model is not already part of the configured fallback chain, OpenClaw does not append unrelated configured fallbacks from another provider.
|
||||
- When the run started from an override, the configured primary is appended at the end so the chain can settle back onto the normal default once earlier candidates are exhausted.
|
||||
</Accordion>
|
||||
</AccordionGroup>
|
||||
|
||||
### Which errors advance fallback
|
||||
|
||||
Model fallback continues on:
|
||||
|
||||
- auth failures
|
||||
- rate limits and cooldown exhaustion
|
||||
- overloaded/provider-busy errors
|
||||
- timeout-shaped failover errors
|
||||
- billing disables
|
||||
- `LiveSessionModelSwitchError`, which is normalized into a failover path so a
|
||||
stale persisted model does not create an outer retry loop
|
||||
- other unrecognized errors when there are still remaining candidates
|
||||
|
||||
Model fallback does not continue on:
|
||||
|
||||
- explicit aborts that are not timeout/failover-shaped
|
||||
- context overflow errors that should stay inside compaction/retry logic
|
||||
(for example `request_too_large`, `INVALID_ARGUMENT: input exceeds the maximum
|
||||
number of tokens`, `input token count exceeds the maximum number of input
|
||||
tokens`, `The input is too long for the model`, or `ollama error: context
|
||||
length exceeded`)
|
||||
- a final unknown error when there are no candidates left
|
||||
<Tabs>
|
||||
<Tab title="Continues on">
|
||||
- auth failures
|
||||
- rate limits and cooldown exhaustion
|
||||
- overloaded/provider-busy errors
|
||||
- timeout-shaped failover errors
|
||||
- billing disables
|
||||
- `LiveSessionModelSwitchError`, which is normalized into a failover path so a stale persisted model does not create an outer retry loop
|
||||
- other unrecognized errors when there are still remaining candidates
|
||||
</Tab>
|
||||
<Tab title="Does not continue on">
|
||||
- explicit aborts that are not timeout/failover-shaped
|
||||
- context overflow errors that should stay inside compaction/retry logic (for example `request_too_large`, `INVALID_ARGUMENT: input exceeds the maximum number of tokens`, `input token count exceeds the maximum number of input tokens`, `The input is too long for the model`, or `ollama error: context length exceeded`)
|
||||
- a final unknown error when there are no candidates left
|
||||
</Tab>
|
||||
</Tabs>
|
||||
|
||||
### Cooldown skip vs probe behavior
|
||||
|
||||
When every auth profile for a provider is already in cooldown, OpenClaw does
|
||||
not automatically skip that provider forever. It makes a per-candidate decision:
|
||||
When every auth profile for a provider is already in cooldown, OpenClaw does not automatically skip that provider forever. It makes a per-candidate decision:
|
||||
|
||||
- Persistent auth failures skip the whole provider immediately.
|
||||
- Billing disables usually skip, but the primary candidate can still be probed
|
||||
on a throttle so recovery is possible without restarting.
|
||||
- The primary candidate may be probed near cooldown expiry, with a per-provider
|
||||
throttle.
|
||||
- Same-provider fallback siblings can be attempted despite cooldown when the
|
||||
failure looks transient (`rate_limit`, `overloaded`, or unknown). This is
|
||||
especially relevant when a rate limit is model-scoped and a sibling model may
|
||||
still recover immediately.
|
||||
- Transient cooldown probes are limited to one per provider per fallback run so
|
||||
a single provider does not stall cross-provider fallback.
|
||||
<AccordionGroup>
|
||||
<Accordion title="Per-candidate decisions">
|
||||
- Persistent auth failures skip the whole provider immediately.
|
||||
- Billing disables usually skip, but the primary candidate can still be probed on a throttle so recovery is possible without restarting.
|
||||
- The primary candidate may be probed near cooldown expiry, with a per-provider throttle.
|
||||
- Same-provider fallback siblings can be attempted despite cooldown when the failure looks transient (`rate_limit`, `overloaded`, or unknown). This is especially relevant when a rate limit is model-scoped and a sibling model may still recover immediately.
|
||||
- Transient cooldown probes are limited to one per provider per fallback run so a single provider does not stall cross-provider fallback.
|
||||
</Accordion>
|
||||
</AccordionGroup>
|
||||
|
||||
## Session overrides and live model switching
|
||||
|
||||
Session model changes are shared state. The active runner, `/model` command,
|
||||
compaction/session updates, and live-session reconciliation all read or write
|
||||
parts of the same session entry.
|
||||
Session model changes are shared state. The active runner, `/model` command, compaction/session updates, and live-session reconciliation all read or write parts of the same session entry.
|
||||
|
||||
That means fallback retries have to coordinate with live model switching:
|
||||
|
||||
- Only explicit user-driven model changes mark a pending live switch. That
|
||||
includes `/model`, `session_status(model=...)`, and `sessions.patch`.
|
||||
- System-driven model changes such as fallback rotation, heartbeat overrides,
|
||||
or compaction never mark a pending live switch on their own.
|
||||
- Before a fallback retry starts, the reply runner persists the selected
|
||||
fallback override fields to the session entry.
|
||||
- Live-session reconciliation prefers persisted session overrides over stale
|
||||
runtime model fields.
|
||||
- If the fallback attempt fails, the runner rolls back only the override fields
|
||||
it wrote, and only if they still match that failed candidate.
|
||||
- Only explicit user-driven model changes mark a pending live switch. That includes `/model`, `session_status(model=...)`, and `sessions.patch`.
|
||||
- System-driven model changes such as fallback rotation, heartbeat overrides, or compaction never mark a pending live switch on their own.
|
||||
- Before a fallback retry starts, the reply runner persists the selected fallback override fields to the session entry.
|
||||
- Live-session reconciliation prefers persisted session overrides over stale runtime model fields.
|
||||
- If the fallback attempt fails, the runner rolls back only the override fields it wrote, and only if they still match that failed candidate.
|
||||
|
||||
This prevents the classic race:
|
||||
|
||||
1. Primary fails.
|
||||
2. Fallback candidate is chosen in memory.
|
||||
3. Session store still says the old primary.
|
||||
4. Live-session reconciliation reads the stale session state.
|
||||
5. The retry gets snapped back to the old model before the fallback attempt
|
||||
starts.
|
||||
<Steps>
|
||||
<Step title="Primary fails">
|
||||
The selected primary model fails.
|
||||
</Step>
|
||||
<Step title="Fallback chosen in memory">
|
||||
Fallback candidate is chosen in memory.
|
||||
</Step>
|
||||
<Step title="Session store still says old primary">
|
||||
Session store still reflects the old primary.
|
||||
</Step>
|
||||
<Step title="Live reconciliation reads stale state">
|
||||
Live-session reconciliation reads the stale session state.
|
||||
</Step>
|
||||
<Step title="Retry snapped back">
|
||||
The retry gets snapped back to the old model before the fallback attempt starts.
|
||||
</Step>
|
||||
</Steps>
|
||||
|
||||
The persisted fallback override closes that window, and the narrow rollback
|
||||
keeps newer manual or runtime session changes intact.
|
||||
The persisted fallback override closes that window, and the narrow rollback keeps newer manual or runtime session changes intact.
|
||||
|
||||
## Observability and failure summaries
|
||||
|
||||
`runWithModelFallback(...)` records per-attempt details that feed logs and
|
||||
user-facing cooldown messaging:
|
||||
`runWithModelFallback(...)` records per-attempt details that feed logs and user-facing cooldown messaging:
|
||||
|
||||
- provider/model attempted
|
||||
- reason (`rate_limit`, `overloaded`, `billing`, `auth`, `model_not_found`, and
|
||||
similar failover reasons)
|
||||
- reason (`rate_limit`, `overloaded`, `billing`, `auth`, `model_not_found`, and similar failover reasons)
|
||||
- optional status/code
|
||||
- human-readable error summary
|
||||
|
||||
When every candidate fails, OpenClaw throws `FallbackSummaryError`. The outer
|
||||
reply runner can use that to build a more specific message such as "all models
|
||||
are temporarily rate-limited" and include the soonest cooldown expiry when one
|
||||
is known.
|
||||
When every candidate fails, OpenClaw throws `FallbackSummaryError`. The outer reply runner can use that to build a more specific message such as "all models are temporarily rate-limited" and include the soonest cooldown expiry when one is known.
|
||||
|
||||
That cooldown summary is model-aware:
|
||||
|
||||
- unrelated model-scoped rate limits are ignored for the attempted
|
||||
provider/model chain
|
||||
- if the remaining block is a matching model-scoped rate limit, OpenClaw
|
||||
reports the last matching expiry that still blocks that model
|
||||
- unrelated model-scoped rate limits are ignored for the attempted provider/model chain
|
||||
- if the remaining block is a matching model-scoped rate limit, OpenClaw reports the last matching expiry that still blocks that model
|
||||
|
||||
## Related config
|
||||
|
||||
|
||||
Reference in New Issue
Block a user