docs: expand model fallback guide

This commit is contained in:
Peter Steinberger
2026-04-04 09:13:33 +09:00
parent 5bea93fd63
commit d01cb5ecc6
2 changed files with 137 additions and 0 deletions

View File

@@ -3,6 +3,7 @@ summary: "How OpenClaw rotates auth profiles and falls back across models"
read_when:
- Diagnosing auth profile rotation, cooldowns, or model fallback behavior
- Updating failover rules for auth profiles or models
- Understanding how session model overrides interact with fallback retries
title: "Model Failover"
---
@@ -15,6 +16,44 @@ OpenClaw handles failures in two stages:
This doc explains the runtime rules and the data that backs them.
## Runtime flow
For a normal text run, OpenClaw evaluates candidates in this order:
1. The currently selected session model.
2. Configured `agents.defaults.model.fallbacks` in order.
3. The configured primary model at the end when the run started from an override.
Inside each candidate, OpenClaw tries auth-profile failover before advancing to
the next model candidate.
High-level sequence:
1. Resolve the active session model and auth-profile preference.
2. Build the model candidate chain.
3. Try the current provider with auth-profile rotation/cooldown rules.
4. If that provider is exhausted with a failover-worthy error, move to the next
model candidate.
5. Persist the selected fallback override before the retry starts so other
session readers see the same provider/model the runner is about to use.
6. If the fallback candidate fails, roll back only the fallback-owned session
override fields when they still match that failed candidate.
7. If every candidate fails, throw a `FallbackSummaryError` with per-attempt
detail and the soonest cooldown expiry when one is known.
This is intentionally narrower than "save and restore the whole session". The
reply runner only persists the model-selection fields it owns for fallback:
- `providerOverride`
- `modelOverride`
- `authProfileOverride`
- `authProfileOverrideSource`
- `authProfileOverrideCompactionCount`
That prevents a failed fallback retry from overwriting newer unrelated session
mutations such as manual `/model` changes or session rotation updates that
happened while the attempt was running.
## Auth storage (keys + OAuth)
OpenClaw uses **auth profiles** for both API keys and OAuth tokens.
@@ -148,6 +187,102 @@ with `auth.cooldowns.overloadedProfileRotations`,
When a run starts with a model override (hooks or CLI), fallbacks still end at
`agents.defaults.model.primary` after trying any configured fallbacks.
### Candidate chain rules
OpenClaw builds the candidate list from the currently requested `provider/model`
plus configured fallbacks.
Rules:
- The requested model is always first.
- Explicit configured fallbacks are deduplicated but not filtered by the model
allowlist. They are treated as explicit operator intent.
- If the current run is already on a configured fallback in the same provider
family, OpenClaw keeps using the full configured chain.
- If the current run is on a different provider than config and that current
model is not already part of the configured fallback chain, OpenClaw does not
append unrelated configured fallbacks from another provider.
- When the run started from an override, the configured primary is appended at
the end so the chain can settle back onto the normal default once earlier
candidates are exhausted.
### Which errors advance fallback
Model fallback continues on:
- auth failures
- rate limits and cooldown exhaustion
- overloaded/provider-busy errors
- timeout-shaped failover errors
- billing disables
- `LiveSessionModelSwitchError`, which is normalized into a failover path so a
stale persisted model does not create an outer retry loop
- other unrecognized errors when there are still remaining candidates
Model fallback does not continue on:
- explicit aborts that are not timeout/failover-shaped
- context overflow errors that should stay inside compaction/retry logic
- a final unknown error when there are no candidates left
### Cooldown skip vs probe behavior
When every auth profile for a provider is already in cooldown, OpenClaw does
not automatically skip that provider forever. It makes a per-candidate decision:
- Persistent auth failures skip the whole provider immediately.
- Billing disables usually skip, but the primary candidate can still be probed
on a throttle so recovery is possible without restarting.
- The primary candidate may be probed near cooldown expiry, with a per-provider
throttle.
- Same-provider fallback siblings can be attempted despite cooldown when the
failure looks transient (`rate_limit`, `overloaded`, or unknown).
- Transient cooldown probes are limited to one per provider per fallback run so
a single provider does not stall cross-provider fallback.
## Session overrides and live model switching
Session model changes are shared state. The active runner, `/model` command,
compaction/session updates, and live-session reconciliation all read or write
parts of the same session entry.
That means fallback retries have to coordinate with live model switching:
- Before a fallback retry starts, the reply runner persists the selected
fallback override fields to the session entry.
- Live-session reconciliation prefers persisted session overrides over stale
runtime model fields.
- If the fallback attempt fails, the runner rolls back only the override fields
it wrote, and only if they still match that failed candidate.
This prevents the classic race:
1. Primary fails.
2. Fallback candidate is chosen in memory.
3. Session store still says the old primary.
4. Live-session reconciliation reads the stale session state.
5. The retry gets snapped back to the old model before the fallback attempt
starts.
The persisted fallback override closes that window, and the narrow rollback
keeps newer manual or runtime session changes intact.
## Observability and failure summaries
`runWithModelFallback(...)` records per-attempt details that feed logs and
user-facing cooldown messaging:
- provider/model attempted
- reason (`rate_limit`, `overloaded`, `billing`, `auth`, `model_not_found`, and
similar failover reasons)
- optional status/code
- human-readable error summary
When every candidate fails, OpenClaw throws `FallbackSummaryError`. The outer
reply runner can use that to build a more specific message such as "all models
are temporarily rate-limited" and include the soonest cooldown expiry when one
is known.
## Related config
See [Gateway configuration](/gateway/configuration) for:

View File

@@ -16,6 +16,8 @@ For model selection rules, see [/concepts/models](/concepts/models).
- Model refs use `provider/model` (example: `opencode/claude-opus-4-6`).
- If you set `agents.defaults.models`, it becomes the allowlist.
- CLI helpers: `openclaw onboard`, `openclaw models list`, `openclaw models set <provider/model>`.
- Fallback runtime rules, cooldown probes, and session-override persistence are
documented in [/concepts/model-failover](/concepts/model-failover).
- Provider plugins can inject model catalogs via `registerProvider({ catalog })`;
OpenClaw merges that output into `models.providers` before writing
`models.json`.