Add provider-specific error patterns for AWS Bedrock, Ollama, Mistral,
Cohere, DeepSeek, Together AI, and Cloudflare Workers AI. These providers
return errors in non-standard formats that the generic classifiers miss,
causing incorrect failover behavior (e.g., context overflow misclassified
as format error, ThrottlingException not recognized as rate limit).
Wire provider patterns into isContextOverflowError() and
classifyFailoverReason() as catch-all layers after generic classifiers.
* fix: escalate to model fallback after rate-limit profile rotation cap
Per-model rate limits (e.g. Anthropic Sonnet-only quotas) are not
relieved by rotating auth profiles — if all profiles share the same
model quota, cycling between them loops forever without falling back
to the next model in the configured fallbacks chain.
Apply the same rotation-cap pattern introduced for overloaded_error
(#58348) to rate_limit errors:
- Add `rateLimitedProfileRotations` to auth.cooldowns config (default: 1)
- After N profile rotations on a rate_limit error, throw FailoverError
to trigger cross-provider model fallback
- Add `resolveRateLimitProfileRotationLimit` helper following the same
pattern as `resolveOverloadProfileRotationLimit`
Fixes#58572
* fix: cap prompt-side rate-limit failover (#58707) (thanks @Forgely3D)
* fix: restore latest-main gates for #58707
---------
Co-authored-by: Ember (Forgely3D) <ember@forgely.co>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
* fix: prevent infinite retry loop when live session model switch blocks failover (#58466)
* fix: remove unused resolveOllamaBaseUrlForRun import after rebase
* fix(media): resolve relative MEDIA: paths against agent workspace dir
* fix(agents): remove stale ollama compat import
* fix(media): preserve workspace dir in outbound access
Instead of exponential backoff guesses on Codex 429, probe the WHAM
usage API to determine real availability and write accurate cooldowns.
- Burst/concurrency contention: 15s circuit-break
- Genuine rate limit: proportional to real reset time (capped 2-4h)
- Expired token: 12h cooldown
- Dead account: 24h cooldown
- Probe failure: 30s fail-open
This prevents cascade lockouts when multiple agents share a pool of
Codex profiles, and avoids wasted retries on genuinely exhausted
profiles.
Closes#26329, relates to #1815, #1522, #23996, #54060
Co-authored-by: ryanngit <ryanngit@users.noreply.github.com>
Add toolsAllow field to cron agent-turn payloads, enabling users to
restrict which tool schemas are sent to the model for a given cron job.
When --tools is set:
- Only listed tools are included in the provider request
- promptMode is forced to 'minimal' (strips skills catalog, reply tags,
heartbeat, messaging, docs, memory, model aliases, silent replies)
- Dramatically reduces input tokens for small local models (~16K to ~800)
CLI surface:
- openclaw cron add --tools exec,read,write
- openclaw cron edit <id> --tools exec
- openclaw cron edit <id> --clear-tools (remove allow-list)
Closes#58435
Co-authored-by: andyk-ms <andyk-ms@users.noreply.github.com>
Allow setting provider params (e.g. cacheRetention) once at the
agents.defaults level instead of repeating them per-model. The merge
order is now: defaults.params -> defaults.models[key].params ->
list[agentId].params.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Fix: Add ollama to NON_PI_NATIVE_MODEL_PROVIDERS to ensure correct model selection
* Add test coverage for ollama provider to ensure models are merged correctly
* Fix test case for ollama provider by adding required model properties
* Changelog: add Ollama model picker fix
* Changelog: move Ollama fix entry to section tail
---------
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>