* fix: escalate to model fallback after rate-limit profile rotation cap
Per-model rate limits (e.g. Anthropic Sonnet-only quotas) are not
relieved by rotating auth profiles — if all profiles share the same
model quota, cycling between them loops forever without falling back
to the next model in the configured fallbacks chain.
Apply the same rotation-cap pattern introduced for overloaded_error
(#58348) to rate_limit errors:
- Add `rateLimitedProfileRotations` to auth.cooldowns config (default: 1)
- After N profile rotations on a rate_limit error, throw FailoverError
to trigger cross-provider model fallback
- Add `resolveRateLimitProfileRotationLimit` helper following the same
pattern as `resolveOverloadProfileRotationLimit`
Fixes#58572
* fix: cap prompt-side rate-limit failover (#58707) (thanks @Forgely3D)
* fix: restore latest-main gates for #58707
---------
Co-authored-by: Ember (Forgely3D) <ember@forgely.co>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
- Fix CWE-209: use static safe message instead of raw provider error text
- Fix CWE-117: sanitize provider/model in logs via sanitizeForLog
- Hide CLI hints from external channels via shouldSurfaceToControlUi
- Move overload cap check before advanceAuthProfile to save setup latency
- Export MAX_LIVE_SWITCH_RETRIES as module-level constant
- Use exact toBe() assertions in tests
- Correct failover decision label to fallback_model
- Add MAX_LIVE_SWITCH_RETRIES=2 guard in agent-runner-execution.ts
- Add MAX_OVERLOAD_PROFILE_ROTATIONS=1 cap in run.ts for overloaded errors
- Return kind:final with user-visible error on retry exhaustion
- Escalate to cross-provider fallback instead of exhausting same-provider profiles
Fixes#58348
* feat(gateway): implement claim check pattern to prevent OOM on large attachments
* fix: sanitize mediaId, refine trimEnd, remove warn log, add threshold and absolute path
* fix: enforce maxBytes before decoding and use dynamic path from saveMediaBuffer
* fix: enforce absolute maxBytes limit before Buffer allocation and preserve file extensions
* fix: align saveMediaBuffer arguments and satisfy oxfmt linter
* chore: strictly enforce linting rules (curly braces, unused vars, and error typing)
* fix: restrict offload to mainstream mimes to avoid extension-loss bug in store.ts for BMP/TIFF
* fix: restrict offload to mainstream mimes to bypass store.ts extension-loss bug
* chore: document bmp/tiff exclusion from offload whitelist in MIME_TO_EXT
* feat: implement agent-side resolver for opaque media URIs and finalize contract
* fix: support unicode media URIs and allow consecutive dots in safe IDs based on Codex review
* fix(gateway): enforce strict fail-fast for oversized media to prevent OOM bypass
* refactor(gateway): harden media offload with performance and security optimizations
This update refines the Claim Check pattern with industrial-grade guards:
- Performance: Implemented sampled Base64 validation for large payloads (>4KB) to prevent event loop blocking.
- Security: Added null-byte (\u0000) detection and reinforced path traversal guards.
- I18n: Updated media-uri regex to a blacklist-based character class for Unicode/Chinese filename support, with oxlint bypass for intentional control regex.
- Robustness: Enhanced error diagnostics with JSON-serialized IDs.
* fix: add HEIC/HEIF to offload allowlist and pass maxBytes to saveMediaBuffer
* fix(gateway): clean up offloaded media files on attachment parse failure
Address Codex review feedback: track saved media IDs and implement best-effort cleanup via deleteMediaBuffer if subsequent attachments fail validation, preventing orphaned files on disk.
* fix(gateway): enforce full base64 validation to prevent whitespace padding bypass
Address Codex review feedback: remove early return in isValidBase64 so padded payloads cannot bypass offload thresholds and reintroduce memory pressure. Updated related comments.
* fix(gateway): preserve offloaded media metadata and fix validation error mapping
Address Codex review feedback:
- Add \offloadedRefs\ to \ParsedMessageWithImages\ to expose structured metadata for offloaded attachments, preventing transcript media loss.
- Move \erifyDecodedSize\ outside the storage try-catch block to correctly surface client base64 validation failures as 4xx errors instead of 5xx \MediaOffloadError\.
- Add JSDoc TODOs indicating that upstream callers (chat.ts, agent.ts, server-node-events.ts) must explicitly pass the \supportsImages\ flag.
* fix(agents): explicitly allow media store dir when loading offloaded images
Address Codex review feedback: Pass getMediaDir() to loadWebMedia's localRoots for media-uri refs to prevent legacy path resolution mismatches from silently dropping large attachments.
* fix(gateway): resolve attachment offload regressions and error mapping
Address Codex review feedback:
- Pass \supportsImages\ dynamically in \chat.ts\ and \gent.ts\ based on model catalog, and explicitly in \server-node-events.ts\.
- Persist \offloadedRefs\ into the transcript pipeline in \chat.ts\ to preserve media metadata for >2MB attachments.
- Correctly map \MediaOffloadError\ to 5xx (UNAVAILABLE) to differentiate server storage faults from 4xx client validation errors.
* fix(gateway): dynamically compute supportsImages for overrides and node events
Address follow-up Codex review feedback:
- Use effective model (including overrides) to compute \supportsImages\ in \gent.ts\.
- Move session load earlier in \server-node-events.ts\ to dynamically compute \supportsImages\ rather than hardcoding true.
* fix(gateway): resolve capability edge cases reported by codex
Address final Codex edge cases:
- Refactor \gent.ts\ to compute \supportsImages\ even when no session key is present, ensuring text-only override requests without sessions safely drop attachments.
- Update catalog lookups in \chat.ts\, \gent.ts\, and \server-node-events.ts\ to strictly match both \id\ and \provider\ to prevent cross-provider model collisions.
* fix(agents): restore before_install hook for skill installs
Restore the plugin scanner security hook that was accidentally dropped during merge conflict resolution.
* fix: resolve attachment pathing, defer parsing after auth gates, and clean up node-event mocks
* fix: resolve syntax errors in test-env, fix missing helper imports, and optimize parsing sequence in node events
* fix(gateway): re-enforce message length limit after attachment parsing
Adds a secondary check to ensure the 20,000-char cap remains effective even after media markers are appended during the offload flow.
* fix(gateway): prevent dropping valid small images and clean up orphaned media on size rejection
* fix(gateway): share attachment image capability checks
* fix(gateway): preserve mixed attachment order
* fix: fail closed on unknown image capability (#55513) (thanks @Syysean)
* fix: classify offloaded attachment refs explicitly (#55513) (thanks @Syysean)
---------
Co-authored-by: Ayaan Zaidi <hi@obviy.us>
* fix(agents): dispose bundled MCP runtime after local runs
* fix(agents): scope bundle MCP cleanup to local one-shots
* fix(agents): dispose bundle MCP after local runs
* docs(changelog): note local bundle MCP cleanup fix
* fix: fetch OpenRouter model capabilities at runtime for unknown models
When an OpenRouter model is not in the built-in static snapshot from
pi-ai, the fallback hardcodes input: ["text"], silently dropping images.
Query the OpenRouter API at runtime to detect actual capabilities
(image support, reasoning, context window) for models not in the
built-in list. Results are cached in memory for 1 hour. On API
failure/timeout, falls back to text-only (no regression).
* feat(openrouter): add disk cache for OpenRouter model capabilities
Persist the OpenRouter model catalog to ~/.openclaw/cache/openrouter-models.json
so it survives process restarts. Cache lookup order:
1. In-memory Map (instant)
2. On-disk JSON file (avoids network on restart)
3. OpenRouter API fetch (populates both layers)
Also triggers a background refresh when a model is not found in the cache,
in case it was newly added to OpenRouter.
* refactor(openrouter): remove pre-warm, use pure lazy-load with disk cache
- Remove eager ensureOpenRouterModelCache() from run.ts
- Remove TTL — model capabilities are stable, no periodic re-fetching
- Cache lookup: in-memory → disk → API fetch (only when needed)
- API is only called when no cache exists or a model is not found
- Disk cache persists across gateway restarts
* fix(openrouter): address review feedback
- Fix timer leak: move clearTimeout to finally block
- Fix modality check: only check input side of "->" separator to avoid
matching image-generation models (text->image)
- Use resolveStateDir() instead of hardcoded homedir()/.openclaw
- Separate cache dir and filename constants
- Add utf-8 encoding to writeFileSync for consistency
- Add data validation when reading disk cache
* ci: retrigger checks
* fix: preload unknown OpenRouter model capabilities before resolve
* fix: accept top-level OpenRouter max token metadata
* fix: update changelog for OpenRouter runtime capability lookup (#45824) (thanks @DJjjjhao)
* fix: avoid redundant OpenRouter refetches and preserve suppression guards
---------
Co-authored-by: Ayaan Zaidi <hi@obviy.us>
* feat(context-engine): add ContextEngine interface and registry
Introduce the pluggable ContextEngine abstraction that allows external
plugins to register custom context management strategies.
- ContextEngine interface with lifecycle methods: bootstrap, ingest,
ingestBatch, afterTurn, assemble, compact, prepareSubagentSpawn,
onSubagentEnded, dispose
- Module-level singleton registry with registerContextEngine() and
resolveContextEngine() (config-driven slot selection)
- LegacyContextEngine: pass-through implementation wrapping existing
compaction behavior for 100% backward compatibility
- ensureContextEnginesInitialized() guard for safe one-time registration
- 19 tests covering contract, registry, resolution, and legacy parity
* feat(plugins): add context-engine slot and registerContextEngine API
Wire the ContextEngine abstraction into the plugin system so external
plugins can register context engines via the standard plugin API.
- Add 'context-engine' to PluginKind union type
- Add 'contextEngine' slot to PluginSlotsConfig (default: 'legacy')
- Wire registerContextEngine() through OpenClawPluginApi
- Export ContextEngine types from plugin-sdk for external consumers
- Restore proper slot-based resolution in registry
* feat(context-engine): wire ContextEngine into agent run lifecycle
Integrate the ContextEngine abstraction into the core agent run path:
- Resolve context engine once per run (reused across retries)
- Bootstrap: hydrate canonical store from session file on first run
- Assemble: route context assembly through pluggable engine
- Auto-compaction guard: disable built-in auto-compaction when
the engine declares ownsCompaction (prevents double-compaction)
- AfterTurn: post-turn lifecycle hook for ingest + background
compaction decisions
- Overflow compaction: route through contextEngine.compact()
- Dispose: clean up engine resources in finally block
- Notify context engine on subagent lifecycle events
Legacy engine: all lifecycle methods are pass-through/no-op, preserving
100% backward compatibility for users without a context engine plugin.
* feat(plugins): add scoped subagent methods and gateway request scope
Expose runtime.subagent.{run, waitForRun, getSession, deleteSession}
so external plugins can spawn sub-agent sessions without raw gateway
dispatch access.
Uses AsyncLocalStorage request-scope bridge to dispatch internally via
handleGatewayRequest with a synthetic operator client. Methods are only
available during gateway request handling.
- Symbol.for-backed global singleton for cross-module-reload safety
- Fallback gateway context for non-WS dispatch paths (Telegram/WhatsApp)
- Set gateway request scope for all handlers, not just plugin handlers
- 3 staleness tests for fallback context hardening
* feat(context-engine): route /compact and sessions.get through context engine
Wire the /compact command and sessions.get handler through the pluggable
ContextEngine interface.
- Thread tokenBudget and force parameters to context engine compact
- Route /compact through contextEngine.compact() when registered
- Wire sessions.get as runtime alias for plugin subagent dispatch
- Add .pebbles/ to .gitignore
* style: format with oxfmt 0.33.0
Fix duplicate import (ControlUiRootState in server.impl.ts) and
import ordering across all changed files.
* fix: update extension test mocks for context-engine types
Add missing subagent property to bluebubbles PluginRuntime mock.
Add missing registerContextEngine to lobster OpenClawPluginApi mock.
* fix(subagents): keep deferred delete cleanup retryable
* style: format run attempt for CI
* fix(rebase): remove duplicate embedded-run imports
* test: add missing gateway context mock export
* fix: pass resolved auth profile into afterTurn compaction
Ensure the embedded runner forwards resolved auth profile context into
legacy context-engine compaction params on the normal afterTurn path,
matching overflow compaction behavior. This allows downstream LCM
summarization to use the intended provider auth/profile consistently.
Also fix strict TS typing in external-link token dedupe and align an
attempt unit test reasoningLevel value with the current ReasoningLevel
enum.
Regeneration-Prompt: |
We were debugging context-engine compaction where downstream summary
calls were missing the right auth/profile context in normal afterTurn
flow, while overflow compaction already propagated it. Preserve current
behavior and keep changes additive: thread the resolved authProfileId
through run -> attempt -> legacy compaction param builder without
broad refactors.
Add tests that prove the auth profile is included in afterTurn legacy
params and that overflow compaction still passes it through run
attempts. Keep existing APIs stable, and only adjust small type issues
needed for strict compilation.
* fix: remove duplicate imports from rebase
* feat: add context-engine system prompt additions
* fix(rebase): dedupe attempt import declarations
* test: fix fetch mock typing in ollama autodiscovery
* fix(test): add registerContextEngine to diffs extension mock APIs
* test(windows): use path.delimiter in ios-team-id fixture PATH
* test(cron): add model formatting and precedence edge case tests
Covers:
- Provider/model string splitting (whitespace, nested paths, empty segments)
- Provider normalization (casing, aliases like bedrock→amazon-bedrock)
- Anthropic model alias normalization (opus-4.5→claude-opus-4-5)
- Precedence: job payload > session override > config default
- Sequential runs with different providers (CI flake regression pattern)
- forceNew session preserving stored model overrides
- Whitespace/empty model string edge cases
- Config model as string vs object format
* test(cron): fix model formatting test config types
* test(phone-control): add registerContextEngine to mock API
* fix: re-export ChannelKind from config-reload-plan
* fix: add subagent mock to plugin-runtime-mock test util
* docs: add changelog fragment for context engine PR #22201