Reworks the Codex app-server Guardian change into the final landing shape:
- keep YOLO as the default local app-server mode
- add explicit `appServer.mode: "guardian"`
- remove the legacy `OPENCLAW_CODEX_APP_SERVER_GUARDIAN` shortcut
- document Guardian configuration and behavior
- add Guardian event projection and Docker live probes for approved/ask-back decisions
Co-authored-by: pashpashpash <nik@vault77.ai>
* feat(amazon-bedrock-mantle): add Claude Opus 4.7 via Anthropic auth
* fix(amazon-bedrock-mantle): keep Opus 4.7 transport-safe
* fix(amazon-bedrock-mantle): restore anthropic base url helper
* fix(auto-reply): apply runtime auth to conversation labels
---------
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
* fix(amazon-bedrock): add known model context windows to discovery
Bedrock's ListFoundationModels API does not expose token limits. Discovery
was hardcoding contextWindow: 32000 for every model, causing Claude (1M),
Nova (300K), and other models to hit premature 'Context limit exceeded'
errors and unnecessary session resets.
Adds a lookup table of known context windows for Bedrock models:
- Anthropic Claude: 200K-1M
- Amazon Nova: 128K-1M
- Meta Llama: 128K
- Mistral: 32K-128K
- DeepSeek: 128K
- Cohere: 128K
- AI21 Jamba: 256K
Inference profile prefixes (us., eu., ap., global.) are stripped before
lookup, so us.anthropic.claude-opus-4-6-v1 correctly resolves to 1M.
Also raises the default fallback from 32K to 128K for unknown models,
since most modern models have at least 128K context.
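A minimal sketch of the lookup described above, assuming a Map-based table. The names DEFAULT_CONTEXT_WINDOW, stripProfilePrefix, and resolveContextWindow are hypothetical, and the single table entry is taken from the example in this message; the real table in the plugin is larger and may be shaped differently.

```typescript
// Fallback for models not in the table (128K, per this commit message).
const DEFAULT_CONTEXT_WINDOW = 128_000;

// Illustrative subset only; the entry mirrors the example given above.
const KNOWN_CONTEXT_WINDOWS = new Map<string, number>([
  ["anthropic.claude-opus-4-6-v1", 1_000_000],
]);

// Strip an inference profile prefix (us., eu., ap., global.) before lookup.
function stripProfilePrefix(modelId: string): string {
  return modelId.replace(/^(?:us|eu|ap|global)\./, "");
}

// Resolve a context window, falling back to the default for unknown models.
function resolveContextWindow(modelId: string): number {
  return KNOWN_CONTEXT_WINDOWS.get(stripProfilePrefix(modelId)) ?? DEFAULT_CONTEXT_WINDOW;
}
```

With this sketch, both the bare model ID and its us. profile form resolve to the same table entry, and any ID missing from the table gets the 128K fallback.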
Single file change, no type system modifications.
Complementary to #65030 (provenance flag for warning on unknown models).
Fixes #64919
Related: #64250
* add KNOWN_MAX_TOKENS map and expand model coverage
- Add KNOWN_MAX_TOKENS lookup table with Bedrock-optimized values that
balance response quality against quota burndown (5x rate for Claude 3.7+)
- Add missing models to KNOWN_CONTEXT_WINDOWS: Opus 4.7 (1M), Opus 4.1/4.5,
Sonnet 4, Claude 3/3.5 Haiku, DeepSeek V3/V3.2, Google Gemma 3
- Refactor prefix-stripping into shared resolveKnownValue() helper
- Fix: use !== undefined instead of truthy check for table lookups
- Wire resolveKnownMaxTokens into toModelDefinition and resolveInferenceProfiles
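To illustrate the `!== undefined` fix in the list above: a truthy check treats any falsy table value as missing. The sketch below uses a hypothetical entry with value 0 to make the difference visible; this message does not say which value motivated the fix, and all names here are illustrative.

```typescript
// Hypothetical table with a falsy (0) value to expose the difference.
const exampleTable = new Map<string, number>([["example.zero-limit", 0]]);
const EXAMPLE_FALLBACK = 128_000;

// Truthy check: a stored 0 is indistinguishable from a missing key.
function lookupTruthy(id: string): number {
  const value = exampleTable.get(id);
  return value ? value : EXAMPLE_FALLBACK;
}

// Explicit undefined check: only a genuinely missing key falls back.
function lookupDefined(id: string): number {
  const value = exampleTable.get(id);
  return value !== undefined ? value : EXAMPLE_FALLBACK;
}
```

Here lookupTruthy silently replaces the stored value with the fallback, while lookupDefined returns what the table actually contains.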
Quota burndown context: Bedrock reserves input_tokens + max_tokens from
TPM at request start. For Claude 3.7+, output burns at 5x. The values
in KNOWN_MAX_TOKENS are intentionally conservative (8-16K for Claude)
to maximize concurrent throughput while still allowing useful responses.
Thinking budget is added separately by the runtime.
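The reservation arithmetic above can be sketched as follows. This is a rough model built only from the description in this message (input_tokens + max_tokens reserved at request start, 5x output burn for Claude 3.7+); the function names and the TPM budget used in the usage note are hypothetical, not Bedrock's actual accounting.

```typescript
// Tokens reserved from TPM when a request starts, per the description above.
function reservedAtStart(inputTokens: number, maxTokens: number): number {
  return inputTokens + maxTokens;
}

// How many identical requests fit in a per-minute budget if all start together.
function concurrentRequests(tpmBudget: number, inputTokens: number, maxTokens: number): number {
  return Math.floor(tpmBudget / reservedAtStart(inputTokens, maxTokens));
}

// Output tokens weighted by the burn multiplier (assumed 5x, per the text above).
function outputBurn(outputTokens: number, multiplier: number = 5): number {
  return outputTokens * multiplier;
}
```

Assuming a hypothetical 200,000 TPM budget and 10,000 input tokens per request, maxTokens of 64,000 allows 2 requests to start concurrently, while maxTokens of 8,000 allows 11. That gap is the reason for the conservative values.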
* remove KNOWN_MAX_TOKENS — maxTokens should be handled upstream
Remove the KNOWN_MAX_TOKENS map. Hardcoding maxTokens values in
discovery is the wrong layer to solve this — any explicit value
still gets reserved against Bedrock's TPM quota at request start.
The correct fix is upstream in pi's Bedrock provider: omit maxTokens
from inferenceConfig when not explicitly set, letting the model use
its internal default. This avoids quota waste entirely.
See: badlogic/pi-mono#3399 and badlogic/pi-mono#3400
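A minimal sketch of the upstream behavior described above, assuming a request builder that only sets maxTokens when the caller provided one. The InferenceConfigSketch type and buildInferenceConfig function are hypothetical and do not reflect pi's actual provider code.

```typescript
// Hypothetical subset of an inference config; only fields used in this sketch.
interface InferenceConfigSketch {
  maxTokens?: number;
  temperature?: number;
}

// Build the config, leaving maxTokens out entirely unless it was explicitly set,
// so the model's internal default applies and no extra quota is reserved.
function buildInferenceConfig(options: { maxTokens?: number; temperature?: number }): InferenceConfigSketch {
  const config: InferenceConfigSketch = {};
  if (options.temperature !== undefined) {
    config.temperature = options.temperature;
  }
  if (options.maxTokens !== undefined) {
    config.maxTokens = options.maxTokens;
  }
  return config;
}
```

The key point is that the maxTokens key is absent from the object rather than set to a default, so nothing downstream can serialize a value the caller never asked for.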
Keep the expanded KNOWN_CONTEXT_WINDOWS (context windows ARE the
right thing to set in discovery — they affect compaction thresholds
and session management, not API-level quota reservation).
* docs: clarify why hardcoded context windows are needed
Bedrock's ListFoundationModels and GetFoundationModel APIs return no
token limit information — there is no Bedrock API to discover context
windows or max output tokens programmatically. Note that this table
should become a fallback if AWS adds token metadata in the future.
* fix: add au and apac to inference profile prefix regex
Add missing geo prefixes discovered by querying inference profiles
across multiple regions:
- au. (Australia/NZ, used in ap-southeast-2/4/6)
- apac. (Asia-Pacific, used for older models in ap-northeast-1)
Both resolveKnownContextWindow and resolveBaseModelId now handle
all known prefixes: us, eu, ap, apac, au, jp, global.
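A sketch of a prefix regex covering the full list above. The constant and function names are hypothetical, and the model IDs in the usage note are illustrative; the actual pattern in the plugin may be written differently.

```typescript
// Matches any known geo prefix followed by a dot at the start of the ID.
// "ap" and "apac" can both be listed: the trailing \. forces the right match.
const INFERENCE_PROFILE_PREFIX = /^(?:us|eu|ap|apac|au|jp|global)\./;

// Return the base model ID with the inference profile prefix removed, if any.
function stripInferenceProfilePrefix(modelId: string): string {
  return modelId.replace(INFERENCE_PROFILE_PREFIX, "");
}
```

For example, "au.example.model-v1" and "apac.example.model-v1" both reduce to "example.model-v1", while an ID with no known prefix is returned unchanged.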
* test: port au. prefix test from #65449 by @alickgithub2, add apac. coverage
Port the Australia/NZ inference profile test from PR #65449
(credit: @alickgithub2) and extend it to also cover the apac.
prefix discovered in ap-northeast-1.
* expand model coverage: Llama 4, MiniMax, NVIDIA, Mistral 3, GLM, Qwen
Cross-referenced KNOWN_CONTEXT_WINDOWS against live
list-foundation-models API. Added missing models:
- Llama 4 Maverick (1M) and Scout (512K)
- MiniMax M2/M2.1/M2.5 (1M)
- NVIDIA Nemotron Super/Nano variants (128K)
- Mistral Large 3 675B (128K)
- GLM 4.7/4.7-flash/5 (128K)
- Qwen3 Coder/32B/VL (128-256K)
Removed deprecated deepseek.v3-v1:0 and claude-opus-4-20250514
(not in active foundation models list).
* raise default context window from 128K to 200K
200K matches the floor for all current Claude models (the most
popular on Bedrock). Every other active model with a lower actual
limit is already in the explicit table. This ensures new Claude
models get a correct default without requiring a table update.
* test: update discovery test expectations for known context window values
* test: fix remaining contextWindow expectation (default 200K)
* fix(amazon-bedrock): keep conservative context fallback
* docs(changelog): note Bedrock context window fix
* fix(amazon-bedrock): normalize known context fallback
---------
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
* fix(amazon-bedrock-mantle): refresh IAM bearer token via resolveConfigApiKey cache lookup
The Mantle plugin generates a bearer token from IAM credentials at discovery
time and bakes it as a static string into the provider config. After the
token's cache TTL expires (~1hr), requests fail because resolveConfigApiKey
only handled the explicit AWS_BEARER_TOKEN_BEDROCK env var case.
Fix: expose getCachedIamToken() as a sync read from the existing iamTokenCache,
and wire it into resolveConfigApiKey as a fallback when no explicit env var is
set. The catalog.run still generates/refreshes the token on discovery; this
change ensures the cached token is served at auth resolution time.
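A minimal sketch of the fallback order described above. The names getCachedIamToken, iamTokenCache, resolveConfigApiKey, and AWS_BEARER_TOKEN_BEDROCK come from this message; the signatures, the per-region cache key, and the expiry check are assumptions made for illustration.

```typescript
// Assumed cache entry shape: the token plus an absolute expiry in epoch ms.
interface CachedIamToken {
  token: string;
  expiresAt: number;
}

// Assumed to be keyed by region; populated by discovery (catalog.run).
const iamTokenCache = new Map<string, CachedIamToken>();

// Synchronous read: return the cached token if present and not expired.
function getCachedIamToken(region: string, now: number = Date.now()): string | undefined {
  const entry = iamTokenCache.get(region);
  if (entry === undefined || entry.expiresAt <= now) {
    return undefined;
  }
  return entry.token;
}

// The explicit env var wins; otherwise fall back to the cached IAM token.
function resolveConfigApiKey(
  env: Record<string, string | undefined>,
  region: string,
  now: number = Date.now(),
): string | undefined {
  const explicit = env["AWS_BEARER_TOKEN_BEDROCK"];
  if (explicit) {
    return explicit;
  }
  return getCachedIamToken(region, now);
}
```

Because the read is synchronous, auth resolution never triggers a refresh itself; it only serves whatever discovery last stored.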
Fixes #68900
* fix(amazon-bedrock-mantle): refresh runtime IAM bearer auth
* docs(changelog): note Mantle IAM refresh
* fix(agents): apply runtime auth in simple completion
---------
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>