* feat(amazon-bedrock-mantle): add Claude Opus 4.7 via Anthropic auth
* fix(amazon-bedrock-mantle): keep Opus 4.7 transport-safe
* fix(amazon-bedrock-mantle): restore anthropic base url helper
* fix(auto-reply): apply runtime auth to conversation labels
---------
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
* fix(amazon-bedrock): add known model context windows to discovery
Bedrock's ListFoundationModels API does not expose token limits. Discovery
was hardcoding contextWindow: 32000 for every model, causing Claude (1M),
Nova (300K), and other models to hit premature 'Context limit exceeded'
errors and unnecessary session resets.
Adds a lookup table of known context windows for Bedrock models:
- Anthropic Claude: 200K-1M
- Amazon Nova: 128K-1M
- Meta Llama: 128K
- Mistral: 32K-128K
- DeepSeek: 128K
- Cohere: 128K
- AI21 Jamba: 256K
Inference profile prefixes (us., eu., ap., global.) are stripped before
lookup, so us.anthropic.claude-opus-4-6-v1 correctly resolves to 1M.
Also raises the default fallback from 32K to 128K for unknown models —
most modern models have at least 128K context.
Single file change, no type system modifications.
Complementary to #65030 (provenance flag for warning on unknown models).
Fixes #64919
Related: #64250
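A minimal sketch of the lookup described above, with prefix stripping before the table hit. The table excerpt, default value, and function name are illustrative, not the actual source:

```typescript
// Illustrative excerpt: real table covers Claude, Nova, Llama, Mistral,
// DeepSeek, Cohere, and Jamba families.
const KNOWN_CONTEXT_WINDOWS: Record<string, number> = {
  "anthropic.claude-opus-4-6-v1": 1_000_000,
  "amazon.nova-micro-v1": 128_000,
};

const DEFAULT_CONTEXT_WINDOW = 128_000;

function resolveKnownContextWindow(modelId: string): number {
  // Strip inference-profile geo prefixes (us., eu., ap., global.) so
  // profile ids resolve to the same entry as the base model id.
  const baseId = modelId.replace(/^(us|eu|ap|global)\./, "");
  const known = KNOWN_CONTEXT_WINDOWS[baseId];
  // Explicit undefined check rather than truthiness, so a 0 entry
  // could never silently fall through to the default.
  return known !== undefined ? known : DEFAULT_CONTEXT_WINDOW;
}
```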
* add KNOWN_MAX_TOKENS map and expand model coverage
- Add KNOWN_MAX_TOKENS lookup table with Bedrock-optimized values that
balance response quality against quota burndown (5x rate for Claude 3.7+)
- Add missing models to KNOWN_CONTEXT_WINDOWS: Opus 4.7 (1M), Opus 4.1/4.5,
Sonnet 4, Claude 3/3.5 Haiku, DeepSeek V3/V3.2, Google Gemma 3
- Refactor prefix-stripping into shared resolveKnownValue() helper
- Fix: use !== undefined instead of truthy check for table lookups
- Wire resolveKnownMaxTokens into toModelDefinition and resolveInferenceProfiles
Quota burndown context: Bedrock reserves input_tokens + max_tokens from
TPM at request start. For Claude 3.7+, output burns at 5x. The values
in KNOWN_MAX_TOKENS are intentionally conservative (8-16K for Claude)
to maximize concurrent throughput while still allowing useful responses.
Thinking budget is added separately by the runtime.
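The quota arithmetic behind "conservative maxTokens" can be sketched as follows. The quota and token numbers are hypothetical, and the extra 5x output burn for Claude 3.7+ is noted in the message above but not modeled here:

```typescript
// Bedrock reserves inputTokens + maxTokens against the TPM quota at
// request start, before any output exists.
function reservedAtStart(inputTokens: number, maxTokens: number): number {
  return inputTokens + maxTokens;
}

// How many requests of this shape fit under a TPM quota concurrently.
function concurrentRequestFit(tpmQuota: number, inputTokens: number, maxTokens: number): number {
  return Math.floor(tpmQuota / reservedAtStart(inputTokens, maxTokens));
}

// With a hypothetical 200K TPM quota and 10K-token prompts, an 8K cap
// admits several times more concurrent requests than a 64K cap:
const conservative = concurrentRequestFit(200_000, 10_000, 8_000);
const generous = concurrentRequestFit(200_000, 10_000, 64_000);
```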
* remove KNOWN_MAX_TOKENS — maxTokens should be handled upstream
Remove the KNOWN_MAX_TOKENS map. Hardcoding maxTokens values in
discovery is the wrong layer to solve this — any explicit value
still gets reserved against Bedrock's TPM quota at request start.
The correct fix is upstream in pi's Bedrock provider: omit maxTokens
from inferenceConfig when not explicitly set, letting the model use
its internal default. This avoids quota waste entirely.
See: badlogic/pi-mono#3399 and badlogic/pi-mono#3400
Keep the expanded KNOWN_CONTEXT_WINDOWS (context windows ARE the
right thing to set in discovery — they affect compaction thresholds
and session management, not API-level quota reservation).
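A hedged sketch of the upstream shape this refers to: include `maxTokens` in the inference config only when the caller set it explicitly, so Bedrock never reserves an arbitrary cap against the TPM quota. The interface and helper are illustrative, not pi's actual provider code:

```typescript
interface RequestOptions {
  maxTokens?: number;
  temperature?: number;
}

function buildInferenceConfig(opts: RequestOptions): Record<string, number> {
  return {
    ...(opts.temperature !== undefined ? { temperature: opts.temperature } : {}),
    // Omit maxTokens entirely when unset: the model falls back to its
    // internal default and no output quota is reserved up front.
    ...(opts.maxTokens !== undefined ? { maxTokens: opts.maxTokens } : {}),
  };
}
```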
* docs: clarify why hardcoded context windows are needed
Bedrock's ListFoundationModels and GetFoundationModel APIs return no
token limit information — there is no Bedrock API to discover context
windows or max output tokens programmatically. If AWS adds token
metadata in the future, this table should become a fallback.
* fix: add au and apac to inference profile prefix regex
Add missing geo prefixes discovered by querying inference profiles
across multiple regions:
- au. (Australia/NZ, used in ap-southeast-2/4/6)
- apac. (Asia-Pacific, used for older models in ap-northeast-1)
Both resolveKnownContextWindow and resolveBaseModelId now handle
all known prefixes: us, eu, ap, apac, au, jp, global.
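The shared stripping logic with the full prefix set can be sketched as (helper name per the earlier refactor; the actual regex in source may differ):

```typescript
// All known inference-profile geo prefixes after this change. Longer
// alternatives are listed before their shorter stems for readability.
const GEO_PREFIX = /^(apac|ap|au|us|eu|jp|global)\./;

function resolveBaseModelId(modelId: string): string {
  return modelId.replace(GEO_PREFIX, "");
}
```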
* test: port au. prefix test from #65449 by @alickgithub2, add apac. coverage
Port the Australia/NZ inference profile test from PR #65449
(credit: @alickgithub2) and extend it to also cover the apac.
prefix discovered in ap-northeast-1.
* expand model coverage: Llama 4, MiniMax, NVIDIA, Mistral 3, GLM, Qwen
Cross-referenced KNOWN_CONTEXT_WINDOWS against live
list-foundation-models API. Added missing models:
- Llama 4 Maverick (1M) and Scout (512K)
- MiniMax M2/M2.1/M2.5 (1M)
- NVIDIA Nemotron Super/Nano variants (128K)
- Mistral Large 3 675B (128K)
- GLM 4.7/4.7-flash/5 (128K)
- Qwen3 Coder/32B/VL (128-256K)
Removed deprecated deepseek.v3-v1:0 and claude-opus-4-20250514
(not in active foundation models list).
* raise default context window from 128K to 200K
200K matches the floor for all current Claude models (the most
popular on Bedrock). Every other active model with a lower actual
limit is already in the explicit table. This ensures new Claude
models get a correct default without requiring a table update.
* test: update discovery test expectations for known context window values
* test: fix remaining contextWindow expectation (default 200K)
* fix(amazon-bedrock): keep conservative context fallback
* docs(changelog): note Bedrock context window fix
* fix(amazon-bedrock): normalize known context fallback
---------
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
* fix(amazon-bedrock-mantle): refresh IAM bearer token via resolveConfigApiKey cache lookup
The Mantle plugin generates a bearer token from IAM credentials at discovery
time and bakes it as a static string into the provider config. After the
token's cache TTL expires (~1hr), requests fail because resolveConfigApiKey
only handled the explicit AWS_BEARER_TOKEN_BEDROCK env var case.
Fix: expose getCachedIamToken() as a sync read from the existing iamTokenCache,
and wire it into resolveConfigApiKey as a fallback when no explicit env var is
set. The catalog.run still generates/refreshes the token on discovery; this
change ensures the cached token is served at auth resolution time.
Fixes #68900
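The fallback order described above can be sketched as follows. The cache shape and names are illustrative, not the actual plugin source:

```typescript
interface CachedIamToken {
  token: string;
  expiresAt: number; // epoch ms
}

// Discovery-time cache; catalog.run generates/refreshes it elsewhere.
let iamTokenCache: CachedIamToken | undefined;

// Sync read of the cache at auth resolution time.
function getCachedIamToken(now: number = Date.now()): string | undefined {
  return iamTokenCache !== undefined && iamTokenCache.expiresAt > now
    ? iamTokenCache.token
    : undefined;
}

function resolveConfigApiKey(env: Record<string, string | undefined>): string | undefined {
  // Explicit env var still takes precedence; the cached IAM token is
  // only a fallback, so AWS_BEARER_TOKEN_BEDROCK behavior is unchanged.
  return env["AWS_BEARER_TOKEN_BEDROCK"] ?? getCachedIamToken();
}
```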
* fix(amazon-bedrock-mantle): refresh runtime IAM bearer auth
* docs(changelog): note Mantle IAM refresh
* fix(agents): apply runtime auth in simple completion
---------
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
* fix(openshell): pin sandbox file reads against parent symlink swaps
* docs(changelog): note openshell sandbox read pinning (#69798)
* fix(openshell): containment-check against literal root and self-contained file-identity helper
* test(openshell): spy on fsPromises.open for swap races, skip dev=0 test on win32
* fix(openshell): single-syscall fallback identity check + tighten sameFileIdentity types
* fix(openshell): re-fstat pinned handle after identity check for defense-in-depth
* fix(openshell): lstat leaf on platforms without O_NOFOLLOW to close windows symlink gap
* fix(openshell): expose test seam for O_NOFOLLOW availability instead of patching native constants
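A hedged sketch of the pinned-read pattern these commits describe, using Node's `fs` APIs; the real openshell helper names and the O_NOFOLLOW test seam differ. The leaf is opened without following symlinks where the platform supports it, then fstat'd through the open handle so later identity checks cannot be raced by a parent-directory swap:

```typescript
import { constants, promises as fsPromises } from "node:fs";

interface FileIdentity {
  dev: number;
  ino: number;
}

// dev+ino pair uniquely identifies a file on POSIX filesystems.
function sameFileIdentity(a: FileIdentity, b: FileIdentity): boolean {
  return a.dev === b.dev && a.ino === b.ino;
}

async function openPinned(path: string) {
  // O_NOFOLLOW is absent on platforms like win32; the commits above use
  // an lstat-on-leaf fallback there, which this sketch omits.
  const noFollow = constants.O_NOFOLLOW as number | undefined;
  const flags = noFollow !== undefined ? constants.O_RDONLY | noFollow : constants.O_RDONLY;
  const handle = await fsPromises.open(path, flags);
  // fstat the pinned handle itself, not the path, so the identity we
  // record belongs to the file we actually opened.
  const stat = await handle.stat();
  return { handle, identity: { dev: stat.dev, ino: stat.ino } as FileIdentity };
}
```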
## Summary
- add browser-local operator identity in Control UI and route user name/avatar rendering through the shared chat/avatar path used by assistant and agent surfaces
- tighten Quick Settings, fallback chip, and mobile chat layout behavior so the personalized UI uses space better and avoids clipped controls
- guard oversized local avatar uploads before FileReader allocation, restore the fallback-chip keyboard focus ring, and add the changelog note for the user-visible Control UI work
## Testing
- pnpm test ui/src/ui/views/config-quick.test.ts ui/src/styles/components.test.ts
- pnpm check:changed
* fix: propagate AWS SDK auth sentinel for IMDS/instance role Bedrock auth
When Bedrock auth resolves via AWS SDK default credential chain (IMDS,
ECS task role) with no explicit API key, the auth controller returned
early without calling setRuntimeApiKey(). This left pi's authStorage
unaware that the provider is authenticated, causing 'No API key found
for amazon-bedrock' errors.
Now, when mode is 'aws-sdk' and no explicit API key is available:
1. Try prepareProviderRuntimeAuth to resolve runtime credentials
2. If that returns a real apiKey, use it with auth refresh scheduling
3. Otherwise inject a '__aws_sdk_auth__' sentinel so pi's
hasConfiguredAuth() passes and the AWS SDK handles request signing
This is a focused fix in auth-controller.ts only, avoiding the risky
model-auth-runtime-shared.ts changes that could re-introduce the
fake-apiKey injection pattern on ECS (see prior regressions #49891,
#50699, #54274).
Fixes #62995
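The three-step fallback above can be sketched as follows. The sentinel string matches the message; the function shape is illustrative and simplified to synchronous form (the real controller resolves runtime auth asynchronously):

```typescript
const AWS_SDK_AUTH_SENTINEL = "__aws_sdk_auth__";

interface RuntimeAuth {
  apiKey?: string;
}

function resolveBedrockAuthKey(
  mode: string,
  explicitApiKey: string | undefined,
  prepareRuntimeAuth: () => RuntimeAuth | undefined,
): string {
  if (explicitApiKey !== undefined) return explicitApiKey;
  if (mode === "aws-sdk") {
    // Steps 1-2: prefer a real runtime credential when one resolves.
    const runtime = prepareRuntimeAuth();
    if (runtime?.apiKey) return runtime.apiKey;
    // Step 3: sentinel so hasConfiguredAuth() passes and the AWS SDK
    // signs the request itself (IMDS / ECS task role).
    return AWS_SDK_AUTH_SENTINEL;
  }
  throw new Error("No API key found for amazon-bedrock");
}
```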
* fix(pi-auth): clean up aws-sdk sentinel fallback
* docs(changelog): note aws-sdk Bedrock auth fix
---------
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>