398 Commits

Author SHA1 Message Date
Peter Steinberger
39cc6b7dc7 fix: stabilize character eval and Qwen model routing 2026-04-09 01:04:09 +01:00
Eric Curtin
0de5db8772 docs(inferrs): fix Gemma model id from gg-hf-gg to google (#62586) 2026-04-08 10:15:07 -04:00
Vincent Koc
3e7e6f2f60 docs: cover 2026.4.7 changelog gaps 2026-04-08 07:26:56 +01:00
Serg
b2456e8037 fix(zai): default to GLM-5.1 instead of GLM-5 2026-04-08 04:38:39 +01:00
Bruce MacDonald
86f35a9bc0 chore(ollama): update suggested onboarding models (#62626)
Merged via squash.

Prepared head SHA: 48c083b88a
Co-authored-by: BruceMacD <5853428+BruceMacD@users.noreply.github.com>
Reviewed-by: @BruceMacD
2026-04-07 11:42:29 -07:00
Peter Steinberger
9d4b0d551d fix: support inferrs string-only completions 2026-04-07 15:55:20 +01:00
nv-kasikritc
d43cc470c6 refactor(nvidia-endpoints): updated language & default models (#59866)
* fix(nvidia-endpoints): updated language & default models

* fix(nvidia-endpoints): updated link for api key

* fix(nvidia-endpoints): removed unused const

* fix(nvidia-endpoints): edited max tokens

* fix(nvidia-endpoints): fixed typo

---------

Co-authored-by: Devin Robison <drobison00@users.noreply.github.com>
2026-04-07 08:47:29 -06:00
Peter Steinberger
c2f9de3935 feat: unify live cli backend probes 2026-04-07 10:35:24 +01:00
Neerav Makwana
b9179ee4b6 Docs: match Greptile wording for magistral-* line
Made-with: Cursor
2026-04-07 12:52:47 +05:30
Neerav Makwana
68bfc6fcf5 Mistral: enable reasoning_effort for mistral-small-latest
Made-with: Cursor
2026-04-07 12:52:47 +05:30
Peter Steinberger
bc18e69fbf fix: separate arcee auth envs from openrouter 2026-04-06 19:53:27 +01:00
arthurbr11
95106be59b feat: enhance Arcee AI provider with OpenRouter support and update onboarding instructions 2026-04-06 19:53:27 +01:00
arthurbr11
5ac2f58c57 feat: add Arcee AI provider plugin
Add a bundled Arcee AI provider plugin with ARCEEAI_API_KEY onboarding,
Trinity model catalog (mini, large-preview, large-thinking), and
OpenAI-compatible API support.

- Trinity Large Thinking: 256K context, reasoning enabled
- Trinity Large Preview: 128K context, general-purpose
- Trinity Mini 26B: 128K context, fast and cost-efficient
2026-04-06 19:53:27 +01:00
Peter Steinberger
f9c721d5bf fix: add vydra kling live lane 2026-04-06 19:47:43 +01:00
Vincent Koc
e7fe087677 fix(openai): normalize prompt overlay personality config 2026-04-06 17:24:51 +01:00
Peter Steinberger
0c5e6037b0 fix(openai): clarify auth routes in picker and docs 2026-04-06 16:14:51 +01:00
Peter Steinberger
ac38f332c5 fix(anthropic): prefer claude cli over setup-token 2026-04-06 15:31:07 +01:00
Peter Steinberger
d378a504ac fix: restore claude cli guidance and doctor behavior 2026-04-06 14:21:11 +01:00
Peter Steinberger
c39f061003 Revert "refactor(cli): remove bundled cli text providers"
This reverts commit 05d351c430.
2026-04-06 13:40:41 +01:00
Peter Steinberger
9b2b22f350 feat: add vydra media provider 2026-04-06 02:21:51 +01:00
Peter Steinberger
a9f491310c fix: route comfy music through shared tool 2026-04-06 02:03:13 +01:00
Peter Steinberger
f6dbcf4cda docs: document music generation async flow 2026-04-06 01:49:58 +01:00
Peter Steinberger
aeb9ad52fa feat: add comfy workflow media support 2026-04-06 01:45:01 +01:00
wirjo
0793136c63 feat(bedrock-mantle): add IAM credential auth via @aws/bedrock-token-… (#61563)
* feat(bedrock-mantle): add IAM credential auth via @aws/bedrock-token-generator

Mantle previously required a manually-created API key (AWS_BEARER_TOKEN_BEDROCK).
This adds automatic bearer token generation from IAM credentials using the
official @aws/bedrock-token-generator package.

Auth priority:
1. Explicit AWS_BEARER_TOKEN_BEDROCK env var (manual API key from Console)
2. IAM credentials via getTokenProvider() → Bearer token (instance roles,
   SSO profiles, access keys, EKS IRSA, ECS task roles)

Token is cached in memory (1hr TTL, generated with 2hr validity) and in
process.env.AWS_BEARER_TOKEN_BEDROCK for downstream sync reads.

Falls back gracefully when the package is not installed or credentials are
unavailable; in that case the Mantle provider is simply not registered.

Closes #45152

* fix(bedrock-mantle): harden IAM auth

---------

Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
2026-04-06 01:41:24 +01:00
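
The auth priority and token caching described in commit 0793136c63 can be sketched roughly as follows. This is a minimal illustration rather than the plugin's actual code: `pickAuthSource`, `TokenCache`, and `CACHE_TTL_MS` are hypothetical names, and the IAM lookup itself (done through `@aws/bedrock-token-generator` in the commit) is reduced to a boolean so the sketch stays self-contained.

```typescript
// Illustrative sketch only: every name here is hypothetical, not the plugin's API.
type Env = Record<string, string | undefined>;
type AuthSource = "explicit-api-key" | "iam-credentials" | "none";

// 1 hour in-memory TTL, per the commit message (the token is generated with 2 hours of validity).
const CACHE_TTL_MS = 60 * 60 * 1000;

// Choose the auth route in the priority order given in the commit message.
function pickAuthSource(env: Env, iamAvailable: boolean): AuthSource {
  // 1. An explicit AWS_BEARER_TOKEN_BEDROCK value always wins.
  if (env.AWS_BEARER_TOKEN_BEDROCK) return "explicit-api-key";
  // 2. Otherwise fall back to a bearer token derived from IAM credentials.
  if (iamAvailable) return "iam-credentials";
  // 3. Neither is available: the provider would simply not be registered.
  return "none";
}

// Minimal in-memory cache: a stored token is reused until its TTL elapses.
class TokenCache {
  private token: string | undefined;
  private expiresAt = 0;

  get(nowMs: number): string | undefined {
    return this.token !== undefined && nowMs < this.expiresAt ? this.token : undefined;
  }

  set(token: string, nowMs: number): void {
    this.token = token;
    this.expiresAt = nowMs + CACHE_TTL_MS;
  }
}
```

Keeping the cache TTL at half the token's validity means a cached token is always discarded well before it could expire mid-request.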
Peter Steinberger
9b00008561 docs(openai): clarify gpt-5.4 fast mode 2026-04-06 01:20:52 +01:00
wirjo
699b2320a8 feat(memory): add Bedrock embedding provider for memory search (#61547)
* feat(memory): add Bedrock embedding provider for memory search

Add Amazon Bedrock as a native embedding provider for memory search.
Supports Titan Embed Text v1/v2 and Cohere Embed models via AWS SDK.

- New embeddings-bedrock.ts: BedrockRuntimeClient + InvokeModel
- Auth via AWS default credential chain (same as Bedrock inference)
- Auto-selected in 'auto' mode when AWS credentials are detected
- Titan V2: configurable dimensions (256/512/1024), normalization
- Cohere: native batch support with search_query/search_document types
- 16 new tests covering all model types, auth detection, edge cases

Closes #26289

* fix(memory): harden bedrock embedding selection

---------

Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
2026-04-06 01:19:56 +01:00
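
To make the model differences listed in commit 699b2320a8 concrete, here is a rough sketch of how per-model `InvokeModel` request bodies might be built. It is an assumption-laden illustration, not the contents of `embeddings-bedrock.ts`: `buildEmbeddingBodies` is a hypothetical helper, the model-id prefixes are assumed, and the field names follow the publicly documented Titan and Cohere request shapes on Bedrock.

```typescript
// Illustrative sketch only: buildEmbeddingBodies is a hypothetical helper.
type CohereInputType = "search_query" | "search_document";
type TitanV2Dimensions = 256 | 512 | 1024;

// Returns one JSON request body per InvokeModel call that would be needed.
function buildEmbeddingBodies(
  modelId: string,
  texts: string[],
  inputType: CohereInputType,
  dimensions: TitanV2Dimensions = 1024,
): string[] {
  if (modelId.startsWith("cohere.embed")) {
    // Cohere accepts a batch of texts in a single request.
    return [JSON.stringify({ texts, input_type: inputType })];
  }
  if (modelId.startsWith("amazon.titan-embed-text-v2")) {
    // Titan V2 takes one text per request, with configurable dimensions and normalization.
    return texts.map((inputText) => JSON.stringify({ inputText, dimensions, normalize: true }));
  }
  if (modelId.startsWith("amazon.titan-embed-text-v1")) {
    // Titan V1 takes one text per request and has no dimension setting.
    return texts.map((inputText) => JSON.stringify({ inputText }));
  }
  throw new Error(`Unsupported embedding model: ${modelId}`);
}
```

For example, two documents sent to a Cohere model produce a single batched body, while the same two documents sent to Titan V2 produce two bodies, one per call.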
Vincent Koc
f4cd1a3782 docs: rewrite video generation docs for readability 2026-04-06 01:19:44 +01:00
Peter Steinberger
4bb965e007 docs(providers): surface new video provider pages 2026-04-06 01:02:59 +01:00
Peter Steinberger
ad6c584ce7 fix: ignore unsupported video generation overrides 2026-04-06 01:02:10 +01:00
Peter Steinberger
379bc1c032 docs(video): document runway support 2026-04-06 00:50:32 +01:00
Peter Steinberger
6d34a1c814 fix(video): queue fal provider jobs 2026-04-06 00:12:47 +01:00
Peter Steinberger
c6d3ee70e2 docs(providers): unify qwen docs 2026-04-05 23:23:58 +01:00
Peter Steinberger
a234157337 docs(providers): link generation guides 2026-04-05 23:21:14 +01:00
Peter Steinberger
f30c087fdf docs(providers): add generation setup pages 2026-04-05 23:21:14 +01:00
Peter Steinberger
05d351c430 refactor(cli): remove bundled cli text providers 2026-04-05 18:46:36 +01:00
Peter Steinberger
5790435975 feat(agents): add video_generate tool 2026-04-05 18:44:06 +01:00
Peter Steinberger
fe93f29486 docs(anthropic): clarify api key and doctor recovery 2026-04-05 18:05:12 +01:00
Peter Steinberger
7075da59bd feat: allow occasional emoji in friendly openai overlay 2026-04-05 16:56:25 +01:00
Peter Steinberger
d25609bc06 fix: default OpenAI personality overlay to friendly 2026-04-05 16:15:08 +01:00
Peter Steinberger
dfd39a81d8 feat(openai): add opt-in GPT personality 2026-04-05 15:25:06 +01:00
Vincent Koc
fc9648b620 docs: add Bedrock inference profiles and Bedrock Mantle provider coverage, re-sort changelog 2026-04-05 13:04:47 +01:00
Vincent Koc
852e8f7a2a docs: update Claude CLI backend docs for MCP bridge, streaming, and auth changes 2026-04-05 10:54:11 +01:00
Peter Steinberger
84fb62170a docs: clarify anthropic cli fallback guidance 2026-04-05 10:06:32 +01:00
Peter Steinberger
19de5d1b56 refactor: move provider discovery config into plugins 2026-04-05 09:55:55 +01:00
Peter Steinberger
d655a8bc76 feat: add Fireworks provider and simplify plugin setup loading 2026-04-05 07:43:14 +01:00
Peter Steinberger
37301cbc3b docs: clarify anthropic extra usage billing 2026-04-05 07:14:35 +09:00
Peter Steinberger
eee868452f docs: refresh claude-cli model ref mirrors 2026-04-04 22:19:07 +01:00
Peter Steinberger
6de100d4e2 docs: refresh claude-cli naming mirrors 2026-04-04 22:11:45 +01:00
Peter Steinberger
8ea5b1ddc0 docs: refresh anthropic token compatibility mirrors 2026-04-04 22:09:21 +01:00
Peter Steinberger
5c5c82dfaa docs: refresh anthropic oauth defaults refs 2026-04-04 22:01:16 +01:00