docs(concepts): compress model-providers reference — move SDK hook list to provider plugins doc, tabulate bundled providers
@@ -3,7 +3,7 @@ summary: "Model provider overview with example configs + CLI flows"
read_when:
- You need a provider-by-provider model setup reference
- You want example configs or CLI onboarding commands for model providers
title: "Model Providers"
title: "Model providers"
---

# Model providers
@@ -14,218 +14,21 @@ For model selection rules, see [/concepts/models](/concepts/models).

## Quick rules

- Model refs use `provider/model` (example: `opencode/claude-opus-4-6`).
- If you set `agents.defaults.models`, it becomes the allowlist.
- `agents.defaults.models` acts as an allowlist when set (see the sketch after this list).
- CLI helpers: `openclaw onboard`, `openclaw models list`, `openclaw models set <provider/model>`.
- Fallback runtime rules, cooldown probes, and session-override persistence are documented in [/concepts/model-failover](/concepts/model-failover).
- `models.providers.*.models[].contextWindow` is native model metadata; `models.providers.*.models[].contextTokens` is the effective runtime cap.
- Provider plugins can inject model catalogs via `registerProvider({ catalog })`; OpenClaw merges that output into `models.providers` before writing `models.json`.
- Provider manifests can declare `providerAuthEnvVars` and `providerAuthAliases` so generic env-based auth probes and provider variants do not need to load plugin runtime. The remaining core env-var map is now just for non-plugin/core providers and a few generic-precedence cases such as Anthropic API-key-first onboarding.
- Provider plugins can also own provider runtime behavior via `normalizeModelId`, `normalizeTransport`, `normalizeConfig`, `applyNativeStreamingUsageCompat`, `resolveConfigApiKey`, `resolveSyntheticAuth`, `shouldDeferSyntheticProfileAuth`, `resolveDynamicModel`, `prepareDynamicModel`, `normalizeResolvedModel`, `contributeResolvedModelCompat`, `capabilities`, `normalizeToolSchemas`, `inspectToolSchemas`, `resolveReasoningOutputMode`, `prepareExtraParams`, `createStreamFn`, `wrapStreamFn`, `resolveTransportTurnState`, `resolveWebSocketSessionPolicy`, `createEmbeddingProvider`, `formatApiKey`, `refreshOAuth`, `buildAuthDoctorHint`, `matchesContextOverflowError`, `classifyFailoverReason`, `isCacheTtlEligible`, `buildMissingAuthMessage`, `suppressBuiltInModel`, `augmentModelCatalog`, `resolveThinkingProfile`, `isBinaryThinking`, `supportsXHighThinking`, `resolveDefaultThinkingLevel`, `applyConfigDefaults`, `isModernModelRef`, `prepareRuntimeAuth`, `resolveUsageAuth`, `fetchUsageSnapshot`, and `onModelSelected`.
- Note: provider runtime `capabilities` is shared runner metadata (provider family, transcript/tooling quirks, transport/cache hints). It is not the same as the [public capability model](/plugins/architecture#public-capability-model) which describes what a plugin registers (text inference, speech, etc.).
- The bundled `codex` provider is paired with the bundled Codex agent harness. Use `codex/gpt-*` when you want Codex-owned login, model discovery, native thread resume, and app-server execution. Plain `openai/gpt-*` refs continue to use the OpenAI provider and the normal OpenClaw provider transport. Codex-only deployments can disable automatic PI fallback with `agents.defaults.embeddedHarness.fallback: "none"`; see [Codex Harness](/plugins/codex-harness).
- `models.providers.*.models[].contextWindow` is native model metadata; `contextTokens` is the effective runtime cap.
- Fallback rules, cooldown probes, and session-override persistence: [Model failover](/concepts/model-failover).
- Bundled `codex` is paired with the Codex agent harness — use `codex/gpt-*` for Codex-owned login, discovery, native thread resume, and app-server execution. Plain `openai/gpt-*` uses the OpenAI provider and normal transport. Disable automatic PI fallback for Codex-only deployments via `agents.defaults.embeddedHarness.fallback: "none"` — see [Codex harness](/plugins/codex-harness).
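To make the first few rules concrete, here is a minimal config sketch. The `agents.defaults.models` path and both model refs come from this page; the file layout around them is an assumption, not a documented schema:

```jsonc
{
  "agents": {
    "defaults": {
      // When this map is set, it is the allowlist: only these refs resolve.
      "models": {
        "opencode/claude-opus-4-6": {}, // refs are always provider/model
        "openrouter/auto": {}           // per-ref options omitted in this sketch
      }
    }
  }
}
```

The CLI helpers above (`openclaw models list`, `openclaw models set <provider/model>`) would then operate against these refs.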
## Plugin-owned provider behavior

Provider plugins can now own most provider-specific logic while OpenClaw keeps the generic inference loop.
Most provider-specific logic lives in provider plugins (`registerProvider(...)`) while OpenClaw keeps the generic inference loop. Plugins own onboarding, model catalogs, auth env-var mapping, transport/config normalization, tool-schema cleanup, failover classification, OAuth refresh, usage reporting, thinking/reasoning profiles, and more.

Typical split:
The full list of provider-SDK hooks and bundled-plugin examples lives in [Provider plugins](/plugins/sdk-provider-plugins). A provider that needs a totally custom request executor is a separate, deeper extension surface.
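Several hooks below (`normalizeTransport`, `normalizeConfig`, `resolveConfigApiKey`) operate on a `models.providers.<id>` entry. A rough sketch of such an entry follows; only the `api` / `baseUrl` key names come from the hook descriptions, while the provider id and all values are hypothetical:

```jsonc
{
  "models": {
    "providers": {
      "my-proxy": {
        // Transport-family keys that normalizeTransport/normalizeConfig may rewrite.
        "api": "openai-compat",                     // value is a guess, not a documented enum
        "baseUrl": "https://proxy.example.com/v1",  // hypothetical endpoint
        "models": []                                // filled by catalog/discovery
      }
    }
  }
}
```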
- `auth[].run` / `auth[].runNonInteractive`: provider owns onboarding/login flows for `openclaw onboard`, `openclaw models auth`, and headless setup
- `wizard.setup` / `wizard.modelPicker`: provider owns auth-choice labels, legacy aliases, onboarding allowlist hints, and setup entries in onboarding/model pickers
- `catalog`: provider appears in `models.providers` (catalog shape sketched after this list)
- `normalizeModelId`: provider normalizes legacy/preview model ids before lookup or canonicalization
- `normalizeTransport`: provider normalizes transport-family `api` / `baseUrl` before generic model assembly; OpenClaw checks the matched provider first, then other hook-capable provider plugins until one actually changes the transport
- `normalizeConfig`: provider normalizes `models.providers.<id>` config before runtime uses it; OpenClaw checks the matched provider first, then other hook-capable provider plugins until one actually changes the config. If no provider hook rewrites the config, bundled Google-family helpers still normalize supported Google provider entries.
- `applyNativeStreamingUsageCompat`: provider applies endpoint-driven native streaming-usage compat rewrites for config providers
- `resolveConfigApiKey`: provider resolves env-marker auth for config providers without forcing full runtime auth loading. `amazon-bedrock` also has a built-in AWS env-marker resolver here, even though Bedrock runtime auth uses the AWS SDK default chain.
- `resolveSyntheticAuth`: provider can expose local/self-hosted or other config-backed auth availability without persisting plaintext secrets
- `shouldDeferSyntheticProfileAuth`: provider can mark stored synthetic profile placeholders as lower precedence than env/config-backed auth
- `resolveDynamicModel`: provider accepts model ids not yet present in the local static catalog
- `prepareDynamicModel`: provider needs a metadata refresh before retrying dynamic resolution
- `normalizeResolvedModel`: provider needs transport or base URL rewrites
- `contributeResolvedModelCompat`: provider contributes compat flags for its vendor models even when they arrive through another compatible transport
- `capabilities`: provider publishes transcript/tooling/provider-family quirks
- `normalizeToolSchemas`: provider cleans tool schemas before the embedded runner sees them
- `inspectToolSchemas`: provider surfaces transport-specific schema warnings after normalization
- `resolveReasoningOutputMode`: provider chooses native vs tagged reasoning-output contracts
- `prepareExtraParams`: provider defaults or normalizes per-model request params
- `createStreamFn`: provider replaces the normal stream path with a fully custom transport
- `wrapStreamFn`: provider applies request headers/body/model compat wrappers
- `resolveTransportTurnState`: provider supplies per-turn native transport headers or metadata
- `resolveWebSocketSessionPolicy`: provider supplies native WebSocket session headers or session cool-down policy
- `createEmbeddingProvider`: provider owns memory embedding behavior when it belongs with the provider plugin instead of the core embedding switchboard
- `formatApiKey`: provider formats stored auth profiles into the runtime `apiKey` string expected by the transport
- `refreshOAuth`: provider owns OAuth refresh when the shared `pi-ai` refreshers are not enough
- `buildAuthDoctorHint`: provider appends repair guidance when OAuth refresh fails
- `matchesContextOverflowError`: provider recognizes provider-specific context-window overflow errors that generic heuristics would miss
- `classifyFailoverReason`: provider maps provider-specific raw transport/API errors to failover reasons such as rate limit or overload
- `isCacheTtlEligible`: provider decides which upstream model ids support prompt-cache TTL
- `buildMissingAuthMessage`: provider replaces the generic auth-store error with a provider-specific recovery hint
- `suppressBuiltInModel`: provider hides stale upstream rows and can return a vendor-owned error for direct resolution failures
- `augmentModelCatalog`: provider appends synthetic/final catalog rows after discovery and config merging
- `resolveThinkingProfile`: provider owns the exact `/think` level set, optional display labels, and default level for a selected model
- `isBinaryThinking`: compatibility hook for binary on/off thinking UX
- `supportsXHighThinking`: compatibility hook for selected `xhigh` models
- `resolveDefaultThinkingLevel`: compatibility hook for default `/think` policy
- `applyConfigDefaults`: provider applies provider-specific global defaults during config materialization based on auth mode, env, or model family
- `isModernModelRef`: provider owns live/smoke preferred-model matching
- `prepareRuntimeAuth`: provider turns a configured credential into a short-lived runtime token
- `resolveUsageAuth`: provider resolves usage/quota credentials for `/usage` and related status/reporting surfaces
- `fetchUsageSnapshot`: provider owns the usage endpoint fetch/parsing while core still owns the summary shell and formatting
- `onModelSelected`: provider runs post-selection side effects such as telemetry or provider-owned session bookkeeping
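Several of these hooks (`catalog`, `augmentModelCatalog`, `suppressBuiltInModel`) feed catalog rows that end up under `models.providers`, as described in the quick rules. A sketch of one merged row, using the documented `contextWindow` / `contextTokens` distinction; the provider id, model id, and the exact `id` field name are illustrative assumptions:

```jsonc
{
  "models": {
    "providers": {
      "exampleco": {                  // hypothetical provider injected via registerProvider({ catalog })
        "models": [
          {
            "id": "example-large",    // assumed field name for the model id
            "contextWindow": 200000,  // native model metadata
            "contextTokens": 120000   // effective runtime cap actually enforced
          }
        ]
      }
    }
  }
}
```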
Current bundled examples:

- `anthropic`: Claude 4.6 forward-compat fallback, auth repair hints, usage endpoint fetching, cache-TTL/provider-family metadata, and auth-aware global config defaults
- `amazon-bedrock`: provider-owned context-overflow matching and failover reason classification for Bedrock-specific throttle/not-ready errors, plus the shared `anthropic-by-model` replay family for Claude-only replay-policy guards on Anthropic traffic
- `anthropic-vertex`: Claude-only replay-policy guards on Anthropic-message traffic
- `openrouter`: pass-through model ids, request wrappers, provider capability hints, Gemini thought-signature sanitation on proxy Gemini traffic, proxy reasoning injection through the `openrouter-thinking` stream family, routing metadata forwarding, and cache-TTL policy
- `github-copilot`: onboarding/device login, forward-compat model fallback, Claude-thinking transcript hints, runtime token exchange, and usage endpoint fetching
- `openai`: GPT-5.4 forward-compat fallback, direct OpenAI transport normalization, Codex-aware missing-auth hints, Spark suppression, synthetic OpenAI/Codex catalog rows, thinking/live-model policy, usage-token alias normalization (`input` / `output` and `prompt` / `completion` families), the shared `openai-responses-defaults` stream family for native OpenAI/Codex wrappers, provider-family metadata, bundled image-generation provider registration for `gpt-image-2`, and bundled video-generation provider registration for `sora-2`
- `google` and `google-gemini-cli`: Gemini 3.1 forward-compat fallback, native Gemini replay validation, bootstrap replay sanitation, tagged reasoning-output mode, modern-model matching, bundled image-generation provider registration for Gemini image-preview models, and bundled video-generation provider registration for Veo models; Gemini CLI OAuth also owns auth-profile token formatting, usage-token parsing, and quota endpoint fetching for usage surfaces
- `moonshot`: shared transport, plugin-owned thinking payload normalization
- `kilocode`: shared transport, plugin-owned request headers, reasoning payload normalization, proxy-Gemini thought-signature sanitation, and cache-TTL policy
- `zai`: GLM-5 forward-compat fallback, `tool_stream` defaults, cache-TTL policy, binary-thinking/live-model policy, and usage auth + quota fetching; unknown `glm-5*` ids synthesize from the bundled `glm-4.7` template
- `xai`: native Responses transport normalization, `/fast` alias rewrites for Grok fast variants, default `tool_stream`, xAI-specific tool-schema / reasoning-payload cleanup, and bundled video-generation provider registration for `grok-imagine-video`
- `mistral`: plugin-owned capability metadata
- `opencode` and `opencode-go`: plugin-owned capability metadata plus proxy-Gemini thought-signature sanitation
- `alibaba`: plugin-owned video-generation catalog for direct Wan model refs such as `alibaba/wan2.6-t2v`
- `byteplus`: plugin-owned catalogs plus bundled video-generation provider registration for Seedance text-to-video/image-to-video models
- `fal`: bundled image-generation provider registration for FLUX image models plus bundled video-generation provider registration for hosted third-party video models
- `cloudflare-ai-gateway`, `huggingface`, `kimi`, `nvidia`, `qianfan`, `stepfun`, `synthetic`, `venice`, `vercel-ai-gateway`, and `volcengine`: plugin-owned catalogs only
- `qwen`: plugin-owned catalogs for text models plus shared media-understanding and video-generation provider registrations for its multimodal surfaces; Qwen video generation uses the Standard DashScope video endpoints with bundled Wan models such as `wan2.6-t2v` and `wan2.7-r2v`
- `runway`: plugin-owned video-generation provider registration for native Runway task-based models such as `gen4.5`
- `minimax`: plugin-owned catalogs, bundled video-generation provider registration for Hailuo video models, bundled image-generation provider registration for `image-01`, hybrid Anthropic/OpenAI replay-policy selection, and usage auth/snapshot logic
- `together`: plugin-owned catalogs plus bundled video-generation provider registration for Wan video models
- `xiaomi`: plugin-owned catalogs plus usage auth/snapshot logic

The bundled `openai` plugin now owns both provider ids: `openai` and `openai-codex`.

That covers providers that still fit OpenClaw's normal transports. A provider that needs a totally custom request executor is a separate, deeper extension surface.

<Note>
Provider runtime `capabilities` is shared runner metadata (provider family, transcript/tooling quirks, transport/cache hints). It is not the same as the [public capability model](/plugins/architecture#public-capability-model), which describes what a plugin registers (text inference, speech, etc.).
</Note>
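Pulling the `codex` notes from the quick rules into one place, here is a Codex-only deployment sketch. The `agents.defaults.embeddedHarness.fallback: "none"` path and the `codex/gpt-*` ref shape are from this page; the concrete model id and surrounding nesting are assumptions:

```jsonc
{
  "agents": {
    "defaults": {
      "models": {
        "codex/gpt-5.4": {} // hypothetical codex/gpt-* ref; Codex owns login, discovery, thread resume
      },
      "embeddedHarness": {
        "fallback": "none"  // disable automatic PI fallback for Codex-only setups
      }
    }
  }
}
```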
## API key rotation

@@ -411,68 +214,38 @@ See [/providers/kilocode](/providers/kilocode) for setup details.

### Other bundled provider plugins
- OpenRouter: `openrouter` (`OPENROUTER_API_KEY`)
  - Example models: `openrouter/auto`, `openrouter/moonshotai/kimi-k2.6`
  - OpenClaw applies OpenRouter's documented app-attribution headers only when the request actually targets `openrouter.ai`
  - OpenRouter-specific Anthropic `cache_control` markers are likewise gated to verified OpenRouter routes, not arbitrary proxy URLs
  - OpenRouter remains on the proxy-style OpenAI-compatible path, so native OpenAI-only request shaping (`serviceTier`, Responses `store`, prompt-cache hints, OpenAI reasoning-compat payloads) is not forwarded
  - Gemini-backed OpenRouter refs keep proxy-Gemini thought-signature sanitation only; native Gemini replay validation and bootstrap rewrites stay off
- Kilo Gateway: `kilocode` (`KILOCODE_API_KEY`)
  - Example model: `kilocode/kilo/auto`
  - Gemini-backed Kilo refs keep the same proxy-Gemini thought-signature sanitation path; `kilocode/kilo/auto` and other proxy-reasoning-unsupported hints skip proxy reasoning injection
- MiniMax: `minimax` (API key) and `minimax-portal` (OAuth)
  - Auth: `MINIMAX_API_KEY` for `minimax`; `MINIMAX_OAUTH_TOKEN` or `MINIMAX_API_KEY` for `minimax-portal`
  - Example model: `minimax/MiniMax-M2.7` or `minimax-portal/MiniMax-M2.7`
  - MiniMax onboarding/API-key setup writes explicit M2.7 model definitions with `input: ["text", "image"]`; the bundled provider catalog keeps the chat refs text-only until that provider config is materialized (see the sketch after this list)
- Moonshot: `moonshot` (`MOONSHOT_API_KEY`)
  - Example model: `moonshot/kimi-k2.6`
- Kimi Coding: `kimi` (`KIMI_API_KEY` or `KIMICODE_API_KEY`)
  - Example model: `kimi/kimi-code`
- Qianfan: `qianfan` (`QIANFAN_API_KEY`)
  - Example model: `qianfan/deepseek-v3.2`
- Qwen Cloud: `qwen` (`QWEN_API_KEY`, `MODELSTUDIO_API_KEY`, or `DASHSCOPE_API_KEY`)
  - Example model: `qwen/qwen3.5-plus`
- NVIDIA: `nvidia` (`NVIDIA_API_KEY`)
  - Example model: `nvidia/nvidia/llama-3.1-nemotron-70b-instruct`
- StepFun: `stepfun` / `stepfun-plan` (`STEPFUN_API_KEY`)
  - Example models: `stepfun/step-3.5-flash`, `stepfun-plan/step-3.5-flash-2603`
- Together: `together` (`TOGETHER_API_KEY`)
  - Example model: `together/moonshotai/Kimi-K2.5`
- Venice: `venice` (`VENICE_API_KEY`)
- Xiaomi: `xiaomi` (`XIAOMI_API_KEY`)
  - Example model: `xiaomi/mimo-v2-flash`
- Vercel AI Gateway: `vercel-ai-gateway` (`AI_GATEWAY_API_KEY`)
- Hugging Face Inference: `huggingface` (`HUGGINGFACE_HUB_TOKEN` or `HF_TOKEN`)
- Cloudflare AI Gateway: `cloudflare-ai-gateway` (`CLOUDFLARE_AI_GATEWAY_API_KEY`)
- Volcengine: `volcengine` (`VOLCANO_ENGINE_API_KEY`)
  - Example model: `volcengine-plan/ark-code-latest`
- BytePlus: `byteplus` (`BYTEPLUS_API_KEY`)
  - Example model: `byteplus-plan/ark-code-latest`
- xAI: `xai` (`XAI_API_KEY`)
  - Native bundled xAI requests use the xAI Responses path
  - `/fast` or `params.fastMode: true` rewrites `grok-3`, `grok-3-mini`, `grok-4`, and `grok-4-0709` to their `*-fast` variants
  - `tool_stream` defaults on; set `agents.defaults.models["xai/<model>"].params.tool_stream` to `false` to disable it
- Mistral: `mistral` (`MISTRAL_API_KEY`)
  - Example model: `mistral/mistral-large-latest`
  - CLI: `openclaw onboard --auth-choice mistral-api-key`
- Groq: `groq` (`GROQ_API_KEY`)
- Cerebras: `cerebras` (`CEREBRAS_API_KEY`)
  - GLM models on Cerebras use ids `zai-glm-4.7` and `zai-glm-4.6`.
  - OpenAI-compatible base URL: `https://api.cerebras.ai/v1`.
- GitHub Copilot: `github-copilot` (`COPILOT_GITHUB_TOKEN` / `GH_TOKEN` / `GITHUB_TOKEN`)
- Hugging Face Inference example model: `huggingface/deepseek-ai/DeepSeek-R1`; CLI: `openclaw onboard --auth-choice huggingface-api-key`. See [Hugging Face (Inference)](/providers/huggingface).
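As referenced in the MiniMax bullet above, here is a sketch of the explicit M2.7 model definition that onboarding might write. The `input` list is quoted from this page; the `id` field name and the surrounding nesting are assumptions:

```jsonc
{
  "models": {
    "providers": {
      "minimax": {
        "models": [
          {
            "id": "MiniMax-M2.7",      // assumed field name; ref becomes minimax/MiniMax-M2.7
            "input": ["text", "image"] // multimodal input, as written by API-key onboarding
          }
        ]
      }
    }
  }
}
```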
| Provider | Id | Auth env | Example model |
| --- | --- | --- | --- |
| BytePlus | `byteplus` / `byteplus-plan` | `BYTEPLUS_API_KEY` | `byteplus-plan/ark-code-latest` |
| Cerebras | `cerebras` | `CEREBRAS_API_KEY` | `cerebras/zai-glm-4.7` |
| Cloudflare AI Gateway | `cloudflare-ai-gateway` | `CLOUDFLARE_AI_GATEWAY_API_KEY` | — |
| GitHub Copilot | `github-copilot` | `COPILOT_GITHUB_TOKEN` / `GH_TOKEN` / `GITHUB_TOKEN` | — |
| Groq | `groq` | `GROQ_API_KEY` | — |
| Hugging Face Inference | `huggingface` | `HUGGINGFACE_HUB_TOKEN` or `HF_TOKEN` | `huggingface/deepseek-ai/DeepSeek-R1` |
| Kilo Gateway | `kilocode` | `KILOCODE_API_KEY` | `kilocode/kilo/auto` |
| Kimi Coding | `kimi` | `KIMI_API_KEY` or `KIMICODE_API_KEY` | `kimi/kimi-code` |
| MiniMax | `minimax` / `minimax-portal` | `MINIMAX_API_KEY` / `MINIMAX_OAUTH_TOKEN` | `minimax/MiniMax-M2.7` |
| Mistral | `mistral` | `MISTRAL_API_KEY` | `mistral/mistral-large-latest` |
| Moonshot | `moonshot` | `MOONSHOT_API_KEY` | `moonshot/kimi-k2.6` |
| NVIDIA | `nvidia` | `NVIDIA_API_KEY` | `nvidia/nvidia/llama-3.1-nemotron-70b-instruct` |
| OpenRouter | `openrouter` | `OPENROUTER_API_KEY` | `openrouter/auto` |
| Qianfan | `qianfan` | `QIANFAN_API_KEY` | `qianfan/deepseek-v3.2` |
| Qwen Cloud | `qwen` | `QWEN_API_KEY` / `MODELSTUDIO_API_KEY` / `DASHSCOPE_API_KEY` | `qwen/qwen3.5-plus` |
| StepFun | `stepfun` / `stepfun-plan` | `STEPFUN_API_KEY` | `stepfun/step-3.5-flash` |
| Together | `together` | `TOGETHER_API_KEY` | `together/moonshotai/Kimi-K2.5` |
| Venice | `venice` | `VENICE_API_KEY` | — |
| Vercel AI Gateway | `vercel-ai-gateway` | `AI_GATEWAY_API_KEY` | `vercel-ai-gateway/anthropic/claude-opus-4.6` |
| Volcano Engine (Doubao) | `volcengine` / `volcengine-plan` | `VOLCANO_ENGINE_API_KEY` | `volcengine-plan/ark-code-latest` |
| xAI | `xai` | `XAI_API_KEY` | `xai/grok-4` |
| Xiaomi | `xiaomi` | `XIAOMI_API_KEY` | `xiaomi/mimo-v2-flash` |
Quirks worth knowing:

- **OpenRouter** applies its app-attribution headers and Anthropic `cache_control` markers only on verified `openrouter.ai` routes. As a proxy-style OpenAI-compatible path, it skips native-OpenAI-only shaping (`serviceTier`, Responses `store`, prompt-cache hints, OpenAI reasoning-compat). Gemini-backed refs keep proxy-Gemini thought-signature sanitation only.
- **Kilo Gateway** Gemini-backed refs follow the same proxy-Gemini sanitation path; `kilocode/kilo/auto` and other proxy-reasoning-unsupported refs skip proxy reasoning injection.
- **MiniMax** API-key onboarding writes explicit M2.7 model definitions with `input: ["text", "image"]`; the bundled catalog keeps chat refs text-only until that config is materialized.
- **xAI** uses the xAI Responses path. `/fast` or `params.fastMode: true` rewrites `grok-3`, `grok-3-mini`, `grok-4`, and `grok-4-0709` to their `*-fast` variants. `tool_stream` defaults on; disable via `agents.defaults.models["xai/<model>"].params.tool_stream=false` (sketch below).
- **Cerebras** GLM models use `zai-glm-4.7` / `zai-glm-4.6`; OpenAI-compatible base URL is `https://api.cerebras.ai/v1`.
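The xAI quirk above names the full override path. In config form (the path is quoted from this page; `xai/grok-4` is simply the table's example ref):

```jsonc
{
  "agents": {
    "defaults": {
      "models": {
        "xai/grok-4": {
          "params": {
            "tool_stream": false // defaults to on for xai; this turns it off
            // "fastMode": true  // would rewrite grok-4 to its *-fast variant
          }
        }
      }
    }
  }
}
```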
## Providers via `models.providers` (custom/base URL)