--- summary: "Model provider overview with example configs + CLI flows" read_when: - You need a provider-by-provider model setup reference - You want example configs or CLI onboarding commands for model providers title: "Model Providers" --- # Model providers This page covers **LLM/model providers** (not chat channels like WhatsApp/Telegram). For model selection rules, see [/concepts/models](/concepts/models). ## Quick rules - Model refs use `provider/model` (example: `opencode/claude-opus-4-6`). - If you set `agents.defaults.models`, it becomes the allowlist. - CLI helpers: `openclaw onboard`, `openclaw models list`, `openclaw models set `. - Fallback runtime rules, cooldown probes, and session-override persistence are documented in [/concepts/model-failover](/concepts/model-failover). - `models.providers.*.models[].contextWindow` is native model metadata; `models.providers.*.models[].contextTokens` is the effective runtime cap. - Provider plugins can inject model catalogs via `registerProvider({ catalog })`; OpenClaw merges that output into `models.providers` before writing `models.json`. - Provider manifests can declare `providerAuthEnvVars` so generic env-based auth probes do not need to load plugin runtime. The remaining core env-var map is now just for non-plugin/core providers and a few generic-precedence cases such as Anthropic API-key-first onboarding. - Provider plugins can also own provider runtime behavior via `normalizeConfig`, `resolveDynamicModel`, `prepareDynamicModel`, `normalizeResolvedModel`, `capabilities`, `prepareExtraParams`, `wrapStreamFn`, `formatApiKey`, `refreshOAuth`, `buildAuthDoctorHint`, `matchesContextOverflowError`, `classifyFailoverReason`, `isCacheTtlEligible`, `buildMissingAuthMessage`, `suppressBuiltInModel`, `augmentModelCatalog`, `isBinaryThinking`, `supportsXHighThinking`, `resolveDefaultThinkingLevel`, `applyConfigDefaults`, `isModernModelRef`, `prepareRuntimeAuth`, `resolveUsageAuth`, and `fetchUsageSnapshot`. - Note: provider runtime `capabilities` is shared runner metadata (provider family, transcript/tooling quirks, transport/cache hints). It is not the same as the [public capability model](/plugins/architecture#public-capability-model) which describes what a plugin registers (text inference, speech, etc.). ## Plugin-owned provider behavior Provider plugins can now own most provider-specific logic while OpenClaw keeps the generic inference loop. Typical split: - `auth[].run` / `auth[].runNonInteractive`: provider owns onboarding/login flows for `openclaw onboard`, `openclaw models auth`, and headless setup - `wizard.setup` / `wizard.modelPicker`: provider owns auth-choice labels, legacy aliases, onboarding allowlist hints, and setup entries in onboarding/model pickers - `catalog`: provider appears in `models.providers` - `normalizeConfig`: provider normalizes `models.providers.` config before runtime uses it - `resolveDynamicModel`: provider accepts model ids not present in the local static catalog yet - `prepareDynamicModel`: provider needs a metadata refresh before retrying dynamic resolution - `normalizeResolvedModel`: provider needs transport or base URL rewrites - `capabilities`: provider publishes transcript/tooling/provider-family quirks - `prepareExtraParams`: provider defaults or normalizes per-model request params - `wrapStreamFn`: provider applies request headers/body/model compat wrappers - `formatApiKey`: provider formats stored auth profiles into the runtime `apiKey` string expected by the transport - `refreshOAuth`: provider owns OAuth refresh when the shared `pi-ai` refreshers are not enough - `buildAuthDoctorHint`: provider appends repair guidance when OAuth refresh fails - `matchesContextOverflowError`: provider recognizes provider-specific context-window overflow errors that generic heuristics would miss - `classifyFailoverReason`: provider maps provider-specific raw transport/API errors to failover reasons such as rate limit or overload - `isCacheTtlEligible`: provider decides which upstream model ids support prompt-cache TTL - `buildMissingAuthMessage`: provider replaces the generic auth-store error with a provider-specific recovery hint - `suppressBuiltInModel`: provider hides stale upstream rows and can return a vendor-owned error for direct resolution failures - `augmentModelCatalog`: provider appends synthetic/final catalog rows after discovery and config merging - `isBinaryThinking`: provider owns binary on/off thinking UX - `supportsXHighThinking`: provider opts selected models into `xhigh` - `resolveDefaultThinkingLevel`: provider owns default `/think` policy for a model family - `applyConfigDefaults`: provider applies provider-specific global defaults during config materialization based on auth mode, env, or model family - `isModernModelRef`: provider owns live/smoke preferred-model matching - `prepareRuntimeAuth`: provider turns a configured credential into a short lived runtime token - `resolveUsageAuth`: provider resolves usage/quota credentials for `/usage` and related status/reporting surfaces - `fetchUsageSnapshot`: provider owns the usage endpoint fetch/parsing while core still owns the summary shell and formatting Current bundled examples: - `anthropic`: Claude 4.6 forward-compat fallback, auth repair hints, usage endpoint fetching, cache-TTL/provider-family metadata, and auth-aware global config defaults - `amazon-bedrock`: provider-owned context-overflow matching and failover reason classification for Bedrock-specific throttle/not-ready errors - `openrouter`: pass-through model ids, request wrappers, provider capability hints, and cache-TTL policy - `github-copilot`: onboarding/device login, forward-compat model fallback, Claude-thinking transcript hints, runtime token exchange, and usage endpoint fetching - `openai`: GPT-5.4 forward-compat fallback, direct OpenAI transport normalization, Codex-aware missing-auth hints, Spark suppression, synthetic OpenAI/Codex catalog rows, thinking/live-model policy, and provider-family metadata - `google` and `google-gemini-cli`: Gemini 3.1 forward-compat fallback and modern-model matching; Gemini CLI OAuth also owns auth-profile token formatting, usage-token parsing, and quota endpoint fetching for usage surfaces - `moonshot`: shared transport, plugin-owned thinking payload normalization - `kilocode`: shared transport, plugin-owned request headers, reasoning payload normalization, Gemini transcript hints, and cache-TTL policy - `zai`: GLM-5 forward-compat fallback, `tool_stream` defaults, cache-TTL policy, binary-thinking/live-model policy, and usage auth + quota fetching - `mistral`, `opencode`, and `opencode-go`: plugin-owned capability metadata - `byteplus`, `cloudflare-ai-gateway`, `huggingface`, `kimi`, `modelstudio`, `nvidia`, `qianfan`, `stepfun`, `synthetic`, `together`, `venice`, `vercel-ai-gateway`, and `volcengine`: plugin-owned catalogs only - `minimax` and `xiaomi`: plugin-owned catalogs plus usage auth/snapshot logic The bundled `openai` plugin now owns both provider ids: `openai` and `openai-codex`. That covers providers that still fit OpenClaw's normal transports. A provider that needs a totally custom request executor is a separate, deeper extension surface. ## API key rotation - Supports generic provider rotation for selected providers. - Configure multiple keys via: - `OPENCLAW_LIVE__KEY` (single live override, highest priority) - `_API_KEYS` (comma or semicolon list) - `_API_KEY` (primary key) - `_API_KEY_*` (numbered list, e.g. `_API_KEY_1`) - For Google providers, `GOOGLE_API_KEY` is also included as fallback. - Key selection order preserves priority and deduplicates values. - Requests are retried with the next key only on rate-limit responses (for example `429`, `rate_limit`, `quota`, `resource exhausted`). - Non-rate-limit failures fail immediately; no key rotation is attempted. - When all candidate keys fail, the final error is returned from the last attempt. ## Built-in providers (pi-ai catalog) OpenClaw ships with the pi‑ai catalog. These providers require **no** `models.providers` config; just set auth + pick a model. ### OpenAI - Provider: `openai` - Auth: `OPENAI_API_KEY` - Optional rotation: `OPENAI_API_KEYS`, `OPENAI_API_KEY_1`, `OPENAI_API_KEY_2`, plus `OPENCLAW_LIVE_OPENAI_KEY` (single override) - Example models: `openai/gpt-5.4`, `openai/gpt-5.4-pro` - CLI: `openclaw onboard --auth-choice openai-api-key` - Default transport is `auto` (WebSocket-first, SSE fallback) - Override per model via `agents.defaults.models["openai/"].params.transport` (`"sse"`, `"websocket"`, or `"auto"`) - OpenAI Responses WebSocket warm-up defaults to enabled via `params.openaiWsWarmup` (`true`/`false`) - OpenAI priority processing can be enabled via `agents.defaults.models["openai/"].params.serviceTier` - `/fast` and `params.fastMode` map direct `openai/*` Responses requests to `service_tier=priority` on `api.openai.com` - Use `params.serviceTier` when you want an explicit tier instead of the shared `/fast` toggle - Hidden OpenClaw attribution headers (`originator`, `version`, `User-Agent`) apply only on native OpenAI traffic to `api.openai.com`, not generic OpenAI-compatible proxies - Native OpenAI routes also keep Responses `store`, prompt-cache hints, and OpenAI reasoning-compat payload shaping; proxy routes do not - `openai/gpt-5.3-codex-spark` is intentionally suppressed in OpenClaw because the live OpenAI API rejects it; Spark is treated as Codex-only ```json5 { agents: { defaults: { model: { primary: "openai/gpt-5.4" } } }, } ``` ### Anthropic - Provider: `anthropic` - Auth: `ANTHROPIC_API_KEY` - Optional rotation: `ANTHROPIC_API_KEYS`, `ANTHROPIC_API_KEY_1`, `ANTHROPIC_API_KEY_2`, plus `OPENCLAW_LIVE_ANTHROPIC_KEY` (single override) - Example model: `anthropic/claude-opus-4-6` - CLI: `openclaw onboard --auth-choice apiKey` or `openclaw onboard --auth-choice anthropic-cli` - Direct public Anthropic requests support the shared `/fast` toggle and `params.fastMode`, including API-key and OAuth-authenticated traffic sent to `api.anthropic.com`; OpenClaw maps that to Anthropic `service_tier` (`auto` vs `standard_only`) - Billing note: Anthropic changed third-party harness billing on **April 4, 2026 at 12:00 PM PT / 8:00 PM BST**. Anthropic says Claude subscription limits no longer cover OpenClaw, and Claude CLI traffic now requires **Extra Usage** billed separately from the subscription. - Existing legacy Anthropic token profiles still run if already configured, but new setup is no longer offered through onboarding or auth commands. ```json5 { agents: { defaults: { model: { primary: "anthropic/claude-opus-4-6" } } }, } ``` ### OpenAI Code (Codex) - Provider: `openai-codex` - Auth: OAuth (ChatGPT) - Example model: `openai-codex/gpt-5.4` - CLI: `openclaw onboard --auth-choice openai-codex` or `openclaw models auth login --provider openai-codex` - Default transport is `auto` (WebSocket-first, SSE fallback) - Override per model via `agents.defaults.models["openai-codex/"].params.transport` (`"sse"`, `"websocket"`, or `"auto"`) - `params.serviceTier` is also forwarded on native Codex Responses requests (`chatgpt.com/backend-api`) - Hidden OpenClaw attribution headers (`originator`, `version`, `User-Agent`) are only attached on native Codex traffic to `chatgpt.com/backend-api`, not generic OpenAI-compatible proxies - Shares the same `/fast` toggle and `params.fastMode` config as direct `openai/*`; OpenClaw maps that to `service_tier=priority` - `openai-codex/gpt-5.3-codex-spark` remains available when the Codex OAuth catalog exposes it; entitlement-dependent - `openai-codex/gpt-5.4` keeps native `contextWindow = 1050000` and a default runtime `contextTokens = 272000`; override the runtime cap with `models.providers.openai-codex.models[].contextTokens` - Policy note: OpenAI Codex OAuth is explicitly supported for external tools/workflows like OpenClaw. ```json5 { agents: { defaults: { model: { primary: "openai-codex/gpt-5.4" } } }, } ``` ```json5 { models: { providers: { "openai-codex": { models: [{ id: "gpt-5.4", contextTokens: 160000 }], }, }, }, } ``` ### Other subscription-style hosted options - [Qwen / Model Studio](/providers/qwen_modelstudio): Alibaba Cloud Standard pay-as-you-go and Coding Plan subscription endpoints - [MiniMax](/providers/minimax): MiniMax Coding Plan OAuth or API key access - [GLM Models](/providers/glm): Z.AI Coding Plan or general API endpoints ### OpenCode - Auth: `OPENCODE_API_KEY` (or `OPENCODE_ZEN_API_KEY`) - Zen runtime provider: `opencode` - Go runtime provider: `opencode-go` - Example models: `opencode/claude-opus-4-6`, `opencode-go/kimi-k2.5` - CLI: `openclaw onboard --auth-choice opencode-zen` or `openclaw onboard --auth-choice opencode-go` ```json5 { agents: { defaults: { model: { primary: "opencode/claude-opus-4-6" } } }, } ``` ### Google Gemini (API key) - Provider: `google` - Auth: `GEMINI_API_KEY` - Optional rotation: `GEMINI_API_KEYS`, `GEMINI_API_KEY_1`, `GEMINI_API_KEY_2`, `GOOGLE_API_KEY` fallback, and `OPENCLAW_LIVE_GEMINI_KEY` (single override) - Example models: `google/gemini-3.1-pro-preview`, `google/gemini-3-flash-preview` - Compatibility: legacy OpenClaw config using `google/gemini-3.1-flash-preview` is normalized to `google/gemini-3-flash-preview` - CLI: `openclaw onboard --auth-choice gemini-api-key` ### Google Vertex and Gemini CLI - Providers: `google-vertex`, `google-gemini-cli` - Auth: Vertex uses gcloud ADC; Gemini CLI uses its OAuth flow - Caution: Gemini CLI OAuth in OpenClaw is an unofficial integration. Some users have reported Google account restrictions after using third-party clients. Review Google terms and use a non-critical account if you choose to proceed. - Gemini CLI OAuth is shipped as part of the bundled `google` plugin. - Enable: `openclaw plugins enable google` - Login: `openclaw models auth login --provider google-gemini-cli --set-default` - Note: you do **not** paste a client id or secret into `openclaw.json`. The CLI login flow stores tokens in auth profiles on the gateway host. ### Z.AI (GLM) - Provider: `zai` - Auth: `ZAI_API_KEY` - Example model: `zai/glm-5` - CLI: `openclaw onboard --auth-choice zai-api-key` - Aliases: `z.ai/*` and `z-ai/*` normalize to `zai/*` - `zai-api-key` auto-detects the matching Z.AI endpoint; `zai-coding-global`, `zai-coding-cn`, `zai-global`, and `zai-cn` force a specific surface ### Vercel AI Gateway - Provider: `vercel-ai-gateway` - Auth: `AI_GATEWAY_API_KEY` - Example model: `vercel-ai-gateway/anthropic/claude-opus-4.6` - CLI: `openclaw onboard --auth-choice ai-gateway-api-key` ### Kilo Gateway - Provider: `kilocode` - Auth: `KILOCODE_API_KEY` - Example model: `kilocode/kilo/auto` - CLI: `openclaw onboard --auth-choice kilocode-api-key` - Base URL: `https://api.kilo.ai/api/gateway/` - Static fallback catalog ships `kilocode/kilo/auto`; live `https://api.kilo.ai/api/gateway/models` discovery can expand the runtime catalog further. - Exact upstream routing behind `kilocode/kilo/auto` is owned by Kilo Gateway, not hard-coded in OpenClaw. See [/providers/kilocode](/providers/kilocode) for setup details. ### Other bundled provider plugins - OpenRouter: `openrouter` (`OPENROUTER_API_KEY`) - Example model: `openrouter/auto` - OpenClaw applies OpenRouter's documented app-attribution headers only when the request actually targets `openrouter.ai` - OpenRouter remains on the proxy-style OpenAI-compatible path, so native OpenAI-only request shaping (`serviceTier`, Responses `store`, prompt-cache hints, OpenAI reasoning-compat payloads) is not forwarded - Kilo Gateway: `kilocode` (`KILOCODE_API_KEY`) - Example model: `kilocode/kilo/auto` - MiniMax: `minimax` (`MINIMAX_API_KEY`) - Example model: `minimax/MiniMax-M2.7` - MiniMax onboarding/API-key setup writes explicit M2.7 model definitions with `input: ["text", "image"]`; the bundled provider catalog keeps the chat refs text-only until that provider config is materialized - Moonshot: `moonshot` (`MOONSHOT_API_KEY`) - Example model: `moonshot/kimi-k2.5` - Kimi Coding: `kimi` (`KIMI_API_KEY` or `KIMICODE_API_KEY`) - Example model: `kimi/kimi-code` - Qianfan: `qianfan` (`QIANFAN_API_KEY`) - Example model: `qianfan/deepseek-v3.2` - Model Studio: `modelstudio` (`MODELSTUDIO_API_KEY`) - Example model: `modelstudio/qwen3.5-plus` - NVIDIA: `nvidia` (`NVIDIA_API_KEY`) - Example model: `nvidia/nvidia/llama-3.1-nemotron-70b-instruct` - StepFun: `stepfun` / `stepfun-plan` (`STEPFUN_API_KEY`) - Example models: `stepfun/step-3.5-flash`, `stepfun-plan/step-3.5-flash-2603` - Together: `together` (`TOGETHER_API_KEY`) - Example model: `together/moonshotai/Kimi-K2.5` - Venice: `venice` (`VENICE_API_KEY`) - Xiaomi: `xiaomi` (`XIAOMI_API_KEY`) - Example model: `xiaomi/mimo-v2-flash` - Vercel AI Gateway: `vercel-ai-gateway` (`AI_GATEWAY_API_KEY`) - Hugging Face Inference: `huggingface` (`HUGGINGFACE_HUB_TOKEN` or `HF_TOKEN`) - Cloudflare AI Gateway: `cloudflare-ai-gateway` (`CLOUDFLARE_AI_GATEWAY_API_KEY`) - Volcengine: `volcengine` (`VOLCANO_ENGINE_API_KEY`) - Example model: `volcengine-plan/ark-code-latest` - BytePlus: `byteplus` (`BYTEPLUS_API_KEY`) - Example model: `byteplus-plan/ark-code-latest` - xAI: `xai` (`XAI_API_KEY`) - Mistral: `mistral` (`MISTRAL_API_KEY`) - Example model: `mistral/mistral-large-latest` - CLI: `openclaw onboard --auth-choice mistral-api-key` - Groq: `groq` (`GROQ_API_KEY`) - Cerebras: `cerebras` (`CEREBRAS_API_KEY`) - GLM models on Cerebras use ids `zai-glm-4.7` and `zai-glm-4.6`. - OpenAI-compatible base URL: `https://api.cerebras.ai/v1`. - GitHub Copilot: `github-copilot` (`COPILOT_GITHUB_TOKEN` / `GH_TOKEN` / `GITHUB_TOKEN`) - Hugging Face Inference example model: `huggingface/deepseek-ai/DeepSeek-R1`; CLI: `openclaw onboard --auth-choice huggingface-api-key`. See [Hugging Face (Inference)](/providers/huggingface). ## Providers via `models.providers` (custom/base URL) Use `models.providers` (or `models.json`) to add **custom** providers or OpenAI/Anthropic‑compatible proxies. Many of the bundled provider plugins below already publish a default catalog. Use explicit `models.providers.` entries only when you want to override the default base URL, headers, or model list. ### Moonshot AI (Kimi) Moonshot ships as a bundled provider plugin. Use the built-in provider by default, and add an explicit `models.providers.moonshot` entry only when you need to override the base URL or model metadata: - Provider: `moonshot` - Auth: `MOONSHOT_API_KEY` - Example model: `moonshot/kimi-k2.5` - CLI: `openclaw onboard --auth-choice moonshot-api-key` or `openclaw onboard --auth-choice moonshot-api-key-cn` Kimi K2 model IDs: [//]: # "moonshot-kimi-k2-model-refs:start" - `moonshot/kimi-k2.5` - `moonshot/kimi-k2-thinking` - `moonshot/kimi-k2-thinking-turbo` - `moonshot/kimi-k2-turbo` [//]: # "moonshot-kimi-k2-model-refs:end" ```json5 { agents: { defaults: { model: { primary: "moonshot/kimi-k2.5" } }, }, models: { mode: "merge", providers: { moonshot: { baseUrl: "https://api.moonshot.ai/v1", apiKey: "${MOONSHOT_API_KEY}", api: "openai-completions", models: [{ id: "kimi-k2.5", name: "Kimi K2.5" }], }, }, }, } ``` ### Kimi Coding Kimi Coding uses Moonshot AI's Anthropic-compatible endpoint: - Provider: `kimi` - Auth: `KIMI_API_KEY` - Example model: `kimi/kimi-code` ```json5 { env: { KIMI_API_KEY: "sk-..." }, agents: { defaults: { model: { primary: "kimi/kimi-code" } }, }, } ``` Legacy `kimi/k2p5` remains accepted as a compatibility model id. ### Volcano Engine (Doubao) Volcano Engine (火山引擎) provides access to Doubao and other models in China. - Provider: `volcengine` (coding: `volcengine-plan`) - Auth: `VOLCANO_ENGINE_API_KEY` - Example model: `volcengine-plan/ark-code-latest` - CLI: `openclaw onboard --auth-choice volcengine-api-key` ```json5 { agents: { defaults: { model: { primary: "volcengine-plan/ark-code-latest" } }, }, } ``` Onboarding defaults to the coding surface, but the general `volcengine/*` catalog is registered at the same time. Available models: - `volcengine/doubao-seed-1-8-251228` (Doubao Seed 1.8) - `volcengine/doubao-seed-code-preview-251028` - `volcengine/kimi-k2-5-260127` (Kimi K2.5) - `volcengine/glm-4-7-251222` (GLM 4.7) - `volcengine/deepseek-v3-2-251201` (DeepSeek V3.2 128K) Coding models (`volcengine-plan`): - `volcengine-plan/ark-code-latest` - `volcengine-plan/doubao-seed-code` - `volcengine-plan/kimi-k2.5` - `volcengine-plan/kimi-k2-thinking` - `volcengine-plan/glm-4.7` ### BytePlus (International) BytePlus ARK provides access to the same models as Volcano Engine for international users. - Provider: `byteplus` (coding: `byteplus-plan`) - Auth: `BYTEPLUS_API_KEY` - Example model: `byteplus-plan/ark-code-latest` - CLI: `openclaw onboard --auth-choice byteplus-api-key` ```json5 { agents: { defaults: { model: { primary: "byteplus-plan/ark-code-latest" } }, }, } ``` Onboarding defaults to the coding surface, but the general `byteplus/*` catalog is registered at the same time. Available models: - `byteplus/seed-1-8-251228` (Seed 1.8) - `byteplus/kimi-k2-5-260127` (Kimi K2.5) - `byteplus/glm-4-7-251222` (GLM 4.7) Coding models (`byteplus-plan`): - `byteplus-plan/ark-code-latest` - `byteplus-plan/doubao-seed-code` - `byteplus-plan/kimi-k2.5` - `byteplus-plan/kimi-k2-thinking` - `byteplus-plan/glm-4.7` ### Synthetic Synthetic provides Anthropic-compatible models behind the `synthetic` provider: - Provider: `synthetic` - Auth: `SYNTHETIC_API_KEY` - Example model: `synthetic/hf:MiniMaxAI/MiniMax-M2.5` - CLI: `openclaw onboard --auth-choice synthetic-api-key` ```json5 { agents: { defaults: { model: { primary: "synthetic/hf:MiniMaxAI/MiniMax-M2.5" } }, }, models: { mode: "merge", providers: { synthetic: { baseUrl: "https://api.synthetic.new/anthropic", apiKey: "${SYNTHETIC_API_KEY}", api: "anthropic-messages", models: [{ id: "hf:MiniMaxAI/MiniMax-M2.5", name: "MiniMax M2.5" }], }, }, }, } ``` ### MiniMax MiniMax is configured via `models.providers` because it uses custom endpoints: - MiniMax OAuth (Global): `--auth-choice minimax-global-oauth` - MiniMax OAuth (CN): `--auth-choice minimax-cn-oauth` - MiniMax API key (Global): `--auth-choice minimax-global-api` - MiniMax API key (CN): `--auth-choice minimax-cn-api` - Auth: `MINIMAX_API_KEY` See [/providers/minimax](/providers/minimax) for setup details, model options, and config snippets. ### Ollama Ollama ships as a bundled provider plugin and uses Ollama's native API: - Provider: `ollama` - Auth: None required (local server) - Example model: `ollama/llama3.3` - Installation: [https://ollama.com/download](https://ollama.com/download) ```bash # Install Ollama, then pull a model: ollama pull llama3.3 ``` ```json5 { agents: { defaults: { model: { primary: "ollama/llama3.3" } }, }, } ``` Ollama is detected locally at `http://127.0.0.1:11434` when you opt in with `OLLAMA_API_KEY`, and the bundled provider plugin adds Ollama directly to `openclaw onboard` and the model picker. See [/providers/ollama](/providers/ollama) for onboarding, cloud/local mode, and custom configuration. ### vLLM vLLM ships as a bundled provider plugin for local/self-hosted OpenAI-compatible servers: - Provider: `vllm` - Auth: Optional (depends on your server) - Default base URL: `http://127.0.0.1:8000/v1` To opt in to auto-discovery locally (any value works if your server doesn’t enforce auth): ```bash export VLLM_API_KEY="vllm-local" ``` Then set a model (replace with one of the IDs returned by `/v1/models`): ```json5 { agents: { defaults: { model: { primary: "vllm/your-model-id" } }, }, } ``` See [/providers/vllm](/providers/vllm) for details. ### SGLang SGLang ships as a bundled provider plugin for fast self-hosted OpenAI-compatible servers: - Provider: `sglang` - Auth: Optional (depends on your server) - Default base URL: `http://127.0.0.1:30000/v1` To opt in to auto-discovery locally (any value works if your server does not enforce auth): ```bash export SGLANG_API_KEY="sglang-local" ``` Then set a model (replace with one of the IDs returned by `/v1/models`): ```json5 { agents: { defaults: { model: { primary: "sglang/your-model-id" } }, }, } ``` See [/providers/sglang](/providers/sglang) for details. ### Local proxies (LM Studio, vLLM, LiteLLM, etc.) Example (OpenAI‑compatible): ```json5 { agents: { defaults: { model: { primary: "lmstudio/my-local-model" }, models: { "lmstudio/my-local-model": { alias: "Local" } }, }, }, models: { providers: { lmstudio: { baseUrl: "http://localhost:1234/v1", apiKey: "LMSTUDIO_KEY", api: "openai-completions", models: [ { id: "my-local-model", name: "Local Model", reasoning: false, input: ["text"], cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 }, contextWindow: 200000, maxTokens: 8192, }, ], }, }, }, } ``` Notes: - For custom providers, `reasoning`, `input`, `cost`, `contextWindow`, and `maxTokens` are optional. When omitted, OpenClaw defaults to: - `reasoning: false` - `input: ["text"]` - `cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 }` - `contextWindow: 200000` - `maxTokens: 8192` - Recommended: set explicit values that match your proxy/model limits. - For `api: "openai-completions"` on non-native endpoints (any non-empty `baseUrl` whose host is not `api.openai.com`), OpenClaw forces `compat.supportsDeveloperRole: false` to avoid provider 400 errors for unsupported `developer` roles. - Proxy-style OpenAI-compatible routes also skip native OpenAI-only request shaping: no `service_tier`, no Responses `store`, no prompt-cache hints, no OpenAI reasoning-compat payload shaping, and no hidden OpenClaw attribution headers. - If `baseUrl` is empty/omitted, OpenClaw keeps the default OpenAI behavior (which resolves to `api.openai.com`). - For safety, an explicit `compat.supportsDeveloperRole: true` is still overridden on non-native `openai-completions` endpoints. ## CLI examples ```bash openclaw onboard --auth-choice opencode-zen openclaw models set opencode/claude-opus-4-6 openclaw models list ``` See also: [/gateway/configuration](/gateway/configuration) for full configuration examples. ## Related - [Models](/concepts/models) — model configuration and aliases - [Model Failover](/concepts/model-failover) — fallback chains and retry behavior - [Configuration Reference](/gateway/configuration-reference#agent-defaults) — model config keys - [Providers](/providers) — per-provider setup guides