diff --git a/docs/providers/comfy.md b/docs/providers/comfy.md index c4ffb55afd6..769ed1652a6 100644 --- a/docs/providers/comfy.md +++ b/docs/providers/comfy.md @@ -9,13 +9,15 @@ read_when: # ComfyUI -OpenClaw ships a bundled `comfy` plugin for workflow-driven ComfyUI runs. +OpenClaw ships a bundled `comfy` plugin for workflow-driven ComfyUI runs. The plugin is entirely workflow-driven, so OpenClaw does not try to map generic `size`, `aspectRatio`, `resolution`, `durationSeconds`, or TTS-style controls onto your graph. -- Provider: `comfy` -- Models: `comfy/workflow` -- Shared surfaces: `image_generate`, `video_generate`, `music_generate` -- Auth: none for local ComfyUI; `COMFY_API_KEY` or `COMFY_CLOUD_API_KEY` for Comfy Cloud -- API: ComfyUI `/prompt` / `/history` / `/view` and Comfy Cloud `/api/*` +| Property | Detail | +| --------------- | -------------------------------------------------------------------------------- | +| Provider | `comfy` | +| Models | `comfy/workflow` | +| Shared surfaces | `image_generate`, `video_generate`, `music_generate` | +| Auth | None for local ComfyUI; `COMFY_API_KEY` or `COMFY_CLOUD_API_KEY` for Comfy Cloud | +| API | ComfyUI `/prompt` / `/history` / `/view` and Comfy Cloud `/api/*` | ## What it supports @@ -26,14 +28,140 @@ OpenClaw ships a bundled `comfy` plugin for workflow-driven ComfyUI runs. - Music or audio generation through the shared `music_generate` tool - Output download from a configured node or all matching output nodes -The bundled plugin is workflow-driven, so OpenClaw does not try to map generic -`size`, `aspectRatio`, `resolution`, `durationSeconds`, or TTS-style controls -onto your graph. +## Getting started -## Config layout +Choose between running ComfyUI on your own machine or using Comfy Cloud. -Comfy supports shared top-level connection settings plus per-capability workflow -sections: + + + **Best for:** running your own ComfyUI instance on your machine or LAN. + + + + Make sure your local ComfyUI instance is running (defaults to `http://127.0.0.1:8188`). + + + Export or create a ComfyUI workflow JSON file. Note the node IDs for the prompt input node and the output node you want OpenClaw to read from. + + + Set `mode: "local"` and point at your workflow file. Here is a minimal image example: + + ```json5 + { + models: { + providers: { + comfy: { + mode: "local", + baseUrl: "http://127.0.0.1:8188", + image: { + workflowPath: "./workflows/flux-api.json", + promptNodeId: "6", + outputNodeId: "9", + }, + }, + }, + }, + } + ``` + + + Point OpenClaw at the `comfy/workflow` model for the capability you configured: + + ```json5 + { + agents: { + defaults: { + imageGenerationModel: { + primary: "comfy/workflow", + }, + }, + }, + } + ``` + + + ```bash + openclaw models list --provider comfy + ``` + + + + + + + **Best for:** running workflows on Comfy Cloud without managing local GPU resources. + + + + Sign up at [comfy.org](https://comfy.org) and generate an API key from your account dashboard. + + + Provide your key through one of these methods: + + ```bash + # Environment variable (preferred) + export COMFY_API_KEY="your-key" + + # Alternative environment variable + export COMFY_CLOUD_API_KEY="your-key" + + # Or inline in config + openclaw config set models.providers.comfy.apiKey "your-key" + ``` + + + Export or create a ComfyUI workflow JSON file. Note the node IDs for the prompt input node and the output node. 
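+
+    In an API-format export, each node is keyed by its id. A trimmed sketch (the
+    node ids, class types, and prompt text here are illustrative; yours will
+    differ):
+
+    ```json
+    {
+      "6": {
+        "class_type": "CLIPTextEncode",
+        "inputs": { "text": "a scenic mountain lake", "clip": ["4", 1] }
+      },
+      "9": {
+        "class_type": "SaveImage",
+        "inputs": { "filename_prefix": "ComfyUI", "images": ["8", 0] }
+      }
+    }
+    ```
+
+    Here `"6"` maps to `promptNodeId` and `"9"` to `outputNodeId` in the config
+    below.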
+ + + Set `mode: "cloud"` and point at your workflow file: + + ```json5 + { + models: { + providers: { + comfy: { + mode: "cloud", + image: { + workflowPath: "./workflows/flux-api.json", + promptNodeId: "6", + outputNodeId: "9", + }, + }, + }, + }, + } + ``` + + + Cloud mode defaults `baseUrl` to `https://cloud.comfy.org`. You only need to set `baseUrl` if you use a custom cloud endpoint. + + + + ```json5 + { + agents: { + defaults: { + imageGenerationModel: { + primary: "comfy/workflow", + }, + }, + }, + } + ``` + + + ```bash + openclaw models list --provider comfy + ``` + + + + + + +## Configuration + +Comfy supports shared top-level connection settings plus per-capability workflow sections (`image`, `video`, `music`): ```json5 { @@ -63,139 +191,164 @@ sections: } ``` -Shared keys: +### Shared keys -- `mode`: `local` or `cloud` -- `baseUrl`: defaults to `http://127.0.0.1:8188` for local or `https://cloud.comfy.org` for cloud -- `apiKey`: optional inline key alternative to env vars -- `allowPrivateNetwork`: allow a private/LAN `baseUrl` in cloud mode +| Key | Type | Description | +| --------------------- | ---------------------- | ------------------------------------------------------------------------------------- | +| `mode` | `"local"` or `"cloud"` | Connection mode. | +| `baseUrl` | string | Defaults to `http://127.0.0.1:8188` for local or `https://cloud.comfy.org` for cloud. | +| `apiKey` | string | Optional inline key, alternative to `COMFY_API_KEY` / `COMFY_CLOUD_API_KEY` env vars. | +| `allowPrivateNetwork` | boolean | Allow a private/LAN `baseUrl` in cloud mode. | -Per-capability keys under `image`, `video`, or `music`: +### Per-capability keys -- `workflow` or `workflowPath`: required -- `promptNodeId`: required -- `promptInputName`: defaults to `text` -- `outputNodeId`: optional -- `pollIntervalMs`: optional -- `timeoutMs`: optional +These keys apply inside the `image`, `video`, or `music` sections: -Image and video sections also support: +| Key | Required | Default | Description | +| ---------------------------- | -------- | -------- | ---------------------------------------------------------------------------- | +| `workflow` or `workflowPath` | Yes | -- | Path to the ComfyUI workflow JSON file. | +| `promptNodeId` | Yes | -- | Node ID that receives the text prompt. | +| `promptInputName` | No | `"text"` | Input name on the prompt node. | +| `outputNodeId` | No | -- | Node ID to read output from. If omitted, all matching output nodes are used. | +| `pollIntervalMs` | No | -- | Polling interval in milliseconds for job completion. | +| `timeoutMs` | No | -- | Timeout in milliseconds for the workflow run. | -- `inputImageNodeId`: required when you pass a reference image -- `inputImageInputName`: defaults to `image` +The `image` and `video` sections also support: -## Backward compatibility +| Key | Required | Default | Description | +| --------------------- | ------------------------------------ | --------- | --------------------------------------------------- | +| `inputImageNodeId` | Yes (when passing a reference image) | -- | Node ID that receives the uploaded reference image. | +| `inputImageInputName` | No | `"image"` | Input name on the image node. 
| -Existing top-level image config still works: +## Workflow details -```json5 -{ - models: { - providers: { - comfy: { - workflowPath: "./workflows/flux-api.json", - promptNodeId: "6", - outputNodeId: "9", - }, - }, - }, -} -``` + + + Set the default image model to `comfy/workflow`: -OpenClaw treats that legacy shape as the image workflow config. - -## Image workflows - -Set the default image model: - -```json5 -{ - agents: { - defaults: { - imageGenerationModel: { - primary: "comfy/workflow", - }, - }, - }, -} -``` - -Reference-image editing example: - -```json5 -{ - models: { - providers: { - comfy: { - image: { - workflowPath: "./workflows/edit-api.json", - promptNodeId: "6", - inputImageNodeId: "7", - inputImageInputName: "image", - outputNodeId: "9", + ```json5 + { + agents: { + defaults: { + imageGenerationModel: { + primary: "comfy/workflow", + }, }, }, - }, - }, -} -``` + } + ``` -## Video workflows + **Reference-image editing example:** -Set the default video model: + To enable image editing with an uploaded reference image, add `inputImageNodeId` to your image config: -```json5 -{ - agents: { - defaults: { - videoGenerationModel: { - primary: "comfy/workflow", + ```json5 + { + models: { + providers: { + comfy: { + image: { + workflowPath: "./workflows/edit-api.json", + promptNodeId: "6", + inputImageNodeId: "7", + inputImageInputName: "image", + outputNodeId: "9", + }, + }, + }, }, - }, - }, -} -``` + } + ``` -Comfy video workflows currently support text-to-video and image-to-video through -the configured graph. OpenClaw does not pass input videos into Comfy workflows. + -## Music workflows + + Set the default video model to `comfy/workflow`: -The bundled plugin registers a music-generation provider for workflow-defined -audio or music outputs, surfaced through the shared `music_generate` tool: + ```json5 + { + agents: { + defaults: { + videoGenerationModel: { + primary: "comfy/workflow", + }, + }, + }, + } + ``` -```text -/tool music_generate prompt="Warm ambient synth loop with soft tape texture" -``` + Comfy video workflows support text-to-video and image-to-video through the configured graph. -Use the `music` config section to point at your audio workflow JSON and output -node. + + OpenClaw does not pass input videos into Comfy workflows. Only text prompts and single reference images are supported as inputs. + -## Comfy Cloud + -Use `mode: "cloud"` plus one of: + + The bundled plugin registers a music-generation provider for workflow-defined audio or music outputs, surfaced through the shared `music_generate` tool: -- `COMFY_API_KEY` -- `COMFY_CLOUD_API_KEY` -- `models.providers.comfy.apiKey` + ```text + /tool music_generate prompt="Warm ambient synth loop with soft tape texture" + ``` -Cloud mode still uses the same `image`, `video`, and `music` workflow sections. + Use the `music` config section to point at your audio workflow JSON and output node. -## Live tests + -Opt-in live coverage exists for the bundled plugin: + + Existing top-level image config (without the nested `image` section) still works: -```bash -OPENCLAW_LIVE_TEST=1 COMFY_LIVE_TEST=1 pnpm test:live -- extensions/comfy/comfy.live.test.ts -``` + ```json5 + { + models: { + providers: { + comfy: { + workflowPath: "./workflows/flux-api.json", + promptNodeId: "6", + outputNodeId: "9", + }, + }, + }, + } + ``` -The live test skips individual image, video, or music cases unless the matching -Comfy workflow section is configured. + OpenClaw treats that legacy shape as the image workflow config. 
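+
+    For comparison, the same settings in the nested form:
+
+    ```json5
+    {
+      models: {
+        providers: {
+          comfy: {
+            image: {
+              workflowPath: "./workflows/flux-api.json",
+              promptNodeId: "6",
+              outputNodeId: "9",
+            },
+          },
+        },
+      },
+    }
+    ```
+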
You do not need to migrate immediately, but the nested `image` / `video` / `music` sections are recommended for new setups. + + + If you only use image generation, the legacy flat config and the new nested `image` section are functionally equivalent. + + + + + + Opt-in live coverage exists for the bundled plugin: + + ```bash + OPENCLAW_LIVE_TEST=1 COMFY_LIVE_TEST=1 pnpm test:live -- extensions/comfy/comfy.live.test.ts + ``` + + The live test skips individual image, video, or music cases unless the matching Comfy workflow section is configured. + + + ## Related -- [Image Generation](/tools/image-generation) -- [Video Generation](/tools/video-generation) -- [Music Generation](/tools/music-generation) -- [Provider Directory](/providers/index) -- [Configuration Reference](/gateway/configuration-reference#agent-defaults) + + + Image generation tool configuration and usage. + + + Video generation tool configuration and usage. + + + Music and audio generation tool setup. + + + Overview of all providers and model refs. + + + Full config reference including agent defaults. + + diff --git a/docs/providers/huggingface.md b/docs/providers/huggingface.md index 13a6c883bf6..4fa721f3825 100644 --- a/docs/providers/huggingface.md +++ b/docs/providers/huggingface.md @@ -15,29 +15,49 @@ title: "Hugging Face (Inference)" - API: OpenAI-compatible (`https://router.huggingface.co/v1`) - Billing: Single HF token; [pricing](https://huggingface.co/docs/inference-providers/pricing) follows provider rates with a free tier. -## Quick start +## Getting started -1. Create a fine-grained token at [Hugging Face → Settings → Tokens](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained) with the **Make calls to Inference Providers** permission. -2. Run onboarding and choose **Hugging Face** in the provider dropdown, then enter your API key when prompted: + + + Go to [Hugging Face Settings Tokens](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained) and create a new fine-grained token. -```bash -openclaw onboard --auth-choice huggingface-api-key -``` + + The token must have the **Make calls to Inference Providers** permission enabled or API requests will be rejected. + -3. In the **Default Hugging Face model** dropdown, pick the model you want (the list is loaded from the Inference API when you have a valid token; otherwise a built-in list is shown). Your choice is saved as the default model. -4. You can also set or change the default model later in config: + + + Choose **Hugging Face** in the provider dropdown, then enter your API key when prompted: -```json5 -{ - agents: { - defaults: { - model: { primary: "huggingface/deepseek-ai/DeepSeek-R1" }, - }, - }, -} -``` + ```bash + openclaw onboard --auth-choice huggingface-api-key + ``` -## Non-interactive example + + + In the **Default Hugging Face model** dropdown, pick the model you want. The list is loaded from the Inference API when you have a valid token; otherwise a built-in list is shown. Your choice is saved as the default model. 
+ + You can also set or change the default model later in config: + + ```json5 + { + agents: { + defaults: { + model: { primary: "huggingface/deepseek-ai/DeepSeek-R1" }, + }, + }, + } + ``` + + + + ```bash + openclaw models list --provider huggingface + ``` + + + +### Non-interactive setup ```bash openclaw onboard --non-interactive \ @@ -48,56 +68,10 @@ openclaw onboard --non-interactive \ This will set `huggingface/deepseek-ai/DeepSeek-R1` as the default model. -## Environment note - -If the Gateway runs as a daemon (launchd/systemd), make sure `HUGGINGFACE_HUB_TOKEN` or `HF_TOKEN` -is available to that process (for example, in `~/.openclaw/.env` or via -`env.shellEnv`). - -## Model discovery and onboarding dropdown - -OpenClaw discovers models by calling the **Inference endpoint directly**: - -```bash -GET https://router.huggingface.co/v1/models -``` - -(Optional: send `Authorization: Bearer $HUGGINGFACE_HUB_TOKEN` or `$HF_TOKEN` for the full list; some endpoints return a subset without auth.) The response is OpenAI-style `{ "object": "list", "data": [ { "id": "Qwen/Qwen3-8B", "owned_by": "Qwen", ... }, ... ] }`. - -When you configure a Hugging Face API key (via onboarding, `HUGGINGFACE_HUB_TOKEN`, or `HF_TOKEN`), OpenClaw uses this GET to discover available chat-completion models. During **interactive setup**, after you enter your token you see a **Default Hugging Face model** dropdown populated from that list (or the built-in catalog if the request fails). At runtime (e.g. Gateway startup), when a key is present, OpenClaw again calls **GET** `https://router.huggingface.co/v1/models` to refresh the catalog. The list is merged with a built-in catalog (for metadata like context window and cost). If the request fails or no key is set, only the built-in catalog is used. - -## Model names and editable options - -- **Name from API:** The model display name is **hydrated from GET /v1/models** when the API returns `name`, `title`, or `display_name`; otherwise it is derived from the model id (e.g. `deepseek-ai/DeepSeek-R1` → “DeepSeek R1”). -- **Override display name:** You can set a custom label per model in config so it appears the way you want in the CLI and UI: - -```json5 -{ - agents: { - defaults: { - models: { - "huggingface/deepseek-ai/DeepSeek-R1": { alias: "DeepSeek R1 (fast)" }, - "huggingface/deepseek-ai/DeepSeek-R1:cheapest": { alias: "DeepSeek R1 (cheap)" }, - }, - }, - }, -} -``` - -- **Policy suffixes:** OpenClaw's bundled Hugging Face docs and helpers currently treat these two suffixes as the built-in policy variants: - - **`:fastest`** — highest throughput. - - **`:cheapest`** — lowest cost per output token. - - You can add these as separate entries in `models.providers.huggingface.models` or set `model.primary` with the suffix. You can also set your default provider order in [Inference Provider settings](https://hf.co/settings/inference-providers) (no suffix = use that order). - -- **Config merge:** Existing entries in `models.providers.huggingface.models` (e.g. in `models.json`) are kept when config is merged. So any custom `name`, `alias`, or model options you set there are preserved. - -## Model IDs and configuration examples +## Model IDs Model refs use the form `huggingface//` (Hub-style IDs). The list below is from **GET** `https://router.huggingface.co/v1/models`; your catalog may include more. 
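+To check which model ids your token can reach, query the endpoint directly (the
+`jq` filter is optional and assumes the documented OpenAI-style response shape):
+
+```bash
+# The Authorization header is optional; some endpoints return a subset without it
+curl -s https://router.huggingface.co/v1/models \
+  -H "Authorization: Bearer $HF_TOKEN" | jq -r '.data[].id'
+```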
-**Example IDs (from the inference endpoint):** - | Model | Ref (prefix with `huggingface/`) | | ---------------------- | ----------------------------------- | | DeepSeek R1 | `deepseek-ai/DeepSeek-R1` | @@ -111,83 +85,153 @@ Model refs use the form `huggingface//` (Hub-style IDs). The list be | GLM 4.7 | `zai-org/GLM-4.7` | | Kimi K2.5 | `moonshotai/Kimi-K2.5` | -You can append `:fastest` or `:cheapest` to the model id. Set your default order in [Inference Provider settings](https://hf.co/settings/inference-providers); see [Inference Providers](https://huggingface.co/docs/inference-providers) and **GET** `https://router.huggingface.co/v1/models` for the full list. + +You can append `:fastest` or `:cheapest` to any model id. Set your default order in [Inference Provider settings](https://hf.co/settings/inference-providers); see [Inference Providers](https://huggingface.co/docs/inference-providers) and **GET** `https://router.huggingface.co/v1/models` for the full list. + -### Complete configuration examples +## Advanced details -**Primary DeepSeek R1 with Qwen fallback:** + + + OpenClaw discovers models by calling the **Inference endpoint directly**: -```json5 -{ - agents: { - defaults: { - model: { - primary: "huggingface/deepseek-ai/DeepSeek-R1", - fallbacks: ["huggingface/Qwen/Qwen3-8B"], + ```bash + GET https://router.huggingface.co/v1/models + ``` + + (Optional: send `Authorization: Bearer $HUGGINGFACE_HUB_TOKEN` or `$HF_TOKEN` for the full list; some endpoints return a subset without auth.) The response is OpenAI-style `{ "object": "list", "data": [ { "id": "Qwen/Qwen3-8B", "owned_by": "Qwen", ... }, ... ] }`. + + When you configure a Hugging Face API key (via onboarding, `HUGGINGFACE_HUB_TOKEN`, or `HF_TOKEN`), OpenClaw uses this GET to discover available chat-completion models. During **interactive setup**, after you enter your token you see a **Default Hugging Face model** dropdown populated from that list (or the built-in catalog if the request fails). At runtime (e.g. Gateway startup), when a key is present, OpenClaw again calls **GET** `https://router.huggingface.co/v1/models` to refresh the catalog. The list is merged with a built-in catalog (for metadata like context window and cost). If the request fails or no key is set, only the built-in catalog is used. + + + + + - **Name from API:** The model display name is **hydrated from GET /v1/models** when the API returns `name`, `title`, or `display_name`; otherwise it is derived from the model id (e.g. `deepseek-ai/DeepSeek-R1` becomes "DeepSeek R1"). + - **Override display name:** You can set a custom label per model in config so it appears the way you want in the CLI and UI: + + ```json5 + { + agents: { + defaults: { + models: { + "huggingface/deepseek-ai/DeepSeek-R1": { alias: "DeepSeek R1 (fast)" }, + "huggingface/deepseek-ai/DeepSeek-R1:cheapest": { alias: "DeepSeek R1 (cheap)" }, + }, + }, }, - models: { - "huggingface/deepseek-ai/DeepSeek-R1": { alias: "DeepSeek R1" }, - "huggingface/Qwen/Qwen3-8B": { alias: "Qwen3 8B" }, + } + ``` + + - **Policy suffixes:** OpenClaw's bundled Hugging Face docs and helpers currently treat these two suffixes as the built-in policy variants: + - **`:fastest`** — highest throughput. + - **`:cheapest`** — lowest cost per output token. + + You can add these as separate entries in `models.providers.huggingface.models` or set `model.primary` with the suffix. 
You can also set your default provider order in [Inference Provider settings](https://hf.co/settings/inference-providers) (no suffix = use that order). + + - **Config merge:** Existing entries in `models.providers.huggingface.models` (e.g. in `models.json`) are kept when config is merged. So any custom `name`, `alias`, or model options you set there are preserved. + + + + + If the Gateway runs as a daemon (launchd/systemd), make sure `HUGGINGFACE_HUB_TOKEN` or `HF_TOKEN` is available to that process (for example, in `~/.openclaw/.env` or via `env.shellEnv`). + + + OpenClaw accepts both `HUGGINGFACE_HUB_TOKEN` and `HF_TOKEN` as env var aliases. Either one works; if both are set, `HUGGINGFACE_HUB_TOKEN` takes precedence. + + + + + + ```json5 + { + agents: { + defaults: { + model: { + primary: "huggingface/deepseek-ai/DeepSeek-R1", + fallbacks: ["huggingface/Qwen/Qwen3-8B"], + }, + models: { + "huggingface/deepseek-ai/DeepSeek-R1": { alias: "DeepSeek R1" }, + "huggingface/Qwen/Qwen3-8B": { alias: "Qwen3 8B" }, + }, + }, }, - }, - }, -} -``` + } + ``` + -**Qwen as default, with :cheapest and :fastest variants:** - -```json5 -{ - agents: { - defaults: { - model: { primary: "huggingface/Qwen/Qwen3-8B" }, - models: { - "huggingface/Qwen/Qwen3-8B": { alias: "Qwen3 8B" }, - "huggingface/Qwen/Qwen3-8B:cheapest": { alias: "Qwen3 8B (cheapest)" }, - "huggingface/Qwen/Qwen3-8B:fastest": { alias: "Qwen3 8B (fastest)" }, + + ```json5 + { + agents: { + defaults: { + model: { primary: "huggingface/Qwen/Qwen3-8B" }, + models: { + "huggingface/Qwen/Qwen3-8B": { alias: "Qwen3 8B" }, + "huggingface/Qwen/Qwen3-8B:cheapest": { alias: "Qwen3 8B (cheapest)" }, + "huggingface/Qwen/Qwen3-8B:fastest": { alias: "Qwen3 8B (fastest)" }, + }, + }, }, - }, - }, -} -``` + } + ``` + -**DeepSeek + Llama + GPT-OSS with aliases:** - -```json5 -{ - agents: { - defaults: { - model: { - primary: "huggingface/deepseek-ai/DeepSeek-V3.2", - fallbacks: [ - "huggingface/meta-llama/Llama-3.3-70B-Instruct", - "huggingface/openai/gpt-oss-120b", - ], + + ```json5 + { + agents: { + defaults: { + model: { + primary: "huggingface/deepseek-ai/DeepSeek-V3.2", + fallbacks: [ + "huggingface/meta-llama/Llama-3.3-70B-Instruct", + "huggingface/openai/gpt-oss-120b", + ], + }, + models: { + "huggingface/deepseek-ai/DeepSeek-V3.2": { alias: "DeepSeek V3.2" }, + "huggingface/meta-llama/Llama-3.3-70B-Instruct": { alias: "Llama 3.3 70B" }, + "huggingface/openai/gpt-oss-120b": { alias: "GPT-OSS 120B" }, + }, + }, }, - models: { - "huggingface/deepseek-ai/DeepSeek-V3.2": { alias: "DeepSeek V3.2" }, - "huggingface/meta-llama/Llama-3.3-70B-Instruct": { alias: "Llama 3.3 70B" }, - "huggingface/openai/gpt-oss-120b": { alias: "GPT-OSS 120B" }, - }, - }, - }, -} -``` + } + ``` + -**Multiple Qwen and DeepSeek models with policy suffixes:** - -```json5 -{ - agents: { - defaults: { - model: { primary: "huggingface/Qwen/Qwen2.5-7B-Instruct:cheapest" }, - models: { - "huggingface/Qwen/Qwen2.5-7B-Instruct": { alias: "Qwen2.5 7B" }, - "huggingface/Qwen/Qwen2.5-7B-Instruct:cheapest": { alias: "Qwen2.5 7B (cheap)" }, - "huggingface/deepseek-ai/DeepSeek-R1:fastest": { alias: "DeepSeek R1 (fast)" }, - "huggingface/meta-llama/Llama-3.1-8B-Instruct": { alias: "Llama 3.1 8B" }, + + ```json5 + { + agents: { + defaults: { + model: { primary: "huggingface/Qwen/Qwen2.5-7B-Instruct:cheapest" }, + models: { + "huggingface/Qwen/Qwen2.5-7B-Instruct": { alias: "Qwen2.5 7B" }, + "huggingface/Qwen/Qwen2.5-7B-Instruct:cheapest": { alias: "Qwen2.5 7B (cheap)" }, + 
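+            // Policy-suffix entries: :fastest = highest throughput, :cheapest = lowest cost per output token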
"huggingface/deepseek-ai/DeepSeek-R1:fastest": { alias: "DeepSeek R1 (fast)" }, + "huggingface/meta-llama/Llama-3.1-8B-Instruct": { alias: "Llama 3.1 8B" }, + }, + }, }, - }, - }, -} -``` + } + ``` + + + +## Related + + + + Overview of all providers, model refs, and failover behavior. + + + How to choose and configure models. + + + Official Hugging Face Inference Providers documentation. + + + Full config reference. + + diff --git a/docs/providers/inferrs.md b/docs/providers/inferrs.md index 21fcac16f52..8b843e78547 100644 --- a/docs/providers/inferrs.md +++ b/docs/providers/inferrs.md @@ -16,27 +16,27 @@ OpenAI-compatible `/v1` API. OpenClaw works with `inferrs` through the generic `inferrs` is currently best treated as a custom self-hosted OpenAI-compatible backend, not a dedicated OpenClaw provider plugin. -## Quick start +## Getting started -1. Start `inferrs` with a model. - -Example: - -```bash -inferrs serve google/gemma-4-E2B-it \ - --host 127.0.0.1 \ - --port 8080 \ - --device metal -``` - -2. Verify the server is reachable. - -```bash -curl http://127.0.0.1:8080/health -curl http://127.0.0.1:8080/v1/models -``` - -3. Add an explicit OpenClaw provider entry and point your default model at it. + + + ```bash + inferrs serve google/gemma-4-E2B-it \ + --host 127.0.0.1 \ + --port 8080 \ + --device metal + ``` + + + ```bash + curl http://127.0.0.1:8080/health + curl http://127.0.0.1:8080/v1/models + ``` + + + Add an explicit provider entry and point your default model at it. See the full config example below. + + ## Full config example @@ -81,93 +81,130 @@ This example uses Gemma 4 on a local `inferrs` server. } ``` -## Why `requiresStringContent` matters +## Advanced -Some `inferrs` Chat Completions routes accept only string -`messages[].content`, not structured content-part arrays. + + + Some `inferrs` Chat Completions routes accept only string + `messages[].content`, not structured content-part arrays. -If OpenClaw runs fail with an error like: + + If OpenClaw runs fail with an error like: -```text -messages[1].content: invalid type: sequence, expected a string -``` + ```text + messages[1].content: invalid type: sequence, expected a string + ``` -set: + set `compat.requiresStringContent: true` in your model entry. + -```json5 -compat: { - requiresStringContent: true -} -``` + ```json5 + compat: { + requiresStringContent: true + } + ``` -OpenClaw will flatten pure text content parts into plain strings before sending -the request. + OpenClaw will flatten pure text content parts into plain strings before sending + the request. -## Gemma and tool-schema caveat + -Some current `inferrs` + Gemma combinations accept small direct -`/v1/chat/completions` requests but still fail on full OpenClaw agent-runtime -turns. + + Some current `inferrs` + Gemma combinations accept small direct + `/v1/chat/completions` requests but still fail on full OpenClaw agent-runtime + turns. -If that happens, try this first: + If that happens, try this first: -```json5 -compat: { - requiresStringContent: true, - supportsTools: false -} -``` + ```json5 + compat: { + requiresStringContent: true, + supportsTools: false + } + ``` -That disables OpenClaw's tool schema surface for the model and can reduce prompt -pressure on stricter local backends. + That disables OpenClaw's tool schema surface for the model and can reduce prompt + pressure on stricter local backends. 
-If tiny direct requests still work but normal OpenClaw agent turns continue to -crash inside `inferrs`, the remaining issue is usually upstream model/server -behavior rather than OpenClaw's transport layer. + If tiny direct requests still work but normal OpenClaw agent turns continue to + crash inside `inferrs`, the remaining issue is usually upstream model/server + behavior rather than OpenClaw's transport layer. -## Manual smoke test + -Once configured, test both layers: + + Once configured, test both layers: -```bash -curl http://127.0.0.1:8080/v1/chat/completions \ - -H 'content-type: application/json' \ - -d '{"model":"google/gemma-4-E2B-it","messages":[{"role":"user","content":"What is 2 + 2?"}],"stream":false}' + ```bash + curl http://127.0.0.1:8080/v1/chat/completions \ + -H 'content-type: application/json' \ + -d '{"model":"google/gemma-4-E2B-it","messages":[{"role":"user","content":"What is 2 + 2?"}],"stream":false}' + ``` -openclaw infer model run \ - --model inferrs/google/gemma-4-E2B-it \ - --prompt "What is 2 + 2? Reply with one short sentence." \ - --json -``` + ```bash + openclaw infer model run \ + --model inferrs/google/gemma-4-E2B-it \ + --prompt "What is 2 + 2? Reply with one short sentence." \ + --json + ``` -If the first command works but the second fails, use the troubleshooting notes -below. + If the first command works but the second fails, check the troubleshooting section below. + + + + + `inferrs` is treated as a proxy-style OpenAI-compatible `/v1` backend, not a + native OpenAI endpoint. + + - Native OpenAI-only request shaping does not apply here + - No `service_tier`, no Responses `store`, no prompt-cache hints, and no + OpenAI reasoning-compat payload shaping + - Hidden OpenClaw attribution headers (`originator`, `version`, `User-Agent`) + are not injected on custom `inferrs` base URLs + + + ## Troubleshooting -- `curl /v1/models` fails: `inferrs` is not running, not reachable, or not - bound to the expected host/port. -- `messages[].content ... expected a string`: set - `compat.requiresStringContent: true`. -- Direct tiny `/v1/chat/completions` calls pass, but `openclaw infer model run` - fails: try `compat.supportsTools: false`. -- OpenClaw no longer gets schema errors, but `inferrs` still crashes on larger - agent turns: treat it as an upstream `inferrs` or model limitation and reduce - prompt pressure or switch local backend/model. + + + `inferrs` is not running, not reachable, or not bound to the expected + host/port. Make sure the server is started and listening on the address you + configured. + -## Proxy-style behavior + + Set `compat.requiresStringContent: true` in the model entry. See the + `requiresStringContent` section above for details. + -`inferrs` is treated as a proxy-style OpenAI-compatible `/v1` backend, not a -native OpenAI endpoint. + + Try setting `compat.supportsTools: false` to disable the tool schema surface. + See the Gemma tool-schema caveat above. + -- native OpenAI-only request shaping does not apply here -- no `service_tier`, no Responses `store`, no prompt-cache hints, and no - OpenAI reasoning-compat payload shaping -- hidden OpenClaw attribution headers (`originator`, `version`, `User-Agent`) - are not injected on custom `inferrs` base URLs + + If OpenClaw no longer gets schema errors but `inferrs` still crashes on larger + agent turns, treat it as an upstream `inferrs` or model limitation. Reduce + prompt pressure or switch to a different local backend or model. 
+ + + + +For general help, see [Troubleshooting](/help/troubleshooting) and [FAQ](/help/faq). + ## See also -- [Local models](/gateway/local-models) -- [Gateway troubleshooting](/gateway/troubleshooting#local-openai-compatible-backend-passes-direct-probes-but-agent-runs-fail) -- [Model providers](/concepts/model-providers) + + + Running OpenClaw against local model servers. + + + Debugging local OpenAI-compatible backends that pass probes but fail agent runs. + + + Overview of all providers, model refs, and failover behavior. + + diff --git a/docs/providers/moonshot.md b/docs/providers/moonshot.md index 4ae637f835c..b144e674d41 100644 --- a/docs/providers/moonshot.md +++ b/docs/providers/moonshot.md @@ -13,138 +13,215 @@ Moonshot provides the Kimi API with OpenAI-compatible endpoints. Configure the provider and set the default model to `moonshot/kimi-k2.5`, or use Kimi Coding with `kimi/kimi-code`. -Current Kimi K2 model IDs: + +Moonshot and Kimi Coding are **separate providers**. Keys are not interchangeable, endpoints differ, and model refs differ (`moonshot/...` vs `kimi/...`). + + +## Built-in model catalog [//]: # "moonshot-kimi-k2-ids:start" -- `kimi-k2.5` -- `kimi-k2-thinking` -- `kimi-k2-thinking-turbo` -- `kimi-k2-turbo` +| Model ref | Name | Reasoning | Input | Context | Max output | +| --------------------------------- | ---------------------- | --------- | ----------- | ------- | ---------- | +| `moonshot/kimi-k2.5` | Kimi K2.5 | No | text, image | 262,144 | 262,144 | +| `moonshot/kimi-k2-thinking` | Kimi K2 Thinking | Yes | text | 262,144 | 262,144 | +| `moonshot/kimi-k2-thinking-turbo` | Kimi K2 Thinking Turbo | Yes | text | 262,144 | 262,144 | +| `moonshot/kimi-k2-turbo` | Kimi K2 Turbo | No | text | 256,000 | 16,384 | [//]: # "moonshot-kimi-k2-ids:end" -```bash -openclaw onboard --auth-choice moonshot-api-key -# or -openclaw onboard --auth-choice moonshot-api-key-cn -``` +## Getting started -Kimi Coding: +Choose your provider and follow the setup steps. -```bash -openclaw onboard --auth-choice kimi-code-api-key -``` + + + **Best for:** Kimi K2 models via the Moonshot Open Platform. -Note: Moonshot and Kimi Coding are separate providers. Keys are not interchangeable, endpoints differ, and model refs differ (Moonshot uses `moonshot/...`, Kimi Coding uses `kimi/...`). + + + | Auth choice | Endpoint | Region | + | ---------------------- | ------------------------------ | ------------- | + | `moonshot-api-key` | `https://api.moonshot.ai/v1` | International | + | `moonshot-api-key-cn` | `https://api.moonshot.cn/v1` | China | + + + ```bash + openclaw onboard --auth-choice moonshot-api-key + ``` -Kimi web search uses the Moonshot plugin too: + Or for the China endpoint: -```bash -openclaw configure --section web -``` + ```bash + openclaw onboard --auth-choice moonshot-api-key-cn + ``` + + + ```json5 + { + agents: { + defaults: { + model: { primary: "moonshot/kimi-k2.5" }, + }, + }, + } + ``` + + + ```bash + openclaw models list --provider moonshot + ``` + + -Choose **Kimi** in the web-search section to store -`plugins.entries.moonshot.config.webSearch.*`. + ### Config example -## Config snippet (Moonshot API) - -```json5 -{ - env: { MOONSHOT_API_KEY: "sk-..." }, - agents: { - defaults: { - model: { primary: "moonshot/kimi-k2.5" }, + ```json5 + { + env: { MOONSHOT_API_KEY: "sk-..." 
}, + agents: { + defaults: { + model: { primary: "moonshot/kimi-k2.5" }, + models: { + // moonshot-kimi-k2-aliases:start + "moonshot/kimi-k2.5": { alias: "Kimi K2.5" }, + "moonshot/kimi-k2-thinking": { alias: "Kimi K2 Thinking" }, + "moonshot/kimi-k2-thinking-turbo": { alias: "Kimi K2 Thinking Turbo" }, + "moonshot/kimi-k2-turbo": { alias: "Kimi K2 Turbo" }, + // moonshot-kimi-k2-aliases:end + }, + }, + }, models: { - // moonshot-kimi-k2-aliases:start - "moonshot/kimi-k2.5": { alias: "Kimi K2.5" }, - "moonshot/kimi-k2-thinking": { alias: "Kimi K2 Thinking" }, - "moonshot/kimi-k2-thinking-turbo": { alias: "Kimi K2 Thinking Turbo" }, - "moonshot/kimi-k2-turbo": { alias: "Kimi K2 Turbo" }, - // moonshot-kimi-k2-aliases:end + mode: "merge", + providers: { + moonshot: { + baseUrl: "https://api.moonshot.ai/v1", + apiKey: "${MOONSHOT_API_KEY}", + api: "openai-completions", + models: [ + // moonshot-kimi-k2-models:start + { + id: "kimi-k2.5", + name: "Kimi K2.5", + reasoning: false, + input: ["text", "image"], + cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 }, + contextWindow: 262144, + maxTokens: 262144, + }, + { + id: "kimi-k2-thinking", + name: "Kimi K2 Thinking", + reasoning: true, + input: ["text"], + cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 }, + contextWindow: 262144, + maxTokens: 262144, + }, + { + id: "kimi-k2-thinking-turbo", + name: "Kimi K2 Thinking Turbo", + reasoning: true, + input: ["text"], + cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 }, + contextWindow: 262144, + maxTokens: 262144, + }, + { + id: "kimi-k2-turbo", + name: "Kimi K2 Turbo", + reasoning: false, + input: ["text"], + cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 }, + contextWindow: 256000, + maxTokens: 16384, + }, + // moonshot-kimi-k2-models:end + ], + }, + }, }, - }, - }, - models: { - mode: "merge", - providers: { - moonshot: { - baseUrl: "https://api.moonshot.ai/v1", - apiKey: "${MOONSHOT_API_KEY}", - api: "openai-completions", - models: [ - // moonshot-kimi-k2-models:start - { - id: "kimi-k2.5", - name: "Kimi K2.5", - reasoning: false, - input: ["text", "image"], - cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 }, - contextWindow: 262144, - maxTokens: 262144, - }, - { - id: "kimi-k2-thinking", - name: "Kimi K2 Thinking", - reasoning: true, - input: ["text"], - cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 }, - contextWindow: 262144, - maxTokens: 262144, - }, - { - id: "kimi-k2-thinking-turbo", - name: "Kimi K2 Thinking Turbo", - reasoning: true, - input: ["text"], - cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 }, - contextWindow: 262144, - maxTokens: 262144, - }, - { - id: "kimi-k2-turbo", - name: "Kimi K2 Turbo", - reasoning: false, - input: ["text"], - cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 }, - contextWindow: 256000, - maxTokens: 16384, - }, - // moonshot-kimi-k2-models:end - ], - }, - }, - }, -} -``` + } + ``` -## Kimi Coding + -```json5 -{ - env: { KIMI_API_KEY: "sk-..." }, - agents: { - defaults: { - model: { primary: "kimi/kimi-code" }, - models: { - "kimi/kimi-code": { alias: "Kimi" }, + + **Best for:** code-focused tasks via the Kimi Coding endpoint. + + + Kimi Coding uses a different API key and provider prefix (`kimi/...`) than Moonshot (`moonshot/...`). Legacy model ref `kimi/k2p5` remains accepted as a compatibility id. 
+ + + + + ```bash + openclaw onboard --auth-choice kimi-code-api-key + ``` + + + ```json5 + { + agents: { + defaults: { + model: { primary: "kimi/kimi-code" }, + }, + }, + } + ``` + + + ```bash + openclaw models list --provider kimi + ``` + + + + ### Config example + + ```json5 + { + env: { KIMI_API_KEY: "sk-..." }, + agents: { + defaults: { + model: { primary: "kimi/kimi-code" }, + models: { + "kimi/kimi-code": { alias: "Kimi" }, + }, + }, }, - }, - }, -} -``` + } + ``` + + + ## Kimi web search OpenClaw also ships **Kimi** as a `web_search` provider, backed by Moonshot web search. -Interactive setup can prompt for: + + + ```bash + openclaw configure --section web + ``` -- the Moonshot API region: - - `https://api.moonshot.ai/v1` - - `https://api.moonshot.cn/v1` -- the default Kimi web-search model (defaults to `kimi-k2.5`) + Choose **Kimi** in the web-search section to store + `plugins.entries.moonshot.config.webSearch.*`. + + + + Interactive setup prompts for: + + | Setting | Options | + | ------------------- | -------------------------------------------------------------------- | + | API region | `https://api.moonshot.ai/v1` (international) or `https://api.moonshot.cn/v1` (China) | + | Web search model | Defaults to `kimi-k2.5` | + + + Config lives under `plugins.entries.moonshot.config.webSearch`: @@ -173,52 +250,82 @@ Config lives under `plugins.entries.moonshot.config.webSearch`: } ``` -## Notes +## Advanced -- Moonshot model refs use `moonshot/`. Kimi Coding model refs use `kimi/`. -- Current Kimi Coding default model ref is `kimi/kimi-code`. Legacy `kimi/k2p5` remains accepted as a compatibility model id. -- Kimi web search uses `KIMI_API_KEY` or `MOONSHOT_API_KEY`, and defaults to `https://api.moonshot.ai/v1` with model `kimi-k2.5`. -- Native Moonshot endpoints (`https://api.moonshot.ai/v1` and - `https://api.moonshot.cn/v1`) advertise streaming usage compatibility on the - shared `openai-completions` transport. OpenClaw now keys that off endpoint - capabilities, so compatible custom provider ids targeting the same native - Moonshot hosts inherit the same streaming-usage behavior. -- Override pricing and context metadata in `models.providers` if needed. -- If Moonshot publishes different context limits for a model, adjust - `contextWindow` accordingly. -- Use `https://api.moonshot.ai/v1` for the international endpoint, and `https://api.moonshot.cn/v1` for the China endpoint. 
-- Onboarding choices: - - `moonshot-api-key` for `https://api.moonshot.ai/v1` - - `moonshot-api-key-cn` for `https://api.moonshot.cn/v1` + + + Moonshot Kimi supports binary native thinking: -## Native thinking mode (Moonshot) + - `thinking: { type: "enabled" }` + - `thinking: { type: "disabled" }` -Moonshot Kimi supports binary native thinking: + Configure it per model via `agents.defaults.models..params`: -- `thinking: { type: "enabled" }` -- `thinking: { type: "disabled" }` - -Configure it per model via `agents.defaults.models..params`: - -```json5 -{ - agents: { - defaults: { - models: { - "moonshot/kimi-k2.5": { - params: { - thinking: { type: "disabled" }, + ```json5 + { + agents: { + defaults: { + models: { + "moonshot/kimi-k2.5": { + params: { + thinking: { type: "disabled" }, + }, + }, }, }, }, - }, - }, -} -``` + } + ``` -OpenClaw also maps runtime `/think` levels for Moonshot: + OpenClaw also maps runtime `/think` levels for Moonshot: -- `/think off` -> `thinking.type=disabled` -- any non-off thinking level -> `thinking.type=enabled` + | `/think` level | Moonshot behavior | + | -------------------- | -------------------------- | + | `/think off` | `thinking.type=disabled` | + | Any non-off level | `thinking.type=enabled` | -When Moonshot thinking is enabled, `tool_choice` must be `auto` or `none`. OpenClaw normalizes incompatible `tool_choice` values to `auto` for compatibility. + + When Moonshot thinking is enabled, `tool_choice` must be `auto` or `none`. OpenClaw normalizes incompatible `tool_choice` values to `auto` for compatibility. + + + + + + Native Moonshot endpoints (`https://api.moonshot.ai/v1` and + `https://api.moonshot.cn/v1`) advertise streaming usage compatibility on the + shared `openai-completions` transport. OpenClaw keys that off endpoint + capabilities, so compatible custom provider ids targeting the same native + Moonshot hosts inherit the same streaming-usage behavior. + + + + | Provider | Model ref prefix | Endpoint | Auth env var | + | ---------- | ---------------- | ----------------------------- | ------------------- | + | Moonshot | `moonshot/` | `https://api.moonshot.ai/v1` | `MOONSHOT_API_KEY` | + | Moonshot CN| `moonshot/` | `https://api.moonshot.cn/v1` | `MOONSHOT_API_KEY` | + | Kimi Coding| `kimi/` | Kimi Coding endpoint | `KIMI_API_KEY` | + | Web search | N/A | Same as Moonshot API region | `KIMI_API_KEY` or `MOONSHOT_API_KEY` | + + - Kimi web search uses `KIMI_API_KEY` or `MOONSHOT_API_KEY`, and defaults to `https://api.moonshot.ai/v1` with model `kimi-k2.5`. + - Override pricing and context metadata in `models.providers` if needed. + - If Moonshot publishes different context limits for a model, adjust `contextWindow` accordingly. + + + + +## Related + + + + Choosing providers, model refs, and failover behavior. + + + Configuring web search providers including Kimi. + + + Full config schema for providers, models, and plugins. + + + Moonshot API key management and documentation. + + diff --git a/docs/providers/qwen.md b/docs/providers/qwen.md index d5d890449cd..669eef26624 100644 --- a/docs/providers/qwen.md +++ b/docs/providers/qwen.md @@ -17,8 +17,6 @@ background. -## Recommended: Qwen Cloud - OpenClaw now treats Qwen as a first-class bundled provider with canonical id `qwen`. The bundled provider targets the Qwen Cloud / Alibaba DashScope and Coding Plan endpoints and keeps legacy `modelstudio` ids working as a @@ -29,38 +27,108 @@ compatibility alias. 
- Also accepted for compatibility: `MODELSTUDIO_API_KEY`, `DASHSCOPE_API_KEY` - API style: OpenAI-compatible + If you want `qwen3.6-plus`, prefer the **Standard (pay-as-you-go)** endpoint. Coding Plan support can lag behind the public catalog. + -```bash -# Global Coding Plan endpoint -openclaw onboard --auth-choice qwen-api-key +## Getting started -# China Coding Plan endpoint -openclaw onboard --auth-choice qwen-api-key-cn +Choose your plan type and follow the setup steps. -# Global Standard (pay-as-you-go) endpoint -openclaw onboard --auth-choice qwen-standard-api-key + + + **Best for:** subscription-based access through the Qwen Coding Plan. -# China Standard (pay-as-you-go) endpoint -openclaw onboard --auth-choice qwen-standard-api-key-cn -``` + + + Create or copy an API key from [home.qwencloud.com/api-keys](https://home.qwencloud.com/api-keys). + + + For the **Global** endpoint: -Legacy `modelstudio-*` auth-choice ids and `modelstudio/...` model refs still -work as compatibility aliases, but new setup flows should prefer the canonical -`qwen-*` auth-choice ids and `qwen/...` model refs. + ```bash + openclaw onboard --auth-choice qwen-api-key + ``` -After onboarding, set a default model: + For the **China** endpoint: -```json5 -{ - agents: { - defaults: { - model: { primary: "qwen/qwen3.5-plus" }, - }, - }, -} -``` + ```bash + openclaw onboard --auth-choice qwen-api-key-cn + ``` + + + ```json5 + { + agents: { + defaults: { + model: { primary: "qwen/qwen3.5-plus" }, + }, + }, + } + ``` + + + ```bash + openclaw models list --provider qwen + ``` + + + + + Legacy `modelstudio-*` auth-choice ids and `modelstudio/...` model refs still + work as compatibility aliases, but new setup flows should prefer the canonical + `qwen-*` auth-choice ids and `qwen/...` model refs. + + + + + + **Best for:** pay-as-you-go access through the Standard Model Studio endpoint, including models like `qwen3.6-plus` that may not be available on the Coding Plan. + + + + Create or copy an API key from [home.qwencloud.com/api-keys](https://home.qwencloud.com/api-keys). + + + For the **Global** endpoint: + + ```bash + openclaw onboard --auth-choice qwen-standard-api-key + ``` + + For the **China** endpoint: + + ```bash + openclaw onboard --auth-choice qwen-standard-api-key-cn + ``` + + + ```json5 + { + agents: { + defaults: { + model: { primary: "qwen/qwen3.5-plus" }, + }, + }, + } + ``` + + + ```bash + openclaw models list --provider qwen + ``` + + + + + Legacy `modelstudio-*` auth-choice ids and `modelstudio/...` model refs still + work as compatibility aliases, but new setup flows should prefer the canonical + `qwen-*` auth-choice ids and `qwen/...` model refs. + + + + ## Plan types and endpoints @@ -75,16 +143,10 @@ The provider auto-selects the endpoint based on your auth choice. Canonical choices use the `qwen-*` family; `modelstudio-*` remains compatibility-only. You can override with a custom `baseUrl` in config. -Native Model Studio endpoints advertise streaming usage compatibility on the -shared `openai-completions` transport. OpenClaw keys that off endpoint -capabilities now, so DashScope-compatible custom provider ids targeting the -same native hosts inherit the same streaming-usage behavior instead of -requiring the built-in `qwen` provider id specifically. 
- -## Get your API key - -- **Manage keys**: [home.qwencloud.com/api-keys](https://home.qwencloud.com/api-keys) -- **Docs**: [docs.qwencloud.com](https://docs.qwencloud.com/developer-guides/getting-started/introduction) + +**Manage keys:** [home.qwencloud.com/api-keys](https://home.qwencloud.com/api-keys) | +**Docs:** [docs.qwencloud.com](https://docs.qwencloud.com/developer-guides/getting-started/introduction) + ## Built-in catalog @@ -104,71 +166,20 @@ the Standard endpoint. | `qwen/glm-4.7` | text | 202,752 | GLM | | `qwen/kimi-k2.5` | text, image | 262,144 | Moonshot AI via Alibaba | + Availability can still vary by endpoint and billing plan even when a model is present in the bundled catalog. - -Native-streaming usage compatibility applies to both the Coding Plan hosts and -the Standard DashScope-compatible hosts: - -- `https://coding.dashscope.aliyuncs.com/v1` -- `https://coding-intl.dashscope.aliyuncs.com/v1` -- `https://dashscope.aliyuncs.com/compatible-mode/v1` -- `https://dashscope-intl.aliyuncs.com/compatible-mode/v1` - -## Qwen 3.6 Plus availability - -`qwen3.6-plus` is available on the Standard (pay-as-you-go) Model Studio -endpoints: - -- China: `dashscope.aliyuncs.com/compatible-mode/v1` -- Global: `dashscope-intl.aliyuncs.com/compatible-mode/v1` - -If the Coding Plan endpoints return an "unsupported model" error for -`qwen3.6-plus`, switch to Standard (pay-as-you-go) instead of the Coding Plan -endpoint/key pair. - -## Capability plan - -The `qwen` extension is being positioned as the vendor home for the full Qwen -Cloud surface, not just coding/text models. - -- Text/chat models: bundled now -- Tool calling, structured output, thinking: inherited from the OpenAI-compatible transport -- Image generation: planned at the provider-plugin layer -- Image/video understanding: bundled now on the Standard endpoint -- Speech/audio: planned at the provider-plugin layer -- Memory embeddings/reranking: planned through the embedding adapter surface -- Video generation: bundled now through the shared video-generation capability + ## Multimodal add-ons -The `qwen` extension now also exposes: +The `qwen` extension also exposes multimodal capabilities on the **Standard** +DashScope endpoints (not the Coding Plan endpoints): -- Video understanding via `qwen-vl-max-latest` -- Wan video generation via: - - `wan2.6-t2v` (default) - - `wan2.6-i2v` - - `wan2.6-r2v` - - `wan2.6-r2v-flash` - - `wan2.7-r2v` +- **Video understanding** via `qwen-vl-max-latest` +- **Wan video generation** via `wan2.6-t2v` (default), `wan2.6-i2v`, `wan2.6-r2v`, `wan2.6-r2v-flash`, `wan2.7-r2v` -These multimodal surfaces use the **Standard** DashScope endpoints, not the -Coding Plan endpoints. - -- Global/Intl Standard base URL: `https://dashscope-intl.aliyuncs.com/compatible-mode/v1` -- China Standard base URL: `https://dashscope.aliyuncs.com/compatible-mode/v1` - -For video generation, OpenClaw maps the configured Qwen region to the matching -DashScope AIGC host before submitting the job: - -- Global/Intl: `https://dashscope-intl.aliyuncs.com` -- China: `https://dashscope.aliyuncs.com` - -That means a normal `models.providers.qwen.baseUrl` pointing at either the -Coding Plan or Standard Qwen hosts still keeps video generation on the correct -regional DashScope video endpoint. 
- -For video generation, set a default model explicitly: +To use Qwen as the default video provider: ```json5 { @@ -180,22 +191,110 @@ For video generation, set a default model explicitly: } ``` -Current bundled Qwen video-generation limits: + +See [Video Generation](/tools/video-generation) for shared tool parameters, provider selection, and failover behavior. + -- Up to **1** output video per request -- Up to **1** input image -- Up to **4** input videos -- Up to **10 seconds** duration -- Supports `size`, `aspectRatio`, `resolution`, `audio`, and `watermark` -- Reference image/video mode currently requires **remote http(s) URLs**. Local - file paths are rejected up front because the DashScope video endpoint does not - accept uploaded local buffers for those references. +## Advanced -See [Video Generation](/tools/video-generation) for the shared tool -parameters, provider selection, and failover behavior. + + + `qwen3.6-plus` is available on the Standard (pay-as-you-go) Model Studio + endpoints: -## Environment note + - China: `dashscope.aliyuncs.com/compatible-mode/v1` + - Global: `dashscope-intl.aliyuncs.com/compatible-mode/v1` -If the Gateway runs as a daemon (launchd/systemd), make sure `QWEN_API_KEY` is -available to that process (for example, in `~/.openclaw/.env` or via -`env.shellEnv`). + If the Coding Plan endpoints return an "unsupported model" error for + `qwen3.6-plus`, switch to Standard (pay-as-you-go) instead of the Coding Plan + endpoint/key pair. + + + + + The `qwen` extension is being positioned as the vendor home for the full Qwen + Cloud surface, not just coding/text models. + + - **Text/chat models:** bundled now + - **Tool calling, structured output, thinking:** inherited from the OpenAI-compatible transport + - **Image generation:** planned at the provider-plugin layer + - **Image/video understanding:** bundled now on the Standard endpoint + - **Speech/audio:** planned at the provider-plugin layer + - **Memory embeddings/reranking:** planned through the embedding adapter surface + - **Video generation:** bundled now through the shared video-generation capability + + + + + For video generation, OpenClaw maps the configured Qwen region to the matching + DashScope AIGC host before submitting the job: + + - Global/Intl: `https://dashscope-intl.aliyuncs.com` + - China: `https://dashscope.aliyuncs.com` + + That means a normal `models.providers.qwen.baseUrl` pointing at either the + Coding Plan or Standard Qwen hosts still keeps video generation on the correct + regional DashScope video endpoint. + + Current bundled Qwen video-generation limits: + + - Up to **1** output video per request + - Up to **1** input image + - Up to **4** input videos + - Up to **10 seconds** duration + - Supports `size`, `aspectRatio`, `resolution`, `audio`, and `watermark` + - Reference image/video mode currently requires **remote http(s) URLs**. Local + file paths are rejected up front because the DashScope video endpoint does not + accept uploaded local buffers for those references. + + + + + Native Model Studio endpoints advertise streaming usage compatibility on the + shared `openai-completions` transport. OpenClaw keys that off endpoint + capabilities now, so DashScope-compatible custom provider ids targeting the + same native hosts inherit the same streaming-usage behavior instead of + requiring the built-in `qwen` provider id specifically. 
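+
+    As a sketch, a custom provider entry pointing at one of the native hosts
+    listed below inherits that behavior (the `my-dashscope` id is illustrative):
+
+    ```json5
+    {
+      models: {
+        providers: {
+          "my-dashscope": {
+            baseUrl: "https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
+            apiKey: "${QWEN_API_KEY}",
+            api: "openai-completions",
+          },
+        },
+      },
+    }
+    ```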
+ + Native-streaming usage compatibility applies to both the Coding Plan hosts and + the Standard DashScope-compatible hosts: + + - `https://coding.dashscope.aliyuncs.com/v1` + - `https://coding-intl.dashscope.aliyuncs.com/v1` + - `https://dashscope.aliyuncs.com/compatible-mode/v1` + - `https://dashscope-intl.aliyuncs.com/compatible-mode/v1` + + + + + Multimodal surfaces (video understanding and Wan video generation) use the + **Standard** DashScope endpoints, not the Coding Plan endpoints: + + - Global/Intl Standard base URL: `https://dashscope-intl.aliyuncs.com/compatible-mode/v1` + - China Standard base URL: `https://dashscope.aliyuncs.com/compatible-mode/v1` + + + + + If the Gateway runs as a daemon (launchd/systemd), make sure `QWEN_API_KEY` is + available to that process (for example, in `~/.openclaw/.env` or via + `env.shellEnv`). + + + +## Related + + + + Choosing providers, model refs, and failover behavior. + + + Shared video tool parameters and provider selection. + + + Legacy ModelStudio provider and migration notes. + + + General troubleshooting and FAQ. + +