diff --git a/docs/channels/index.md b/docs/channels/index.md index 0fb5d6fa321..76265dd6c85 100644 --- a/docs/channels/index.md +++ b/docs/channels/index.md @@ -21,32 +21,32 @@ Text is supported everywhere; media and reactions vary by channel. ## Supported channels -- [BlueBubbles](/channels/bluebubbles) — **Recommended for iMessage**; uses the BlueBubbles macOS server REST API with full feature support (bundled plugin; edit, unsend, effects, reactions, group management — edit currently broken on macOS 26 Tahoe). -- [Discord](/channels/discord) — Discord Bot API + Gateway; supports servers, channels, and DMs. -- [Feishu](/channels/feishu) — Feishu/Lark bot via WebSocket (bundled plugin). -- [Google Chat](/channels/googlechat) — Google Chat API app via HTTP webhook (downloadable plugin). -- [iMessage (legacy)](/channels/imessage) — Legacy macOS integration via imsg CLI (deprecated, use BlueBubbles for new setups). -- [IRC](/channels/irc) — Classic IRC servers; channels + DMs with pairing/allowlist controls. -- [LINE](/channels/line) — LINE Messaging API bot (downloadable plugin). -- [Matrix](/channels/matrix) — Matrix protocol (downloadable plugin). -- [Mattermost](/channels/mattermost) — Bot API + WebSocket; channels, groups, DMs (downloadable plugin). -- [Microsoft Teams](/channels/msteams) — Bot Framework; enterprise support (bundled plugin). -- [Nextcloud Talk](/channels/nextcloud-talk) — Self-hosted chat via Nextcloud Talk (bundled plugin). -- [Nostr](/channels/nostr) — Decentralized DMs via NIP-04 (bundled plugin). -- [QQ Bot](/channels/qqbot) — QQ Bot API; private chat, group chat, and rich media (bundled plugin). -- [Signal](/channels/signal) — signal-cli; privacy-focused. -- [Slack](/channels/slack) — Bolt SDK; workspace apps. -- [Synology Chat](/channels/synology-chat) — Synology NAS Chat via outgoing+incoming webhooks (bundled plugin). -- [Telegram](/channels/telegram) — Bot API via grammY; supports groups. 
-- [Tlon](/channels/tlon) — Urbit-based messenger (bundled plugin). -- [Twitch](/channels/twitch) — Twitch chat via IRC connection (bundled plugin). -- [Voice Call](/plugins/voice-call) — Telephony via Plivo or Twilio (plugin, installed separately). -- [WebChat](/web/webchat) — Gateway WebChat UI over WebSocket. -- [WeChat](/channels/wechat) — Tencent iLink Bot plugin via QR login; private chats only (external plugin). -- [WhatsApp](/channels/whatsapp) — Most popular; uses Baileys and requires QR pairing. -- [Yuanbao](/channels/yuanbao) — Tencent Yuanbao bot (external plugin). -- [Zalo](/channels/zalo) — Zalo Bot API; Vietnam's popular messenger (bundled plugin). -- [Zalo Personal](/channels/zalouser) — Zalo personal account via QR login (bundled plugin). +- [BlueBubbles](/channels/bluebubbles) - **Recommended for iMessage**; uses the BlueBubbles macOS server REST API with full feature support (bundled plugin; edit, unsend, effects, reactions, group management - edit currently broken on macOS 26 Tahoe). +- [Discord](/channels/discord) - Discord Bot API + Gateway; supports servers, channels, and DMs. +- [Feishu](/channels/feishu) - Feishu/Lark bot via WebSocket (bundled plugin). +- [Google Chat](/channels/googlechat) - Google Chat API app via HTTP webhook (downloadable plugin). +- [iMessage (legacy)](/channels/imessage) - Legacy macOS integration via imsg CLI (deprecated, use BlueBubbles for new setups). +- [IRC](/channels/irc) - Classic IRC servers; channels + DMs with pairing/allowlist controls. +- [LINE](/channels/line) - LINE Messaging API bot (downloadable plugin). +- [Matrix](/channels/matrix) - Matrix protocol (downloadable plugin). +- [Mattermost](/channels/mattermost) - Bot API + WebSocket; channels, groups, DMs (downloadable plugin). +- [Microsoft Teams](/channels/msteams) - Bot Framework; enterprise support (bundled plugin). +- [Nextcloud Talk](/channels/nextcloud-talk) - Self-hosted chat via Nextcloud Talk (bundled plugin). 
+- [Nostr](/channels/nostr) - Decentralized DMs via NIP-04 (bundled plugin). +- [QQ Bot](/channels/qqbot) - QQ Bot API; private chat, group chat, and rich media (bundled plugin). +- [Signal](/channels/signal) - signal-cli; privacy-focused. +- [Slack](/channels/slack) - Bolt SDK; workspace apps. +- [Synology Chat](/channels/synology-chat) - Synology NAS Chat via outgoing+incoming webhooks (bundled plugin). +- [Telegram](/channels/telegram) - Bot API via grammY; supports groups. +- [Tlon](/channels/tlon) - Urbit-based messenger (bundled plugin). +- [Twitch](/channels/twitch) - Twitch chat via IRC connection (bundled plugin). +- [Voice Call](/plugins/voice-call) - Telephony via Plivo or Twilio (plugin, installed separately). +- [WebChat](/web/webchat) - Gateway WebChat UI over WebSocket. +- [WeChat](/channels/wechat) - Tencent iLink Bot plugin via QR login; private chats only (external plugin). +- [WhatsApp](/channels/whatsapp) - Most popular; uses Baileys and requires QR pairing. +- [Yuanbao](/channels/yuanbao) - Tencent Yuanbao bot (external plugin). +- [Zalo](/channels/zalo) - Zalo Bot API; Vietnam's popular messenger (bundled plugin). +- [Zalo Personal](/channels/zalouser) - Zalo personal account via QR login (bundled plugin). ## Notes diff --git a/docs/help/testing-live.md b/docs/help/testing-live.md index ffaf18902cc..2314854a362 100644 --- a/docs/help/testing-live.md +++ b/docs/help/testing-live.md @@ -63,8 +63,8 @@ loopback/private fallbacks are rejected by design. Live tests are split into two layers so we can isolate failures: -- “Direct model” tells us the provider/model can answer at all with the given key. -- “Gateway smoke” tells us the full gateway+agent pipeline works for that model (sessions, history, tools, sandbox policy, etc.). +- "Direct model" tells us the provider/model can answer at all with the given key. +- "Gateway smoke" tells us the full gateway+agent pipeline works for that model (sessions, history, tools, sandbox policy, etc.). 
### Layer 1: Direct model completion (no gateway) @@ -89,7 +89,7 @@ Live tests are split into two layers so we can isolate failures: - By default: profile store and env fallbacks - Set `OPENCLAW_LIVE_REQUIRE_PROFILE_KEYS=1` to enforce **profile store** only - Why this exists: - - Separates “provider API is broken / key is invalid” from “gateway agent pipeline is broken” + - Separates "provider API is broken / key is invalid" from "gateway agent pipeline is broken" - Contains small, isolated regressions (example: OpenAI Responses/Codex Responses reasoning replay + tool-call flows) ### Layer 2: Gateway + dev agent smoke (what "@openclaw" actually does) @@ -99,7 +99,7 @@ Live tests are split into two layers so we can isolate failures: - Spin up an in-process gateway - Create/patch a `agent:dev:*` session (model override per run) - Iterate models-with-keys and assert: - - “meaningful” response (no tools) + - "meaningful" response (no tools) - a real tool invocation works (read probe) - optional extra tool probes (exec+read probe) - OpenAI regression paths (tool-call-only → follow-up) keep working @@ -115,13 +115,13 @@ Live tests are split into two layers so we can isolate failures: - `OPENCLAW_LIVE_GATEWAY_MODELS=all` is an alias for the modern allowlist - Or set `OPENCLAW_LIVE_GATEWAY_MODELS="provider/model"` (or comma list) to narrow - Modern/all gateway sweeps default to a curated high-signal cap; set `OPENCLAW_LIVE_GATEWAY_MAX_MODELS=0` for an exhaustive modern sweep or a positive number for a smaller cap. 
-- How to select providers (avoid “OpenRouter everything”): +- How to select providers (avoid "OpenRouter everything"): - `OPENCLAW_LIVE_GATEWAY_PROVIDERS="google,google-antigravity,google-gemini-cli,openai,anthropic,zai,minimax"` (comma allowlist) - Tool + image probes are always on in this live test: - `read` probe + `exec+read` probe (tool stress) - image probe runs when the model advertises image input support - Flow (high level): - - Test generates a tiny PNG with “CAT” + random code (`src/gateway/live-image-probe.ts`) + - Test generates a tiny PNG with "CAT" + random code (`src/gateway/live-image-probe.ts`) - Sends it via `agent` `attachments: [{ mimeType: "image/png", content: "" }]` - Gateway parses attachments into `images[]` (`src/gateway/server-methods/agent.ts` + `src/gateway/chat-attachments.ts`) - Embedded agent forwards a multimodal user message to the model @@ -367,16 +367,16 @@ Notes: - `google-antigravity/...` uses the Antigravity OAuth bridge (Cloud Code Assist-style agent endpoint). - `google-gemini-cli/...` uses the local Gemini CLI on your machine (separate auth + tooling quirks). - Gemini API vs Gemini CLI: - - API: OpenClaw calls Google’s hosted Gemini API over HTTP (API key / profile auth); this is what most users mean by “Gemini”. + - API: OpenClaw calls Google's hosted Gemini API over HTTP (API key / profile auth); this is what most users mean by "Gemini". - CLI: OpenClaw shells out to a local `gemini` binary; it has its own auth and can behave differently (streaming/tool support/version skew). ## Live: model matrix (what we cover) -There is no fixed “CI model list” (live is opt-in), but these are the **recommended** models to cover regularly on a dev machine with keys. +There is no fixed "CI model list" (live is opt-in), but these are the **recommended** models to cover regularly on a dev machine with keys. 
### Modern smoke set (tool calling + image) -This is the “common models” run we expect to keep working: +This is the "common models" run we expect to keep working: - OpenAI (non-Codex): `openai/gpt-5.5` - OpenAI Codex OAuth: `openai-codex/gpt-5.5` @@ -404,7 +404,7 @@ Pick at least one per provider family: Optional additional coverage (nice to have): - xAI: `xai/grok-4.3` (or latest available) -- Mistral: `mistral/`… (pick one “tools” capable model you have enabled) +- Mistral: `mistral/`… (pick one "tools" capable model you have enabled) - Cerebras: `cerebras/`… (if you have access) - LM Studio: `lmstudio/`… (local; tool calling depends on API mode) @@ -433,9 +433,9 @@ Do not hardcode "all models" in docs. The authoritative list is whatever `discov Live tests discover credentials the same way the CLI does. Practical implications: - If the CLI works, live tests should find the same keys. -- If a live test says “no creds”, debug the same way you’d debug `openclaw models list` / model selection. +- If a live test says "no creds", debug the same way you'd debug `openclaw models list` / model selection. -- Per-agent auth profiles: `~/.openclaw/agents//agent/auth-profiles.json` (this is what “profile keys” means in the live tests) +- Per-agent auth profiles: `~/.openclaw/agents//agent/auth-profiles.json` (this is what "profile keys" means in the live tests) - Config: `~/.openclaw/openclaw.json` (or `OPENCLAW_CONFIG_PATH`) - Legacy state dir: `~/.openclaw/credentials/` (copied into the staged live home when present, but not the main profile-key store) - Live local runs copy the active config, per-agent `auth-profiles.json` files, legacy `credentials/`, and supported external CLI auth dirs into a temp test home by default; staged live homes skip `workspace/` and `sandboxes/`, and `agents.*.workspace` / `agentDir` path overrides are stripped so probes stay off your real host workspace. @@ -584,4 +584,4 @@ request. 
Plugin dependencies are expected to be present before runtime load. ## Related -- [Testing](/help/testing) — unit, integration, QA, and Docker suites +- [Testing](/help/testing) - unit, integration, QA, and Docker suites diff --git a/docs/reference/AGENTS.default.md b/docs/reference/AGENTS.default.md index 8363a66c9a5..7dfe9e8cd83 100644 --- a/docs/reference/AGENTS.default.md +++ b/docs/reference/AGENTS.default.md @@ -6,13 +6,11 @@ read_when: - Enabling or auditing default skills --- -# AGENTS.md - OpenClaw Personal Assistant (default) - ## First run (recommended) OpenClaw uses a dedicated workspace directory for the agent. Default: `~/.openclaw/workspace` (configurable via `agents.defaults.workspace`). -1. Create the workspace (if it doesn’t already exist): +1. Create the workspace (if it doesn't already exist): ```bash mkdir -p ~/.openclaw/workspace @@ -42,9 +40,9 @@ cp docs/reference/AGENTS.default.md ~/.openclaw/workspace/AGENTS.md ## Safety defaults -- Don’t dump directories or secrets into chat. -- Don’t run destructive commands unless explicitly asked. -- Don’t send partial/streaming replies to external messaging surfaces (only final replies). +- Don't dump directories or secrets into chat. +- Don't run destructive commands unless explicitly asked. +- Don't send partial/streaming replies to external messaging surfaces (only final replies). ## Session start (required) @@ -60,8 +58,8 @@ cp docs/reference/AGENTS.default.md ~/.openclaw/workspace/AGENTS.md ## Shared spaces (recommended) -- You’re not the user’s voice; be careful in group chats or public channels. -- Don’t share private data, contact info, or internal notes. +- You're not the user's voice; be careful in group chats or public channels. +- Don't share private data, contact info, or internal notes. 
## Memory system (recommended) @@ -74,12 +72,12 @@ cp docs/reference/AGENTS.default.md ~/.openclaw/workspace/AGENTS.md ## Tools and skills -- Tools live in skills; follow each skill’s `SKILL.md` when you need it. +- Tools live in skills; follow each skill's `SKILL.md` when you need it. - Keep environment-specific notes in `TOOLS.md` (Notes for Skills). ## Backup tip (recommended) -If you treat this workspace as Clawd’s “memory”, make it a git repo (ideally private) so `AGENTS.md` and your memory files are backed up. +If you treat this workspace as Clawd's "memory", make it a git repo (ideally private) so `AGENTS.md` and your memory files are backed up. ```bash cd ~/.openclaw/workspace @@ -97,30 +95,30 @@ git commit -m "Add Clawd workspace" ## Core skills (enable in Settings → Skills) -- **mcporter** — Tool server runtime/CLI for managing external skill backends. -- **Peekaboo** — Fast macOS screenshots with optional AI vision analysis. -- **camsnap** — Capture frames, clips, or motion alerts from RTSP/ONVIF security cams. -- **oracle** — OpenAI-ready agent CLI with session replay and browser control. -- **eightctl** — Control your sleep, from the terminal. -- **imsg** — Send, read, stream iMessage & SMS. -- **wacli** — WhatsApp CLI: sync, search, send. -- **discord** — Discord actions: react, stickers, polls. Use `user:` or `channel:` targets (bare numeric ids are ambiguous). -- **gog** — Google Suite CLI: Gmail, Calendar, Drive, Contacts. -- **spotify-player** — Terminal Spotify client to search/queue/control playback. -- **sag** — ElevenLabs speech with mac-style say UX; streams to speakers by default. -- **Sonos CLI** — Control Sonos speakers (discover/status/playback/volume/grouping) from scripts. -- **blucli** — Play, group, and automate BluOS players from scripts. -- **OpenHue CLI** — Philips Hue lighting control for scenes and automations. -- **OpenAI Whisper** — Local speech-to-text for quick dictation and voicemail transcripts. 
-- **Gemini CLI** — Google Gemini models from the terminal for fast Q&A. -- **agent-tools** — Utility toolkit for automations and helper scripts. +- **mcporter** - Tool server runtime/CLI for managing external skill backends. +- **Peekaboo** - Fast macOS screenshots with optional AI vision analysis. +- **camsnap** - Capture frames, clips, or motion alerts from RTSP/ONVIF security cams. +- **oracle** - OpenAI-ready agent CLI with session replay and browser control. +- **eightctl** - Control your sleep, from the terminal. +- **imsg** - Send, read, stream iMessage & SMS. +- **wacli** - WhatsApp CLI: sync, search, send. +- **discord** - Discord actions: react, stickers, polls. Use `user:` or `channel:` targets (bare numeric ids are ambiguous). +- **gog** - Google Suite CLI: Gmail, Calendar, Drive, Contacts. +- **spotify-player** - Terminal Spotify client to search/queue/control playback. +- **sag** - ElevenLabs speech with mac-style say UX; streams to speakers by default. +- **Sonos CLI** - Control Sonos speakers (discover/status/playback/volume/grouping) from scripts. +- **blucli** - Play, group, and automate BluOS players from scripts. +- **OpenHue CLI** - Philips Hue lighting control for scenes and automations. +- **OpenAI Whisper** - Local speech-to-text for quick dictation and voicemail transcripts. +- **Gemini CLI** - Google Gemini models from the terminal for fast Q&A. +- **agent-tools** - Utility toolkit for automations and helper scripts. ## Usage notes - Prefer the `openclaw` CLI for scripting; mac app handles permissions. - Run installs from the Skills tab; it hides the button if a binary is already present. - Keep heartbeats enabled so the assistant can schedule reminders, monitor inboxes, and trigger camera captures. -- Canvas UI runs full-screen with native overlays. Avoid placing critical controls in the top-left/top-right/bottom edges; add explicit gutters in the layout and don’t rely on safe-area insets. 
+- Canvas UI runs full-screen with native overlays. Avoid placing critical controls in the top-left/top-right/bottom edges; add explicit gutters in the layout and don't rely on safe-area insets. - For browser-driven verification, use `openclaw browser` (tabs/status/screenshot) with the OpenClaw-managed Chrome profile. - For DOM inspection, use `openclaw browser eval|query|dom|snapshot` (and `--json`/`--out` when you need machine output). - For interactions, use `openclaw browser click|type|hover|drag|select|upload|press|wait|navigate|back|evaluate|run` (click/type require snapshot refs; use `evaluate` for CSS selectors). diff --git a/docs/tools/image-generation.md b/docs/tools/image-generation.md index 74861ec6814..85c99e162d1 100644 --- a/docs/tools/image-generation.md +++ b/docs/tools/image-generation.md @@ -52,7 +52,7 @@ or sign in with OpenAI Codex OAuth. _"Generate an image of a friendly robot mascot."_ The agent calls `image_generate` automatically. No tool allow-listing - needed — it is enabled by default when a provider is available. + needed - it is enabled by default when a provider is available. 
@@ -110,10 +110,10 @@ Use `action: "list"` to inspect available providers and models at runtime: | Capability | ComfyUI | DeepInfra | fal | Google | MiniMax | OpenAI | Vydra | xAI | | --------------------- | ------------------ | --------- | ----------------- | -------------- | --------------------- | -------------- | ----- | -------------- | | Generate (max count) | Workflow-defined | 4 | 4 | 4 | 9 | 4 | 1 | 4 | -| Edit / reference | 1 image (workflow) | 1 image | 1 image | Up to 5 images | 1 image (subject ref) | Up to 5 images | — | Up to 5 images | -| Size control | — | ✓ | ✓ | ✓ | — | Up to 4K | — | — | -| Aspect ratio | — | — | ✓ (generate only) | ✓ | ✓ | — | — | ✓ | -| Resolution (1K/2K/4K) | — | — | ✓ | ✓ | — | — | — | 1K, 2K | +| Edit / reference | 1 image (workflow) | 1 image | 1 image | Up to 5 images | 1 image (subject ref) | Up to 5 images | - | Up to 5 images | +| Size control | - | ✓ | ✓ | ✓ | - | Up to 4K | - | - | +| Aspect ratio | - | - | ✓ (generate only) | ✓ | ✓ | - | - | ✓ | +| Resolution (1K/2K/4K) | - | - | ✓ | ✓ | - | - | - | 1K, 2K | ## Tool parameters @@ -150,7 +150,7 @@ Use `action: "list"` to inspect available providers and models at runtime: Background hint when the provider supports it. Use `transparent` with `outputFormat: "png"` or `"webp"` for transparency-capable providers. -Number of images to generate (1–4). +Number of images to generate (1-4). Optional provider request timeout in milliseconds. Output filename hint. @@ -196,7 +196,7 @@ OpenClaw tries providers in this order: 1. **`model` parameter** from the tool call (if the agent specifies one). 2. **`imageGenerationModel.primary`** from config. 3. **`imageGenerationModel.fallbacks`** in order. -4. **Auto-detection** — auth-backed provider defaults only: +4. **Auto-detection** - auth-backed provider defaults only: - current default provider first; - remaining registered image-generation providers in provider-id order. 
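The provider order in the hunk above (tool-call `model`, then `imageGenerationModel.primary`, then `imageGenerationModel.fallbacks`, then auth-backed auto-detection) can be sketched roughly as follows. This is an illustrative TypeScript sketch, not OpenClaw's implementation: `ProviderInfo`, `ResolveInput`, `resolveImageModelCandidates`, and the `defaultModel` / `hasAuth` fields are hypothetical names, and the de-duplication of repeated model refs is an assumption the docs do not state.

```typescript
// Hypothetical shapes for illustration only; not OpenClaw's real types.
interface ProviderInfo {
  id: string;           // provider id used for ordering
  defaultModel: string; // the provider's default image model ref
  hasAuth: boolean;     // true when credentials are available
}

interface ResolveInput {
  toolModel?: string;         // `model` parameter from the tool call
  primary?: string;           // imageGenerationModel.primary
  fallbacks?: string[];       // imageGenerationModel.fallbacks
  defaultProviderId?: string; // current default provider
  providers: ProviderInfo[];  // registered image-generation providers
}

// Returns candidate model refs in the order they would be tried.
function resolveImageModelCandidates(input: ResolveInput): string[] {
  const ordered: string[] = [];
  // Assumption: a ref that appears twice is only tried once.
  const push = (ref?: string): void => {
    if (ref && ordered.indexOf(ref) === -1) ordered.push(ref);
  };

  // Steps 1-3: explicit tool parameter, then primary, then fallbacks in order.
  push(input.toolModel);
  push(input.primary);
  for (const ref of input.fallbacks ?? []) push(ref);

  // Step 4: auto-detection over auth-backed providers only.
  const authed = input.providers.filter((p) => p.hasAuth);
  const current = authed.filter((p) => p.id === input.defaultProviderId);
  const rest = authed
    .filter((p) => p.id !== input.defaultProviderId)
    .sort((a, b) => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0)); // provider-id order

  for (const p of current) push(p.defaultModel); // current default provider first
  for (const p of rest) push(p.defaultModel);    // then the remaining providers

  return ordered;
}
```

Returning the full ordered list, rather than a single winner, keeps the sketch close to the "tries providers in this order" wording: a caller can walk the list until one request succeeds.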
@@ -248,7 +248,7 @@ OpenAI, OpenRouter, Google, and xAI support up to 5 reference images via the image request through the Codex Responses backend. Legacy Codex base URLs such as `https://chatgpt.com/backend-api` are canonicalized to `https://chatgpt.com/backend-api/codex` for image requests. OpenClaw - does **not** silently fall back to `OPENAI_API_KEY` for that request — + does **not** silently fall back to `OPENAI_API_KEY` for that request - to force direct OpenAI Images API routing, configure `models.providers.openai` explicitly with an API key, custom base URL, or Azure endpoint. @@ -398,13 +398,13 @@ as ignored for them. ## Related -- [Tools overview](/tools) — all available agent tools -- [ComfyUI](/providers/comfy) — local ComfyUI and Comfy Cloud workflow setup -- [fal](/providers/fal) — fal image and video provider setup -- [Google (Gemini)](/providers/google) — Gemini image provider setup -- [MiniMax](/providers/minimax) — MiniMax image provider setup -- [OpenAI](/providers/openai) — OpenAI Images provider setup -- [Vydra](/providers/vydra) — Vydra image, video, and speech setup -- [xAI](/providers/xai) — Grok image, video, search, code execution, and TTS setup -- [Configuration reference](/gateway/config-agents#agent-defaults) — `imageGenerationModel` config -- [Models](/concepts/models) — model configuration and failover +- [Tools overview](/tools) - all available agent tools +- [ComfyUI](/providers/comfy) - local ComfyUI and Comfy Cloud workflow setup +- [fal](/providers/fal) - fal image and video provider setup +- [Google (Gemini)](/providers/google) - Gemini image provider setup +- [MiniMax](/providers/minimax) - MiniMax image provider setup +- [OpenAI](/providers/openai) - OpenAI Images provider setup +- [Vydra](/providers/vydra) - Vydra image, video, and speech setup +- [xAI](/providers/xai) - Grok image, video, search, code execution, and TTS setup +- [Configuration reference](/gateway/config-agents#agent-defaults) - `imageGenerationModel` config 
+- [Models](/concepts/models) - model configuration and failover diff --git a/docs/tools/video-generation.md b/docs/tools/video-generation.md index a4a73d566f8..47b1c64bc5e 100644 --- a/docs/tools/video-generation.md +++ b/docs/tools/video-generation.md @@ -22,9 +22,9 @@ provider API key or configure `agents.defaults.videoGenerationModel`. OpenClaw treats video generation as three runtime modes: -- `generate` — text-to-video requests with no reference media. -- `imageToVideo` — request includes one or more reference images. -- `videoToVideo` — request includes one or more reference videos. +- `generate` - text-to-video requests with no reference media. +- `imageToVideo` - request includes one or more reference images. +- `videoToVideo` - request includes one or more reference videos. Providers can support any subset of those modes. The tool validates the active mode before submission and reports supported modes in `action=list`. @@ -109,20 +109,20 @@ generation. | Provider | Default model | Text | Image ref | Video ref | Auth | | --------------------- | ------------------------------- | :--: | ---------------------------------------------------- | ----------------------------------------------- | ---------------------------------------- | | Alibaba | `wan2.6-t2v` | ✓ | Yes (remote URL) | Yes (remote URL) | `MODELSTUDIO_API_KEY` | -| BytePlus (1.0) | `seedance-1-0-pro-250528` | ✓ | Up to 2 images (I2V models only; first + last frame) | — | `BYTEPLUS_API_KEY` | -| BytePlus Seedance 1.5 | `seedance-1-5-pro-251215` | ✓ | Up to 2 images (first + last frame via role) | — | `BYTEPLUS_API_KEY` | +| BytePlus (1.0) | `seedance-1-0-pro-250528` | ✓ | Up to 2 images (I2V models only; first + last frame) | - | `BYTEPLUS_API_KEY` | +| BytePlus Seedance 1.5 | `seedance-1-5-pro-251215` | ✓ | Up to 2 images (first + last frame via role) | - | `BYTEPLUS_API_KEY` | | BytePlus Seedance 2.0 | `dreamina-seedance-2-0-260128` | ✓ | Up to 9 reference images | Up to 3 videos | 
`BYTEPLUS_API_KEY` | -| ComfyUI | `workflow` | ✓ | 1 image | — | `COMFY_API_KEY` or `COMFY_CLOUD_API_KEY` | -| DeepInfra | `Pixverse/Pixverse-T2V` | ✓ | — | — | `DEEPINFRA_API_KEY` | +| ComfyUI | `workflow` | ✓ | 1 image | - | `COMFY_API_KEY` or `COMFY_CLOUD_API_KEY` | +| DeepInfra | `Pixverse/Pixverse-T2V` | ✓ | - | - | `DEEPINFRA_API_KEY` | | fal | `fal-ai/minimax/video-01-live` | ✓ | 1 image; up to 9 with Seedance reference-to-video | Up to 3 videos with Seedance reference-to-video | `FAL_KEY` | | Google | `veo-3.1-fast-generate-preview` | ✓ | 1 image | 1 video | `GEMINI_API_KEY` | -| MiniMax | `MiniMax-Hailuo-2.3` | ✓ | 1 image | — | `MINIMAX_API_KEY` or MiniMax OAuth | +| MiniMax | `MiniMax-Hailuo-2.3` | ✓ | 1 image | - | `MINIMAX_API_KEY` or MiniMax OAuth | | OpenAI | `sora-2` | ✓ | 1 image | 1 video | `OPENAI_API_KEY` | -| OpenRouter | `google/veo-3.1-fast` | ✓ | Up to 4 images (first/last frame or references) | — | `OPENROUTER_API_KEY` | +| OpenRouter | `google/veo-3.1-fast` | ✓ | Up to 4 images (first/last frame or references) | - | `OPENROUTER_API_KEY` | | Qwen | `wan2.6-t2v` | ✓ | Yes (remote URL) | Yes (remote URL) | `QWEN_API_KEY` | | Runway | `gen4.5` | ✓ | 1 image | 1 video | `RUNWAYML_API_SECRET` | -| Together | `Wan-AI/Wan2.2-T2V-A14B` | ✓ | 1 image | — | `TOGETHER_API_KEY` | -| Vydra | `veo3` | ✓ | 1 image (`kling`) | — | `VYDRA_API_KEY` | +| Together | `Wan-AI/Wan2.2-T2V-A14B` | ✓ | 1 image | - | `TOGETHER_API_KEY` | +| Vydra | `veo3` | ✓ | 1 image (`kling`) | - | `VYDRA_API_KEY` | | xAI | `grok-imagine-video` | ✓ | 1 first-frame image or up to 7 `reference_image`s | 1 video | `XAI_API_KEY` | Some providers accept additional or alternate API key env vars. 
See @@ -139,18 +139,18 @@ the shared live sweep: | Provider | `generate` | `imageToVideo` | `videoToVideo` | Shared live lanes today | | ---------- | :--------: | :------------: | :------------: | ---------------------------------------------------------------------------------------------------------------------------------------- | | Alibaba | ✓ | ✓ | ✓ | `generate`, `imageToVideo`; `videoToVideo` skipped because this provider needs remote `http(s)` video URLs | -| BytePlus | ✓ | ✓ | — | `generate`, `imageToVideo` | -| ComfyUI | ✓ | ✓ | — | Not in the shared sweep; workflow-specific coverage lives with Comfy tests | -| DeepInfra | ✓ | — | — | `generate`; native DeepInfra video schemas are text-to-video in the bundled contract | +| BytePlus | ✓ | ✓ | - | `generate`, `imageToVideo` | +| ComfyUI | ✓ | ✓ | - | Not in the shared sweep; workflow-specific coverage lives with Comfy tests | +| DeepInfra | ✓ | - | - | `generate`; native DeepInfra video schemas are text-to-video in the bundled contract | | fal | ✓ | ✓ | ✓ | `generate`, `imageToVideo`; `videoToVideo` only when using Seedance reference-to-video | | Google | ✓ | ✓ | ✓ | `generate`, `imageToVideo`; shared `videoToVideo` skipped because the current buffer-backed Gemini/Veo sweep does not accept that input | -| MiniMax | ✓ | ✓ | — | `generate`, `imageToVideo` | +| MiniMax | ✓ | ✓ | - | `generate`, `imageToVideo` | | OpenAI | ✓ | ✓ | ✓ | `generate`, `imageToVideo`; shared `videoToVideo` skipped because this org/input path currently needs provider-side inpaint/remix access | -| OpenRouter | ✓ | ✓ | — | `generate`, `imageToVideo` | +| OpenRouter | ✓ | ✓ | - | `generate`, `imageToVideo` | | Qwen | ✓ | ✓ | ✓ | `generate`, `imageToVideo`; `videoToVideo` skipped because this provider needs remote `http(s)` video URLs | | Runway | ✓ | ✓ | ✓ | `generate`, `imageToVideo`; `videoToVideo` runs only when the selected model is `runway/gen4_aleph` | -| Together | ✓ | ✓ | — | `generate`, `imageToVideo` | -| Vydra | ✓ | ✓ | — | 
`generate`; shared `imageToVideo` skipped because bundled `veo3` is text-only and bundled `kling` requires a remote image URL | +| Together | ✓ | ✓ | - | `generate`, `imageToVideo` | +| Vydra | ✓ | ✓ | - | `generate`; shared `imageToVideo` skipped because bundled `veo3` is text-only and bundled `kling` requires a remote image URL | | xAI | ✓ | ✓ | ✓ | `generate`, `imageToVideo`; `videoToVideo` skipped because this provider currently needs a remote MP4 URL | ## Tool parameters @@ -290,10 +290,10 @@ aggregated error includes the skip reason for each. OpenClaw resolves the model in this order: -1. **`model` tool parameter** — if the agent specifies one in the call. +1. **`model` tool parameter** - if the agent specifies one in the call. 2. **`videoGenerationModel.primary`** from config. 3. **`videoGenerationModel.fallbacks`** in order. -4. **Auto-detection** — providers that have valid auth, starting with the +4. **Auto-detection** - providers that have valid auth, starting with the current default provider, then remaining providers in alphabetical order. @@ -336,7 +336,7 @@ only the explicit `model`, `primary`, and `fallbacks` entries. T2V model IDs are automatically switched to the corresponding I2V variant when an image is provided. - Supported `providerOptions` keys: `seed` (number), `draft` (boolean — + Supported `providerOptions` keys: `seed` (number), `draft` (boolean - forces 480p), `camera_fixed` (boolean). @@ -363,7 +363,7 @@ only the explicit `model`, `primary`, and `fallbacks` entries. Uses the unified `content[]` API. Supports up to 9 reference images, 3 reference videos, and 3 reference audios. All inputs must be remote - `https://` URLs. Set `role` on each asset — supported values: + `https://` URLs. Set `role` on each asset - supported values: `"first_frame"`, `"last_frame"`, `"reference_image"`, `"reference_video"`, `"reference_audio"`. 
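The Seedance 2.0 constraints quoted in the hunk above (up to 9 reference images, 3 reference videos, 3 reference audios, remote `https://` URLs only, and a `role` on each asset) lend themselves to a small pre-flight check. The sketch below is a hedged illustration in TypeScript: `ReferenceAsset`, `ROLE_CAPS`, and `validateSeedanceAssets` are hypothetical names, the `{ role, url }` shape is an assumption about how a `content[]` entry might look, and leaving `first_frame` / `last_frame` uncapped is a simplification, since the docs do not say how frame roles count against the image limit.

```typescript
// Role values listed in the docs text.
type AssetRole =
  | "first_frame"
  | "last_frame"
  | "reference_image"
  | "reference_video"
  | "reference_audio";

// Hypothetical asset shape for illustration; real `content[]` entries may differ.
interface ReferenceAsset {
  role: AssetRole;
  url: string;
}

// Per-role caps quoted in the docs; frame roles are left uncapped (assumption).
const ROLE_CAPS: Array<[AssetRole, number]> = [
  ["reference_image", 9],
  ["reference_video", 3],
  ["reference_audio", 3],
];

// Returns a list of human-readable problems; an empty list means the input passed.
function validateSeedanceAssets(assets: ReferenceAsset[]): string[] {
  const errors: string[] = [];
  const counts = new Map<AssetRole, number>();

  assets.forEach((asset, index) => {
    // All inputs must be remote https:// URLs.
    if (asset.url.indexOf("https://") !== 0) {
      errors.push(`asset ${index}: URL must be a remote https:// URL`);
    }
    counts.set(asset.role, (counts.get(asset.role) ?? 0) + 1);
  });

  for (const [role, cap] of ROLE_CAPS) {
    const used = counts.get(role) ?? 0;
    if (used > cap) {
      errors.push(`${role}: ${used} assets exceeds the cap of ${cap}`);
    }
  }
  return errors;
}
```

Collecting every problem instead of throwing on the first one mirrors the way the docs describe aggregated errors elsewhere, but that choice is this sketch's own and not something the source specifies.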
@@ -536,7 +536,7 @@ openclaw config set agents.defaults.videoGenerationModel.primary "qwen/wan2.6-t2 ## Related - [Alibaba Model Studio](/providers/alibaba) -- [Background tasks](/automation/tasks) — task tracking for async video generation +- [Background tasks](/automation/tasks) - task tracking for async video generation - [BytePlus](/concepts/model-providers#byteplus-international) - [ComfyUI](/providers/comfy) - [Configuration reference](/gateway/config-agents#agent-defaults)