docs: typography hygiene + drop one in-body H1 across 5 pages

Replaced 138 typography characters (curly quotes, apostrophes, em/en
dashes, non-breaking hyphens) with ASCII equivalents per
docs/CLAUDE.md heading and content hygiene rules so grep, copy-paste,
and Mintlify search hit clean tokens.

- docs/reference/AGENTS.default.md: 29 chars, plus removed the
  duplicate '# AGENTS.md - OpenClaw Personal Assistant (default)' H1
  (Mintlify renders title from frontmatter; the in-body H1 with
  parens and a bare hyphen produced a brittle anchor).
- docs/help/testing-live.md: 29 chars
- docs/tools/image-generation.md: 28 chars
- docs/channels/index.md: 27 chars
- docs/tools/video-generation.md: 25 chars
Author: Vincent Koc
Date: 2026-05-05 19:24:42 -07:00
Parent: 74532265f4
Commit: b9f711089a
5 changed files with 106 additions and 108 deletions

docs/channels/index.md

@@ -21,32 +21,32 @@ Text is supported everywhere; media and reactions vary by channel.
## Supported channels
- [BlueBubbles](/channels/bluebubbles) — **Recommended for iMessage**; uses the BlueBubbles macOS server REST API with full feature support (bundled plugin; edit, unsend, effects, reactions, group management — edit currently broken on macOS 26 Tahoe).
- [Discord](/channels/discord) — Discord Bot API + Gateway; supports servers, channels, and DMs.
- [Feishu](/channels/feishu) — Feishu/Lark bot via WebSocket (bundled plugin).
- [Google Chat](/channels/googlechat) — Google Chat API app via HTTP webhook (downloadable plugin).
- [iMessage (legacy)](/channels/imessage) — Legacy macOS integration via imsg CLI (deprecated, use BlueBubbles for new setups).
- [IRC](/channels/irc) — Classic IRC servers; channels + DMs with pairing/allowlist controls.
- [LINE](/channels/line) — LINE Messaging API bot (downloadable plugin).
- [Matrix](/channels/matrix) — Matrix protocol (downloadable plugin).
- [Mattermost](/channels/mattermost) — Bot API + WebSocket; channels, groups, DMs (downloadable plugin).
- [Microsoft Teams](/channels/msteams) — Bot Framework; enterprise support (bundled plugin).
- [Nextcloud Talk](/channels/nextcloud-talk) — Self-hosted chat via Nextcloud Talk (bundled plugin).
- [Nostr](/channels/nostr) — Decentralized DMs via NIP-04 (bundled plugin).
- [QQ Bot](/channels/qqbot) — QQ Bot API; private chat, group chat, and rich media (bundled plugin).
- [Signal](/channels/signal) — signal-cli; privacy-focused.
- [Slack](/channels/slack) — Bolt SDK; workspace apps.
- [Synology Chat](/channels/synology-chat) — Synology NAS Chat via outgoing+incoming webhooks (bundled plugin).
- [Telegram](/channels/telegram) — Bot API via grammY; supports groups.
- [Tlon](/channels/tlon) — Urbit-based messenger (bundled plugin).
- [Twitch](/channels/twitch) — Twitch chat via IRC connection (bundled plugin).
- [Voice Call](/plugins/voice-call) — Telephony via Plivo or Twilio (plugin, installed separately).
- [WebChat](/web/webchat) — Gateway WebChat UI over WebSocket.
- [WeChat](/channels/wechat) — Tencent iLink Bot plugin via QR login; private chats only (external plugin).
- [WhatsApp](/channels/whatsapp) — Most popular; uses Baileys and requires QR pairing.
- [Yuanbao](/channels/yuanbao) — Tencent Yuanbao bot (external plugin).
- [Zalo](/channels/zalo) — Zalo Bot API; Vietnam's popular messenger (bundled plugin).
- [Zalo Personal](/channels/zalouser) — Zalo personal account via QR login (bundled plugin).
- [BlueBubbles](/channels/bluebubbles) - **Recommended for iMessage**; uses the BlueBubbles macOS server REST API with full feature support (bundled plugin; edit, unsend, effects, reactions, group management - edit currently broken on macOS 26 Tahoe).
- [Discord](/channels/discord) - Discord Bot API + Gateway; supports servers, channels, and DMs.
- [Feishu](/channels/feishu) - Feishu/Lark bot via WebSocket (bundled plugin).
- [Google Chat](/channels/googlechat) - Google Chat API app via HTTP webhook (downloadable plugin).
- [iMessage (legacy)](/channels/imessage) - Legacy macOS integration via imsg CLI (deprecated, use BlueBubbles for new setups).
- [IRC](/channels/irc) - Classic IRC servers; channels + DMs with pairing/allowlist controls.
- [LINE](/channels/line) - LINE Messaging API bot (downloadable plugin).
- [Matrix](/channels/matrix) - Matrix protocol (downloadable plugin).
- [Mattermost](/channels/mattermost) - Bot API + WebSocket; channels, groups, DMs (downloadable plugin).
- [Microsoft Teams](/channels/msteams) - Bot Framework; enterprise support (bundled plugin).
- [Nextcloud Talk](/channels/nextcloud-talk) - Self-hosted chat via Nextcloud Talk (bundled plugin).
- [Nostr](/channels/nostr) - Decentralized DMs via NIP-04 (bundled plugin).
- [QQ Bot](/channels/qqbot) - QQ Bot API; private chat, group chat, and rich media (bundled plugin).
- [Signal](/channels/signal) - signal-cli; privacy-focused.
- [Slack](/channels/slack) - Bolt SDK; workspace apps.
- [Synology Chat](/channels/synology-chat) - Synology NAS Chat via outgoing+incoming webhooks (bundled plugin).
- [Telegram](/channels/telegram) - Bot API via grammY; supports groups.
- [Tlon](/channels/tlon) - Urbit-based messenger (bundled plugin).
- [Twitch](/channels/twitch) - Twitch chat via IRC connection (bundled plugin).
- [Voice Call](/plugins/voice-call) - Telephony via Plivo or Twilio (plugin, installed separately).
- [WebChat](/web/webchat) - Gateway WebChat UI over WebSocket.
- [WeChat](/channels/wechat) - Tencent iLink Bot plugin via QR login; private chats only (external plugin).
- [WhatsApp](/channels/whatsapp) - Most popular; uses Baileys and requires QR pairing.
- [Yuanbao](/channels/yuanbao) - Tencent Yuanbao bot (external plugin).
- [Zalo](/channels/zalo) - Zalo Bot API; Vietnam's popular messenger (bundled plugin).
- [Zalo Personal](/channels/zalouser) - Zalo personal account via QR login (bundled plugin).
## Notes

docs/help/testing-live.md

@@ -63,8 +63,8 @@ loopback/private fallbacks are rejected by design.
Live tests are split into two layers so we can isolate failures:
- “Direct model” tells us the provider/model can answer at all with the given key.
- “Gateway smoke” tells us the full gateway+agent pipeline works for that model (sessions, history, tools, sandbox policy, etc.).
- "Direct model" tells us the provider/model can answer at all with the given key.
- "Gateway smoke" tells us the full gateway+agent pipeline works for that model (sessions, history, tools, sandbox policy, etc.).
### Layer 1: Direct model completion (no gateway)
@@ -89,7 +89,7 @@ Live tests are split into two layers so we can isolate failures:
- By default: profile store and env fallbacks
- Set `OPENCLAW_LIVE_REQUIRE_PROFILE_KEYS=1` to enforce **profile store** only
- Why this exists:
- Separates “provider API is broken / key is invalid” from “gateway agent pipeline is broken”
- Separates "provider API is broken / key is invalid" from "gateway agent pipeline is broken"
- Contains small, isolated regressions (example: OpenAI Responses/Codex Responses reasoning replay + tool-call flows)
### Layer 2: Gateway + dev agent smoke (what "@openclaw" actually does)
@@ -99,7 +99,7 @@ Live tests are split into two layers so we can isolate failures:
- Spin up an in-process gateway
- Create/patch a `agent:dev:*` session (model override per run)
- Iterate models-with-keys and assert:
- “meaningful” response (no tools)
- "meaningful" response (no tools)
- a real tool invocation works (read probe)
- optional extra tool probes (exec+read probe)
- OpenAI regression paths (tool-call-only → follow-up) keep working
@@ -115,13 +115,13 @@ Live tests are split into two layers so we can isolate failures:
- `OPENCLAW_LIVE_GATEWAY_MODELS=all` is an alias for the modern allowlist
- Or set `OPENCLAW_LIVE_GATEWAY_MODELS="provider/model"` (or comma list) to narrow
- Modern/all gateway sweeps default to a curated high-signal cap; set `OPENCLAW_LIVE_GATEWAY_MAX_MODELS=0` for an exhaustive modern sweep or a positive number for a smaller cap.
- How to select providers (avoid “OpenRouter everything”):
- How to select providers (avoid "OpenRouter everything"):
- `OPENCLAW_LIVE_GATEWAY_PROVIDERS="google,google-antigravity,google-gemini-cli,openai,anthropic,zai,minimax"` (comma allowlist)
- Tool + image probes are always on in this live test:
- `read` probe + `exec+read` probe (tool stress)
- image probe runs when the model advertises image input support
- Flow (high level):
- Test generates a tiny PNG with “CAT” + random code (`src/gateway/live-image-probe.ts`)
- Test generates a tiny PNG with "CAT" + random code (`src/gateway/live-image-probe.ts`)
- Sends it via `agent` `attachments: [{ mimeType: "image/png", content: "<base64>" }]`
- Gateway parses attachments into `images[]` (`src/gateway/server-methods/agent.ts` + `src/gateway/chat-attachments.ts`)
- Embedded agent forwards a multimodal user message to the model
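The attachment leg of that flow can be sketched as follows. This is an illustrative stand-in, not the code in `src/gateway/live-image-probe.ts`; only the `attachments` payload shape (`mimeType` plus base64 `content`) comes from this doc, everything else is assumed:

```typescript
// Hypothetical sketch: wrap raw PNG bytes in the attachments payload
// shape the gateway expects ({ mimeType, content: <base64> }).
interface AgentAttachment {
  mimeType: string;
  content: string; // base64-encoded bytes
}

function toPngAttachment(bytes: Uint8Array): AgentAttachment {
  return {
    mimeType: "image/png",
    content: Buffer.from(bytes).toString("base64"),
  };
}

// The real probe generates a tiny PNG with "CAT" + a random code;
// here any bytes demonstrate the base64 round-trip.
const attachment = toPngAttachment(new Uint8Array([0x89, 0x50, 0x4e, 0x47]));
console.log(attachment.mimeType); // image/png
```

On the gateway side, `content` is decoded back into bytes when `attachments` is parsed into `images[]`.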
@@ -367,16 +367,16 @@ Notes:
- `google-antigravity/...` uses the Antigravity OAuth bridge (Cloud Code Assist-style agent endpoint).
- `google-gemini-cli/...` uses the local Gemini CLI on your machine (separate auth + tooling quirks).
- Gemini API vs Gemini CLI:
- API: OpenClaw calls Google’s hosted Gemini API over HTTP (API key / profile auth); this is what most users mean by “Gemini”.
- API: OpenClaw calls Google's hosted Gemini API over HTTP (API key / profile auth); this is what most users mean by "Gemini".
- CLI: OpenClaw shells out to a local `gemini` binary; it has its own auth and can behave differently (streaming/tool support/version skew).
## Live: model matrix (what we cover)
There is no fixed “CI model list” (live is opt-in), but these are the **recommended** models to cover regularly on a dev machine with keys.
There is no fixed "CI model list" (live is opt-in), but these are the **recommended** models to cover regularly on a dev machine with keys.
### Modern smoke set (tool calling + image)
This is the “common models” run we expect to keep working:
This is the "common models" run we expect to keep working:
- OpenAI (non-Codex): `openai/gpt-5.5`
- OpenAI Codex OAuth: `openai-codex/gpt-5.5`
@@ -404,7 +404,7 @@ Pick at least one per provider family:
Optional additional coverage (nice to have):
- xAI: `xai/grok-4.3` (or latest available)
- Mistral: `mistral/`… (pick one “tools” capable model you have enabled)
- Mistral: `mistral/`… (pick one "tools" capable model you have enabled)
- Cerebras: `cerebras/`… (if you have access)
- LM Studio: `lmstudio/`… (local; tool calling depends on API mode)
@@ -433,9 +433,9 @@ Do not hardcode "all models" in docs. The authoritative list is whatever `discov
Live tests discover credentials the same way the CLI does. Practical implications:
- If the CLI works, live tests should find the same keys.
- If a live test says “no creds”, debug the same way you’d debug `openclaw models list` / model selection.
- If a live test says "no creds", debug the same way you'd debug `openclaw models list` / model selection.
- Per-agent auth profiles: `~/.openclaw/agents/<agentId>/agent/auth-profiles.json` (this is what “profile keys” means in the live tests)
- Per-agent auth profiles: `~/.openclaw/agents/<agentId>/agent/auth-profiles.json` (this is what "profile keys" means in the live tests)
- Config: `~/.openclaw/openclaw.json` (or `OPENCLAW_CONFIG_PATH`)
- Legacy state dir: `~/.openclaw/credentials/` (copied into the staged live home when present, but not the main profile-key store)
- Live local runs copy the active config, per-agent `auth-profiles.json` files, legacy `credentials/`, and supported external CLI auth dirs into a temp test home by default; staged live homes skip `workspace/` and `sandboxes/`, and `agents.*.workspace` / `agentDir` path overrides are stripped so probes stay off your real host workspace.
@@ -584,4 +584,4 @@ request. Plugin dependencies are expected to be present before runtime load.
## Related
- [Testing](/help/testing) — unit, integration, QA, and Docker suites
- [Testing](/help/testing) - unit, integration, QA, and Docker suites

docs/reference/AGENTS.default.md

@@ -6,13 +6,11 @@ read_when:
- Enabling or auditing default skills
---
# AGENTS.md - OpenClaw Personal Assistant (default)
## First run (recommended)
OpenClaw uses a dedicated workspace directory for the agent. Default: `~/.openclaw/workspace` (configurable via `agents.defaults.workspace`).
1. Create the workspace (if it doesn’t already exist):
1. Create the workspace (if it doesn't already exist):
```bash
mkdir -p ~/.openclaw/workspace
@@ -42,9 +40,9 @@ cp docs/reference/AGENTS.default.md ~/.openclaw/workspace/AGENTS.md
## Safety defaults
- Don’t dump directories or secrets into chat.
- Don’t run destructive commands unless explicitly asked.
- Don’t send partial/streaming replies to external messaging surfaces (only final replies).
- Don't dump directories or secrets into chat.
- Don't run destructive commands unless explicitly asked.
- Don't send partial/streaming replies to external messaging surfaces (only final replies).
## Session start (required)
@@ -60,8 +58,8 @@ cp docs/reference/AGENTS.default.md ~/.openclaw/workspace/AGENTS.md
## Shared spaces (recommended)
- You’re not the user’s voice; be careful in group chats or public channels.
- Don’t share private data, contact info, or internal notes.
- You're not the user's voice; be careful in group chats or public channels.
- Don't share private data, contact info, or internal notes.
## Memory system (recommended)
@@ -74,12 +72,12 @@ cp docs/reference/AGENTS.default.md ~/.openclaw/workspace/AGENTS.md
## Tools and skills
- Tools live in skills; follow each skill’s `SKILL.md` when you need it.
- Tools live in skills; follow each skill's `SKILL.md` when you need it.
- Keep environment-specific notes in `TOOLS.md` (Notes for Skills).
## Backup tip (recommended)
If you treat this workspace as Clawd’s “memory”, make it a git repo (ideally private) so `AGENTS.md` and your memory files are backed up.
If you treat this workspace as Clawd's "memory", make it a git repo (ideally private) so `AGENTS.md` and your memory files are backed up.
```bash
cd ~/.openclaw/workspace
@@ -97,30 +95,30 @@ git commit -m "Add Clawd workspace"
## Core skills (enable in Settings → Skills)
- **mcporter** — Tool server runtime/CLI for managing external skill backends.
- **Peekaboo** — Fast macOS screenshots with optional AI vision analysis.
- **camsnap** — Capture frames, clips, or motion alerts from RTSP/ONVIF security cams.
- **oracle** — OpenAI-ready agent CLI with session replay and browser control.
- **eightctl** — Control your sleep, from the terminal.
- **imsg** — Send, read, stream iMessage & SMS.
- **wacli** — WhatsApp CLI: sync, search, send.
- **discord** — Discord actions: react, stickers, polls. Use `user:<id>` or `channel:<id>` targets (bare numeric ids are ambiguous).
- **gog** — Google Suite CLI: Gmail, Calendar, Drive, Contacts.
- **spotify-player** — Terminal Spotify client to search/queue/control playback.
- **sag** — ElevenLabs speech with mac-style say UX; streams to speakers by default.
- **Sonos CLI** — Control Sonos speakers (discover/status/playback/volume/grouping) from scripts.
- **blucli** — Play, group, and automate BluOS players from scripts.
- **OpenHue CLI** — Philips Hue lighting control for scenes and automations.
- **OpenAI Whisper** — Local speech-to-text for quick dictation and voicemail transcripts.
- **Gemini CLI** — Google Gemini models from the terminal for fast Q&A.
- **agent-tools** — Utility toolkit for automations and helper scripts.
- **mcporter** - Tool server runtime/CLI for managing external skill backends.
- **Peekaboo** - Fast macOS screenshots with optional AI vision analysis.
- **camsnap** - Capture frames, clips, or motion alerts from RTSP/ONVIF security cams.
- **oracle** - OpenAI-ready agent CLI with session replay and browser control.
- **eightctl** - Control your sleep, from the terminal.
- **imsg** - Send, read, stream iMessage & SMS.
- **wacli** - WhatsApp CLI: sync, search, send.
- **discord** - Discord actions: react, stickers, polls. Use `user:<id>` or `channel:<id>` targets (bare numeric ids are ambiguous).
- **gog** - Google Suite CLI: Gmail, Calendar, Drive, Contacts.
- **spotify-player** - Terminal Spotify client to search/queue/control playback.
- **sag** - ElevenLabs speech with mac-style say UX; streams to speakers by default.
- **Sonos CLI** - Control Sonos speakers (discover/status/playback/volume/grouping) from scripts.
- **blucli** - Play, group, and automate BluOS players from scripts.
- **OpenHue CLI** - Philips Hue lighting control for scenes and automations.
- **OpenAI Whisper** - Local speech-to-text for quick dictation and voicemail transcripts.
- **Gemini CLI** - Google Gemini models from the terminal for fast Q&A.
- **agent-tools** - Utility toolkit for automations and helper scripts.
## Usage notes
- Prefer the `openclaw` CLI for scripting; mac app handles permissions.
- Run installs from the Skills tab; it hides the button if a binary is already present.
- Keep heartbeats enabled so the assistant can schedule reminders, monitor inboxes, and trigger camera captures.
- Canvas UI runs full-screen with native overlays. Avoid placing critical controls in the top-left/top-right/bottom edges; add explicit gutters in the layout and don’t rely on safe-area insets.
- Canvas UI runs full-screen with native overlays. Avoid placing critical controls in the top-left/top-right/bottom edges; add explicit gutters in the layout and don't rely on safe-area insets.
- For browser-driven verification, use `openclaw browser` (tabs/status/screenshot) with the OpenClaw-managed Chrome profile.
- For DOM inspection, use `openclaw browser eval|query|dom|snapshot` (and `--json`/`--out` when you need machine output).
- For interactions, use `openclaw browser click|type|hover|drag|select|upload|press|wait|navigate|back|evaluate|run` (click/type require snapshot refs; use `evaluate` for CSS selectors).

docs/tools/image-generation.md

@@ -52,7 +52,7 @@ or sign in with OpenAI Codex OAuth.
_"Generate an image of a friendly robot mascot."_
The agent calls `image_generate` automatically. No tool allow-listing
needed — it is enabled by default when a provider is available.
needed - it is enabled by default when a provider is available.
</Step>
</Steps>
@@ -110,10 +110,10 @@ Use `action: "list"` to inspect available providers and models at runtime:
| Capability | ComfyUI | DeepInfra | fal | Google | MiniMax | OpenAI | Vydra | xAI |
| --------------------- | ------------------ | --------- | ----------------- | -------------- | --------------------- | -------------- | ----- | -------------- |
| Generate (max count) | Workflow-defined | 4 | 4 | 4 | 9 | 4 | 1 | 4 |
| Edit / reference | 1 image (workflow) | 1 image | 1 image | Up to 5 images | 1 image (subject ref) | Up to 5 images | — | Up to 5 images |
| Size control | — | ✓ | ✓ | ✓ | — | Up to 4K | — | — |
| Aspect ratio | — | — | ✓ (generate only) | ✓ | ✓ | — | — | ✓ |
| Resolution (1K/2K/4K) | — | — | ✓ | ✓ | — | — | — | 1K, 2K |
| Edit / reference | 1 image (workflow) | 1 image | 1 image | Up to 5 images | 1 image (subject ref) | Up to 5 images | - | Up to 5 images |
| Size control | - | ✓ | ✓ | ✓ | - | Up to 4K | - | - |
| Aspect ratio | - | - | ✓ (generate only) | ✓ | ✓ | - | - | ✓ |
| Resolution (1K/2K/4K) | - | - | ✓ | ✓ | - | - | - | 1K, 2K |
## Tool parameters
@@ -150,7 +150,7 @@ Use `action: "list"` to inspect available providers and models at runtime:
Background hint when the provider supports it. Use `transparent` with
`outputFormat: "png"` or `"webp"` for transparency-capable providers.
</ParamField>
<ParamField path="count" type="number">Number of images to generate (14).</ParamField>
<ParamField path="count" type="number">Number of images to generate (1-4).</ParamField>
<ParamField path="timeoutMs" type="number">Optional provider request timeout in milliseconds.</ParamField>
<ParamField path="filename" type="string">Output filename hint.</ParamField>
<ParamField path="openai" type="object">
@@ -196,7 +196,7 @@ OpenClaw tries providers in this order:
1. **`model` parameter** from the tool call (if the agent specifies one).
2. **`imageGenerationModel.primary`** from config.
3. **`imageGenerationModel.fallbacks`** in order.
4. **Auto-detection** — auth-backed provider defaults only:
4. **Auto-detection** - auth-backed provider defaults only:
- current default provider first;
- remaining registered image-generation providers in provider-id order.
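Steps 2 and 3 map onto config like the following fragment of `~/.openclaw/openclaw.json` (the `agents.defaults.imageGenerationModel` shape follows the configuration reference linked in this doc; the model IDs are placeholders, not recommendations):

```json
{
  "agents": {
    "defaults": {
      "imageGenerationModel": {
        "primary": "openai/example-image-model",
        "fallbacks": ["google/example-image-model", "xai/example-image-model"]
      }
    }
  }
}
```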
@@ -248,7 +248,7 @@ OpenAI, OpenRouter, Google, and xAI support up to 5 reference images via the
image request through the Codex Responses backend. Legacy Codex base
URLs such as `https://chatgpt.com/backend-api` are canonicalized to
`https://chatgpt.com/backend-api/codex` for image requests. OpenClaw
does **not** silently fall back to `OPENAI_API_KEY` for that request —
does **not** silently fall back to `OPENAI_API_KEY` for that request -
to force direct OpenAI Images API routing, configure
`models.providers.openai` explicitly with an API key, custom base URL,
or Azure endpoint.
@@ -398,13 +398,13 @@ as ignored for them.
## Related
- [Tools overview](/tools) — all available agent tools
- [ComfyUI](/providers/comfy) — local ComfyUI and Comfy Cloud workflow setup
- [fal](/providers/fal) — fal image and video provider setup
- [Google (Gemini)](/providers/google) — Gemini image provider setup
- [MiniMax](/providers/minimax) — MiniMax image provider setup
- [OpenAI](/providers/openai) — OpenAI Images provider setup
- [Vydra](/providers/vydra) — Vydra image, video, and speech setup
- [xAI](/providers/xai) — Grok image, video, search, code execution, and TTS setup
- [Configuration reference](/gateway/config-agents#agent-defaults) — `imageGenerationModel` config
- [Models](/concepts/models) — model configuration and failover
- [Tools overview](/tools) - all available agent tools
- [ComfyUI](/providers/comfy) - local ComfyUI and Comfy Cloud workflow setup
- [fal](/providers/fal) - fal image and video provider setup
- [Google (Gemini)](/providers/google) - Gemini image provider setup
- [MiniMax](/providers/minimax) - MiniMax image provider setup
- [OpenAI](/providers/openai) - OpenAI Images provider setup
- [Vydra](/providers/vydra) - Vydra image, video, and speech setup
- [xAI](/providers/xai) - Grok image, video, search, code execution, and TTS setup
- [Configuration reference](/gateway/config-agents#agent-defaults) - `imageGenerationModel` config
- [Models](/concepts/models) - model configuration and failover

docs/tools/video-generation.md

@@ -22,9 +22,9 @@ provider API key or configure `agents.defaults.videoGenerationModel`.
OpenClaw treats video generation as three runtime modes:
- `generate` — text-to-video requests with no reference media.
- `imageToVideo` — request includes one or more reference images.
- `videoToVideo` — request includes one or more reference videos.
- `generate` - text-to-video requests with no reference media.
- `imageToVideo` - request includes one or more reference images.
- `videoToVideo` - request includes one or more reference videos.
Providers can support any subset of those modes. The tool validates the
active mode before submission and reports supported modes in `action=list`.
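The mode split above boils down to a precedence check on the attached reference media; a hypothetical sketch (the tool's actual validation code is not shown in this doc, and the precedence when both media types are present is an assumption):

```typescript
type VideoMode = "generate" | "imageToVideo" | "videoToVideo";

// Resolve the runtime mode from attached reference media.
// Assumption: reference videos take precedence when both are present.
function resolveVideoMode(referenceImages: number, referenceVideos: number): VideoMode {
  if (referenceVideos > 0) return "videoToVideo";
  if (referenceImages > 0) return "imageToVideo";
  return "generate";
}

console.log(resolveVideoMode(0, 0)); // generate
console.log(resolveVideoMode(2, 0)); // imageToVideo
console.log(resolveVideoMode(1, 1)); // videoToVideo
```

A provider that supports only a subset of modes would then reject the resolved mode before submission, as described above.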
@@ -109,20 +109,20 @@ generation.
| Provider | Default model | Text | Image ref | Video ref | Auth |
| --------------------- | ------------------------------- | :--: | ---------------------------------------------------- | ----------------------------------------------- | ---------------------------------------- |
| Alibaba | `wan2.6-t2v` | ✓ | Yes (remote URL) | Yes (remote URL) | `MODELSTUDIO_API_KEY` |
| BytePlus (1.0) | `seedance-1-0-pro-250528` | ✓ | Up to 2 images (I2V models only; first + last frame) | — | `BYTEPLUS_API_KEY` |
| BytePlus Seedance 1.5 | `seedance-1-5-pro-251215` | ✓ | Up to 2 images (first + last frame via role) | — | `BYTEPLUS_API_KEY` |
| BytePlus (1.0) | `seedance-1-0-pro-250528` | ✓ | Up to 2 images (I2V models only; first + last frame) | - | `BYTEPLUS_API_KEY` |
| BytePlus Seedance 1.5 | `seedance-1-5-pro-251215` | ✓ | Up to 2 images (first + last frame via role) | - | `BYTEPLUS_API_KEY` |
| BytePlus Seedance 2.0 | `dreamina-seedance-2-0-260128` | ✓ | Up to 9 reference images | Up to 3 videos | `BYTEPLUS_API_KEY` |
| ComfyUI | `workflow` | ✓ | 1 image | — | `COMFY_API_KEY` or `COMFY_CLOUD_API_KEY` |
| DeepInfra | `Pixverse/Pixverse-T2V` | ✓ | — | — | `DEEPINFRA_API_KEY` |
| ComfyUI | `workflow` | ✓ | 1 image | - | `COMFY_API_KEY` or `COMFY_CLOUD_API_KEY` |
| DeepInfra | `Pixverse/Pixverse-T2V` | ✓ | - | - | `DEEPINFRA_API_KEY` |
| fal | `fal-ai/minimax/video-01-live` | ✓ | 1 image; up to 9 with Seedance reference-to-video | Up to 3 videos with Seedance reference-to-video | `FAL_KEY` |
| Google | `veo-3.1-fast-generate-preview` | ✓ | 1 image | 1 video | `GEMINI_API_KEY` |
| MiniMax | `MiniMax-Hailuo-2.3` | ✓ | 1 image | — | `MINIMAX_API_KEY` or MiniMax OAuth |
| MiniMax | `MiniMax-Hailuo-2.3` | ✓ | 1 image | - | `MINIMAX_API_KEY` or MiniMax OAuth |
| OpenAI | `sora-2` | ✓ | 1 image | 1 video | `OPENAI_API_KEY` |
| OpenRouter | `google/veo-3.1-fast` | ✓ | Up to 4 images (first/last frame or references) | — | `OPENROUTER_API_KEY` |
| OpenRouter | `google/veo-3.1-fast` | ✓ | Up to 4 images (first/last frame or references) | - | `OPENROUTER_API_KEY` |
| Qwen | `wan2.6-t2v` | ✓ | Yes (remote URL) | Yes (remote URL) | `QWEN_API_KEY` |
| Runway | `gen4.5` | ✓ | 1 image | 1 video | `RUNWAYML_API_SECRET` |
| Together | `Wan-AI/Wan2.2-T2V-A14B` | ✓ | 1 image | — | `TOGETHER_API_KEY` |
| Vydra | `veo3` | ✓ | 1 image (`kling`) | — | `VYDRA_API_KEY` |
| Together | `Wan-AI/Wan2.2-T2V-A14B` | ✓ | 1 image | - | `TOGETHER_API_KEY` |
| Vydra | `veo3` | ✓ | 1 image (`kling`) | - | `VYDRA_API_KEY` |
| xAI | `grok-imagine-video` | ✓ | 1 first-frame image or up to 7 `reference_image`s | 1 video | `XAI_API_KEY` |
Some providers accept additional or alternate API key env vars. See
@@ -139,18 +139,18 @@ the shared live sweep:
| Provider | `generate` | `imageToVideo` | `videoToVideo` | Shared live lanes today |
| ---------- | :--------: | :------------: | :------------: | ---------------------------------------------------------------------------------------------------------------------------------------- |
| Alibaba | ✓ | ✓ | ✓ | `generate`, `imageToVideo`; `videoToVideo` skipped because this provider needs remote `http(s)` video URLs |
| BytePlus | ✓ | ✓ | — | `generate`, `imageToVideo` |
| ComfyUI | ✓ | ✓ | — | Not in the shared sweep; workflow-specific coverage lives with Comfy tests |
| DeepInfra | ✓ | — | — | `generate`; native DeepInfra video schemas are text-to-video in the bundled contract |
| BytePlus | ✓ | ✓ | - | `generate`, `imageToVideo` |
| ComfyUI | ✓ | ✓ | - | Not in the shared sweep; workflow-specific coverage lives with Comfy tests |
| DeepInfra | ✓ | - | - | `generate`; native DeepInfra video schemas are text-to-video in the bundled contract |
| fal | ✓ | ✓ | ✓ | `generate`, `imageToVideo`; `videoToVideo` only when using Seedance reference-to-video |
| Google | ✓ | ✓ | ✓ | `generate`, `imageToVideo`; shared `videoToVideo` skipped because the current buffer-backed Gemini/Veo sweep does not accept that input |
| MiniMax | ✓ | ✓ | — | `generate`, `imageToVideo` |
| MiniMax | ✓ | ✓ | - | `generate`, `imageToVideo` |
| OpenAI | ✓ | ✓ | ✓ | `generate`, `imageToVideo`; shared `videoToVideo` skipped because this org/input path currently needs provider-side inpaint/remix access |
| OpenRouter | ✓ | ✓ | — | `generate`, `imageToVideo` |
| OpenRouter | ✓ | ✓ | - | `generate`, `imageToVideo` |
| Qwen | ✓ | ✓ | ✓ | `generate`, `imageToVideo`; `videoToVideo` skipped because this provider needs remote `http(s)` video URLs |
| Runway | ✓ | ✓ | ✓ | `generate`, `imageToVideo`; `videoToVideo` runs only when the selected model is `runway/gen4_aleph` |
| Together | ✓ | ✓ | — | `generate`, `imageToVideo` |
| Vydra | ✓ | ✓ | — | `generate`; shared `imageToVideo` skipped because bundled `veo3` is text-only and bundled `kling` requires a remote image URL |
| Together | ✓ | ✓ | - | `generate`, `imageToVideo` |
| Vydra | ✓ | ✓ | - | `generate`; shared `imageToVideo` skipped because bundled `veo3` is text-only and bundled `kling` requires a remote image URL |
| xAI | ✓ | ✓ | ✓ | `generate`, `imageToVideo`; `videoToVideo` skipped because this provider currently needs a remote MP4 URL |
## Tool parameters
@@ -290,10 +290,10 @@ aggregated error includes the skip reason for each.
OpenClaw resolves the model in this order:
1. **`model` tool parameter** — if the agent specifies one in the call.
1. **`model` tool parameter** - if the agent specifies one in the call.
2. **`videoGenerationModel.primary`** from config.
3. **`videoGenerationModel.fallbacks`** in order.
4. **Auto-detection** — providers that have valid auth, starting with the
4. **Auto-detection** - providers that have valid auth, starting with the
current default provider, then remaining providers in alphabetical
order.
@@ -336,7 +336,7 @@ only the explicit `model`, `primary`, and `fallbacks` entries.
T2V model IDs are automatically switched to the corresponding I2V
variant when an image is provided.
Supported `providerOptions` keys: `seed` (number), `draft` (boolean —
Supported `providerOptions` keys: `seed` (number), `draft` (boolean -
forces 480p), `camera_fixed` (boolean).
</Accordion>
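In a tool call, those keys would look something like this (values are illustrative):

```json
{
  "providerOptions": {
    "seed": 1234,
    "draft": true,
    "camera_fixed": false
  }
}
```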
@@ -363,7 +363,7 @@ only the explicit `model`, `primary`, and `fallbacks` entries.
Uses the unified `content[]` API. Supports up to 9 reference images,
3 reference videos, and 3 reference audios. All inputs must be remote
`https://` URLs. Set `role` on each asset — supported values:
`https://` URLs. Set `role` on each asset - supported values:
`"first_frame"`, `"last_frame"`, `"reference_image"`,
`"reference_video"`, `"reference_audio"`.
@@ -536,7 +536,7 @@ openclaw config set agents.defaults.videoGenerationModel.primary "qwen/wan2.6-t2
## Related
- [Alibaba Model Studio](/providers/alibaba)
- [Background tasks](/automation/tasks) — task tracking for async video generation
- [Background tasks](/automation/tasks) - task tracking for async video generation
- [BytePlus](/concepts/model-providers#byteplus-international)
- [ComfyUI](/providers/comfy)
- [Configuration reference](/gateway/config-agents#agent-defaults)