mirror of
https://github.com/openclaw/openclaw.git
synced 2026-05-06 05:20:43 +00:00
docs: typography hygiene + drop one in-body H1 across 5 pages
Replaced 138 typography characters (curly quotes, apostrophes, em/en dashes, non-breaking hyphens) with ASCII equivalents per docs/CLAUDE.md heading and content hygiene rules so grep, copy-paste, and Mintlify search hit clean tokens.

- docs/reference/AGENTS.default.md: 29 chars, plus removed the duplicate '# AGENTS.md - OpenClaw Personal Assistant (default)' H1 (Mintlify renders title from frontmatter; the in-body H1 with parens and a bare hyphen produced a brittle anchor).
- docs/help/testing-live.md: 29 chars
- docs/tools/image-generation.md: 28 chars
- docs/channels/index.md: 27 chars
- docs/tools/video-generation.md: 25 chars
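The substitution described in this commit message can be sketched in a few lines of Python. This is a minimal illustration, not the script used for the commit: the message names the character classes but not the exact mapping, so the `ASCII_EQUIVALENTS` table and the `normalize_typography` helper are assumptions.

```python
# Hypothetical mapping: the commit message lists the character classes
# (curly quotes, apostrophes, em/en dashes, non-breaking hyphens) but not
# the exact table, so these code points are an assumption.
ASCII_EQUIVALENTS = {
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark / apostrophe
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2013": "-",  # en dash
    "\u2014": "-",  # em dash
    "\u2011": "-",  # non-breaking hyphen
}


def normalize_typography(text: str) -> tuple[str, int]:
    """Return the ASCII-normalized text and the number of characters replaced."""
    replaced = sum(text.count(ch) for ch in ASCII_EQUIVALENTS)
    return text.translate(str.maketrans(ASCII_EQUIVALENTS)), replaced


sample = "Don\u2019t rely on \u201csmart\u201d quotes \u2014 use ASCII (1\u20134)."
clean, replaced = normalize_typography(sample)
```

Returning the replacement count alongside the text makes it straightforward to produce per-file tallies like the ones listed in the message. Characters outside the table, such as arrows, ellipses, and check marks, pass through unchanged.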
--- a/docs/channels/index.md
+++ b/docs/channels/index.md
@@ -21,32 +21,32 @@ Text is supported everywhere; media and reactions vary by channel.

 ## Supported channels

-- [BlueBubbles](/channels/bluebubbles) — **Recommended for iMessage**; uses the BlueBubbles macOS server REST API with full feature support (bundled plugin; edit, unsend, effects, reactions, group management — edit currently broken on macOS 26 Tahoe).
-- [Discord](/channels/discord) — Discord Bot API + Gateway; supports servers, channels, and DMs.
-- [Feishu](/channels/feishu) — Feishu/Lark bot via WebSocket (bundled plugin).
-- [Google Chat](/channels/googlechat) — Google Chat API app via HTTP webhook (downloadable plugin).
-- [iMessage (legacy)](/channels/imessage) — Legacy macOS integration via imsg CLI (deprecated, use BlueBubbles for new setups).
-- [IRC](/channels/irc) — Classic IRC servers; channels + DMs with pairing/allowlist controls.
-- [LINE](/channels/line) — LINE Messaging API bot (downloadable plugin).
-- [Matrix](/channels/matrix) — Matrix protocol (downloadable plugin).
-- [Mattermost](/channels/mattermost) — Bot API + WebSocket; channels, groups, DMs (downloadable plugin).
-- [Microsoft Teams](/channels/msteams) — Bot Framework; enterprise support (bundled plugin).
-- [Nextcloud Talk](/channels/nextcloud-talk) — Self-hosted chat via Nextcloud Talk (bundled plugin).
-- [Nostr](/channels/nostr) — Decentralized DMs via NIP-04 (bundled plugin).
-- [QQ Bot](/channels/qqbot) — QQ Bot API; private chat, group chat, and rich media (bundled plugin).
-- [Signal](/channels/signal) — signal-cli; privacy-focused.
-- [Slack](/channels/slack) — Bolt SDK; workspace apps.
-- [Synology Chat](/channels/synology-chat) — Synology NAS Chat via outgoing+incoming webhooks (bundled plugin).
-- [Telegram](/channels/telegram) — Bot API via grammY; supports groups.
-- [Tlon](/channels/tlon) — Urbit-based messenger (bundled plugin).
-- [Twitch](/channels/twitch) — Twitch chat via IRC connection (bundled plugin).
-- [Voice Call](/plugins/voice-call) — Telephony via Plivo or Twilio (plugin, installed separately).
-- [WebChat](/web/webchat) — Gateway WebChat UI over WebSocket.
-- [WeChat](/channels/wechat) — Tencent iLink Bot plugin via QR login; private chats only (external plugin).
-- [WhatsApp](/channels/whatsapp) — Most popular; uses Baileys and requires QR pairing.
-- [Yuanbao](/channels/yuanbao) — Tencent Yuanbao bot (external plugin).
-- [Zalo](/channels/zalo) — Zalo Bot API; Vietnam's popular messenger (bundled plugin).
-- [Zalo Personal](/channels/zalouser) — Zalo personal account via QR login (bundled plugin).
+- [BlueBubbles](/channels/bluebubbles) - **Recommended for iMessage**; uses the BlueBubbles macOS server REST API with full feature support (bundled plugin; edit, unsend, effects, reactions, group management - edit currently broken on macOS 26 Tahoe).
+- [Discord](/channels/discord) - Discord Bot API + Gateway; supports servers, channels, and DMs.
+- [Feishu](/channels/feishu) - Feishu/Lark bot via WebSocket (bundled plugin).
+- [Google Chat](/channels/googlechat) - Google Chat API app via HTTP webhook (downloadable plugin).
+- [iMessage (legacy)](/channels/imessage) - Legacy macOS integration via imsg CLI (deprecated, use BlueBubbles for new setups).
+- [IRC](/channels/irc) - Classic IRC servers; channels + DMs with pairing/allowlist controls.
+- [LINE](/channels/line) - LINE Messaging API bot (downloadable plugin).
+- [Matrix](/channels/matrix) - Matrix protocol (downloadable plugin).
+- [Mattermost](/channels/mattermost) - Bot API + WebSocket; channels, groups, DMs (downloadable plugin).
+- [Microsoft Teams](/channels/msteams) - Bot Framework; enterprise support (bundled plugin).
+- [Nextcloud Talk](/channels/nextcloud-talk) - Self-hosted chat via Nextcloud Talk (bundled plugin).
+- [Nostr](/channels/nostr) - Decentralized DMs via NIP-04 (bundled plugin).
+- [QQ Bot](/channels/qqbot) - QQ Bot API; private chat, group chat, and rich media (bundled plugin).
+- [Signal](/channels/signal) - signal-cli; privacy-focused.
+- [Slack](/channels/slack) - Bolt SDK; workspace apps.
+- [Synology Chat](/channels/synology-chat) - Synology NAS Chat via outgoing+incoming webhooks (bundled plugin).
+- [Telegram](/channels/telegram) - Bot API via grammY; supports groups.
+- [Tlon](/channels/tlon) - Urbit-based messenger (bundled plugin).
+- [Twitch](/channels/twitch) - Twitch chat via IRC connection (bundled plugin).
+- [Voice Call](/plugins/voice-call) - Telephony via Plivo or Twilio (plugin, installed separately).
+- [WebChat](/web/webchat) - Gateway WebChat UI over WebSocket.
+- [WeChat](/channels/wechat) - Tencent iLink Bot plugin via QR login; private chats only (external plugin).
+- [WhatsApp](/channels/whatsapp) - Most popular; uses Baileys and requires QR pairing.
+- [Yuanbao](/channels/yuanbao) - Tencent Yuanbao bot (external plugin).
+- [Zalo](/channels/zalo) - Zalo Bot API; Vietnam's popular messenger (bundled plugin).
+- [Zalo Personal](/channels/zalouser) - Zalo personal account via QR login (bundled plugin).

 ## Notes

--- a/docs/help/testing-live.md
+++ b/docs/help/testing-live.md
@@ -63,8 +63,8 @@ loopback/private fallbacks are rejected by design.

 Live tests are split into two layers so we can isolate failures:

-- “Direct model” tells us the provider/model can answer at all with the given key.
-- “Gateway smoke” tells us the full gateway+agent pipeline works for that model (sessions, history, tools, sandbox policy, etc.).
+- "Direct model" tells us the provider/model can answer at all with the given key.
+- "Gateway smoke" tells us the full gateway+agent pipeline works for that model (sessions, history, tools, sandbox policy, etc.).

 ### Layer 1: Direct model completion (no gateway)

@@ -89,7 +89,7 @@ Live tests are split into two layers so we can isolate failures:
 - By default: profile store and env fallbacks
 - Set `OPENCLAW_LIVE_REQUIRE_PROFILE_KEYS=1` to enforce **profile store** only
 - Why this exists:
-- Separates “provider API is broken / key is invalid” from “gateway agent pipeline is broken”
+- Separates "provider API is broken / key is invalid" from "gateway agent pipeline is broken"
 - Contains small, isolated regressions (example: OpenAI Responses/Codex Responses reasoning replay + tool-call flows)

 ### Layer 2: Gateway + dev agent smoke (what "@openclaw" actually does)
@@ -99,7 +99,7 @@ Live tests are split into two layers so we can isolate failures:
 - Spin up an in-process gateway
 - Create/patch a `agent:dev:*` session (model override per run)
 - Iterate models-with-keys and assert:
-- “meaningful” response (no tools)
+- "meaningful" response (no tools)
 - a real tool invocation works (read probe)
 - optional extra tool probes (exec+read probe)
 - OpenAI regression paths (tool-call-only → follow-up) keep working
@@ -115,13 +115,13 @@ Live tests are split into two layers so we can isolate failures:
 - `OPENCLAW_LIVE_GATEWAY_MODELS=all` is an alias for the modern allowlist
 - Or set `OPENCLAW_LIVE_GATEWAY_MODELS="provider/model"` (or comma list) to narrow
 - Modern/all gateway sweeps default to a curated high-signal cap; set `OPENCLAW_LIVE_GATEWAY_MAX_MODELS=0` for an exhaustive modern sweep or a positive number for a smaller cap.
-- How to select providers (avoid “OpenRouter everything”):
+- How to select providers (avoid "OpenRouter everything"):
 - `OPENCLAW_LIVE_GATEWAY_PROVIDERS="google,google-antigravity,google-gemini-cli,openai,anthropic,zai,minimax"` (comma allowlist)
 - Tool + image probes are always on in this live test:
 - `read` probe + `exec+read` probe (tool stress)
 - image probe runs when the model advertises image input support
 - Flow (high level):
-- Test generates a tiny PNG with “CAT” + random code (`src/gateway/live-image-probe.ts`)
+- Test generates a tiny PNG with "CAT" + random code (`src/gateway/live-image-probe.ts`)
 - Sends it via `agent` `attachments: [{ mimeType: "image/png", content: "<base64>" }]`
 - Gateway parses attachments into `images[]` (`src/gateway/server-methods/agent.ts` + `src/gateway/chat-attachments.ts`)
 - Embedded agent forwards a multimodal user message to the model
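The attachment shape in the image-probe flow above can be sketched as follows. This is an illustration only: the `mimeType` and `content` keys come from the documented `attachments` entry, while `build_image_attachment` and the stand-in bytes are hypothetical. The real probe generates a complete PNG in `src/gateway/live-image-probe.ts`, and the remaining fields of the `agent` request are not shown in this hunk.

```python
import base64

# The 8-byte PNG signature stands in for a real image here; an actual
# probe would send complete PNG data.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def build_image_attachment(png_bytes: bytes) -> dict:
    """Hypothetical helper: wrap raw PNG bytes in the documented attachment shape."""
    return {
        "mimeType": "image/png",
        # Base64 text is plain ASCII, so it can travel inside a JSON payload.
        "content": base64.b64encode(png_bytes).decode("ascii"),
    }


attachments = [build_image_attachment(PNG_SIGNATURE)]
```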
@@ -367,16 +367,16 @@ Notes:
 - `google-antigravity/...` uses the Antigravity OAuth bridge (Cloud Code Assist-style agent endpoint).
 - `google-gemini-cli/...` uses the local Gemini CLI on your machine (separate auth + tooling quirks).
 - Gemini API vs Gemini CLI:
-- API: OpenClaw calls Google’s hosted Gemini API over HTTP (API key / profile auth); this is what most users mean by “Gemini”.
+- API: OpenClaw calls Google's hosted Gemini API over HTTP (API key / profile auth); this is what most users mean by "Gemini".
 - CLI: OpenClaw shells out to a local `gemini` binary; it has its own auth and can behave differently (streaming/tool support/version skew).

 ## Live: model matrix (what we cover)

-There is no fixed “CI model list” (live is opt-in), but these are the **recommended** models to cover regularly on a dev machine with keys.
+There is no fixed "CI model list" (live is opt-in), but these are the **recommended** models to cover regularly on a dev machine with keys.

 ### Modern smoke set (tool calling + image)

-This is the “common models” run we expect to keep working:
+This is the "common models" run we expect to keep working:

 - OpenAI (non-Codex): `openai/gpt-5.5`
 - OpenAI Codex OAuth: `openai-codex/gpt-5.5`
@@ -404,7 +404,7 @@ Pick at least one per provider family:
 Optional additional coverage (nice to have):

 - xAI: `xai/grok-4.3` (or latest available)
-- Mistral: `mistral/`… (pick one “tools” capable model you have enabled)
+- Mistral: `mistral/`… (pick one "tools" capable model you have enabled)
 - Cerebras: `cerebras/`… (if you have access)
 - LM Studio: `lmstudio/`… (local; tool calling depends on API mode)

@@ -433,9 +433,9 @@ Do not hardcode "all models" in docs. The authoritative list is whatever `discov
 Live tests discover credentials the same way the CLI does. Practical implications:

 - If the CLI works, live tests should find the same keys.
-- If a live test says “no creds”, debug the same way you’d debug `openclaw models list` / model selection.
+- If a live test says "no creds", debug the same way you'd debug `openclaw models list` / model selection.

-- Per-agent auth profiles: `~/.openclaw/agents/<agentId>/agent/auth-profiles.json` (this is what “profile keys” means in the live tests)
+- Per-agent auth profiles: `~/.openclaw/agents/<agentId>/agent/auth-profiles.json` (this is what "profile keys" means in the live tests)
 - Config: `~/.openclaw/openclaw.json` (or `OPENCLAW_CONFIG_PATH`)
 - Legacy state dir: `~/.openclaw/credentials/` (copied into the staged live home when present, but not the main profile-key store)
 - Live local runs copy the active config, per-agent `auth-profiles.json` files, legacy `credentials/`, and supported external CLI auth dirs into a temp test home by default; staged live homes skip `workspace/` and `sandboxes/`, and `agents.*.workspace` / `agentDir` path overrides are stripped so probes stay off your real host workspace.
@@ -584,4 +584,4 @@ request. Plugin dependencies are expected to be present before runtime load.

 ## Related

-- [Testing](/help/testing) — unit, integration, QA, and Docker suites
+- [Testing](/help/testing) - unit, integration, QA, and Docker suites
--- a/docs/reference/AGENTS.default.md
+++ b/docs/reference/AGENTS.default.md
@@ -6,13 +6,11 @@ read_when:
 - Enabling or auditing default skills
 ---

-# AGENTS.md - OpenClaw Personal Assistant (default)
-
 ## First run (recommended)

 OpenClaw uses a dedicated workspace directory for the agent. Default: `~/.openclaw/workspace` (configurable via `agents.defaults.workspace`).

-1. Create the workspace (if it doesn’t already exist):
+1. Create the workspace (if it doesn't already exist):

 ```bash
 mkdir -p ~/.openclaw/workspace
@@ -42,9 +40,9 @@ cp docs/reference/AGENTS.default.md ~/.openclaw/workspace/AGENTS.md

 ## Safety defaults

-- Don’t dump directories or secrets into chat.
-- Don’t run destructive commands unless explicitly asked.
-- Don’t send partial/streaming replies to external messaging surfaces (only final replies).
+- Don't dump directories or secrets into chat.
+- Don't run destructive commands unless explicitly asked.
+- Don't send partial/streaming replies to external messaging surfaces (only final replies).

 ## Session start (required)

@@ -60,8 +58,8 @@ cp docs/reference/AGENTS.default.md ~/.openclaw/workspace/AGENTS.md

 ## Shared spaces (recommended)

-- You’re not the user’s voice; be careful in group chats or public channels.
-- Don’t share private data, contact info, or internal notes.
+- You're not the user's voice; be careful in group chats or public channels.
+- Don't share private data, contact info, or internal notes.

 ## Memory system (recommended)

@@ -74,12 +72,12 @@ cp docs/reference/AGENTS.default.md ~/.openclaw/workspace/AGENTS.md

 ## Tools and skills

-- Tools live in skills; follow each skill’s `SKILL.md` when you need it.
+- Tools live in skills; follow each skill's `SKILL.md` when you need it.
 - Keep environment-specific notes in `TOOLS.md` (Notes for Skills).

 ## Backup tip (recommended)

-If you treat this workspace as Clawd’s “memory”, make it a git repo (ideally private) so `AGENTS.md` and your memory files are backed up.
+If you treat this workspace as Clawd's "memory", make it a git repo (ideally private) so `AGENTS.md` and your memory files are backed up.

 ```bash
 cd ~/.openclaw/workspace
@@ -97,30 +95,30 @@ git commit -m "Add Clawd workspace"

 ## Core skills (enable in Settings → Skills)

-- **mcporter** — Tool server runtime/CLI for managing external skill backends.
-- **Peekaboo** — Fast macOS screenshots with optional AI vision analysis.
-- **camsnap** — Capture frames, clips, or motion alerts from RTSP/ONVIF security cams.
-- **oracle** — OpenAI-ready agent CLI with session replay and browser control.
-- **eightctl** — Control your sleep, from the terminal.
-- **imsg** — Send, read, stream iMessage & SMS.
-- **wacli** — WhatsApp CLI: sync, search, send.
-- **discord** — Discord actions: react, stickers, polls. Use `user:<id>` or `channel:<id>` targets (bare numeric ids are ambiguous).
-- **gog** — Google Suite CLI: Gmail, Calendar, Drive, Contacts.
-- **spotify-player** — Terminal Spotify client to search/queue/control playback.
-- **sag** — ElevenLabs speech with mac-style say UX; streams to speakers by default.
-- **Sonos CLI** — Control Sonos speakers (discover/status/playback/volume/grouping) from scripts.
-- **blucli** — Play, group, and automate BluOS players from scripts.
-- **OpenHue CLI** — Philips Hue lighting control for scenes and automations.
-- **OpenAI Whisper** — Local speech-to-text for quick dictation and voicemail transcripts.
-- **Gemini CLI** — Google Gemini models from the terminal for fast Q&A.
-- **agent-tools** — Utility toolkit for automations and helper scripts.
+- **mcporter** - Tool server runtime/CLI for managing external skill backends.
+- **Peekaboo** - Fast macOS screenshots with optional AI vision analysis.
+- **camsnap** - Capture frames, clips, or motion alerts from RTSP/ONVIF security cams.
+- **oracle** - OpenAI-ready agent CLI with session replay and browser control.
+- **eightctl** - Control your sleep, from the terminal.
+- **imsg** - Send, read, stream iMessage & SMS.
+- **wacli** - WhatsApp CLI: sync, search, send.
+- **discord** - Discord actions: react, stickers, polls. Use `user:<id>` or `channel:<id>` targets (bare numeric ids are ambiguous).
+- **gog** - Google Suite CLI: Gmail, Calendar, Drive, Contacts.
+- **spotify-player** - Terminal Spotify client to search/queue/control playback.
+- **sag** - ElevenLabs speech with mac-style say UX; streams to speakers by default.
+- **Sonos CLI** - Control Sonos speakers (discover/status/playback/volume/grouping) from scripts.
+- **blucli** - Play, group, and automate BluOS players from scripts.
+- **OpenHue CLI** - Philips Hue lighting control for scenes and automations.
+- **OpenAI Whisper** - Local speech-to-text for quick dictation and voicemail transcripts.
+- **Gemini CLI** - Google Gemini models from the terminal for fast Q&A.
+- **agent-tools** - Utility toolkit for automations and helper scripts.

 ## Usage notes

 - Prefer the `openclaw` CLI for scripting; mac app handles permissions.
 - Run installs from the Skills tab; it hides the button if a binary is already present.
 - Keep heartbeats enabled so the assistant can schedule reminders, monitor inboxes, and trigger camera captures.
-- Canvas UI runs full-screen with native overlays. Avoid placing critical controls in the top-left/top-right/bottom edges; add explicit gutters in the layout and don’t rely on safe-area insets.
+- Canvas UI runs full-screen with native overlays. Avoid placing critical controls in the top-left/top-right/bottom edges; add explicit gutters in the layout and don't rely on safe-area insets.
 - For browser-driven verification, use `openclaw browser` (tabs/status/screenshot) with the OpenClaw-managed Chrome profile.
 - For DOM inspection, use `openclaw browser eval|query|dom|snapshot` (and `--json`/`--out` when you need machine output).
 - For interactions, use `openclaw browser click|type|hover|drag|select|upload|press|wait|navigate|back|evaluate|run` (click/type require snapshot refs; use `evaluate` for CSS selectors).
--- a/docs/tools/image-generation.md
+++ b/docs/tools/image-generation.md
@@ -52,7 +52,7 @@ or sign in with OpenAI Codex OAuth.
 _"Generate an image of a friendly robot mascot."_

 The agent calls `image_generate` automatically. No tool allow-listing
-needed — it is enabled by default when a provider is available.
+needed - it is enabled by default when a provider is available.

 </Step>
 </Steps>
@@ -110,10 +110,10 @@ Use `action: "list"` to inspect available providers and models at runtime:
 | Capability | ComfyUI | DeepInfra | fal | Google | MiniMax | OpenAI | Vydra | xAI |
 | --------------------- | ------------------ | --------- | ----------------- | -------------- | --------------------- | -------------- | ----- | -------------- |
 | Generate (max count) | Workflow-defined | 4 | 4 | 4 | 9 | 4 | 1 | 4 |
-| Edit / reference | 1 image (workflow) | 1 image | 1 image | Up to 5 images | 1 image (subject ref) | Up to 5 images | — | Up to 5 images |
-| Size control | — | ✓ | ✓ | ✓ | — | Up to 4K | — | — |
-| Aspect ratio | — | — | ✓ (generate only) | ✓ | ✓ | — | — | ✓ |
-| Resolution (1K/2K/4K) | — | — | ✓ | ✓ | — | — | — | 1K, 2K |
+| Edit / reference | 1 image (workflow) | 1 image | 1 image | Up to 5 images | 1 image (subject ref) | Up to 5 images | - | Up to 5 images |
+| Size control | - | ✓ | ✓ | ✓ | - | Up to 4K | - | - |
+| Aspect ratio | - | - | ✓ (generate only) | ✓ | ✓ | - | - | ✓ |
+| Resolution (1K/2K/4K) | - | - | ✓ | ✓ | - | - | - | 1K, 2K |

 ## Tool parameters

@@ -150,7 +150,7 @@ Use `action: "list"` to inspect available providers and models at runtime:
 Background hint when the provider supports it. Use `transparent` with
 `outputFormat: "png"` or `"webp"` for transparency-capable providers.
 </ParamField>
-<ParamField path="count" type="number">Number of images to generate (1–4).</ParamField>
+<ParamField path="count" type="number">Number of images to generate (1-4).</ParamField>
 <ParamField path="timeoutMs" type="number">Optional provider request timeout in milliseconds.</ParamField>
 <ParamField path="filename" type="string">Output filename hint.</ParamField>
 <ParamField path="openai" type="object">
@@ -196,7 +196,7 @@ OpenClaw tries providers in this order:
 1. **`model` parameter** from the tool call (if the agent specifies one).
 2. **`imageGenerationModel.primary`** from config.
 3. **`imageGenerationModel.fallbacks`** in order.
-4. **Auto-detection** — auth-backed provider defaults only:
+4. **Auto-detection** - auth-backed provider defaults only:
 - current default provider first;
 - remaining registered image-generation providers in provider-id order.

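The four-step provider order in the hunk above can be sketched as a small Python function. This is a hedged illustration, not OpenClaw's implementation: `resolution_order`, its parameters, the placeholder model ids, and the de-duplication step are all assumptions made for the example, and "provider-id order" is read here as sorted by provider id.

```python
from typing import Optional


def resolution_order(
    tool_model: Optional[str],
    primary: Optional[str],
    fallbacks: list[str],
    default_provider: str,
    provider_defaults: dict[str, str],
) -> list[str]:
    """Hypothetical sketch: list candidate models in the documented order.

    provider_defaults maps provider id -> default model, and is assumed to
    contain only providers that already have working auth.
    """
    candidates: list[str] = []
    if tool_model:                      # 1. explicit `model` from the tool call
        candidates.append(tool_model)
    if primary:                         # 2. configured primary
        candidates.append(primary)
    candidates.extend(fallbacks)        # 3. configured fallbacks, in order
    if default_provider in provider_defaults:   # 4a. current default provider first
        candidates.append(provider_defaults[default_provider])
    for provider_id in sorted(provider_defaults):   # 4b. the rest, by provider id
        if provider_id != default_provider:
            candidates.append(provider_defaults[provider_id])
    # Assumption: drop repeats while keeping the first occurrence.
    seen: set[str] = set()
    ordered: list[str] = []
    for model in candidates:
        if model not in seen:
            seen.add(model)
            ordered.append(model)
    return ordered


order = resolution_order(
    tool_model=None,
    primary="google/example-a",
    fallbacks=["fal/example-b"],
    default_provider="openai",
    provider_defaults={
        "xai": "xai/example-d",
        "openai": "openai/example-c",
        "google": "google/example-a",
    },
)
```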
@@ -248,7 +248,7 @@ OpenAI, OpenRouter, Google, and xAI support up to 5 reference images via the
 image request through the Codex Responses backend. Legacy Codex base
 URLs such as `https://chatgpt.com/backend-api` are canonicalized to
 `https://chatgpt.com/backend-api/codex` for image requests. OpenClaw
-does **not** silently fall back to `OPENAI_API_KEY` for that request —
+does **not** silently fall back to `OPENAI_API_KEY` for that request -
 to force direct OpenAI Images API routing, configure
 `models.providers.openai` explicitly with an API key, custom base URL,
 or Azure endpoint.
@@ -398,13 +398,13 @@ as ignored for them.

 ## Related

-- [Tools overview](/tools) — all available agent tools
-- [ComfyUI](/providers/comfy) — local ComfyUI and Comfy Cloud workflow setup
-- [fal](/providers/fal) — fal image and video provider setup
-- [Google (Gemini)](/providers/google) — Gemini image provider setup
-- [MiniMax](/providers/minimax) — MiniMax image provider setup
-- [OpenAI](/providers/openai) — OpenAI Images provider setup
-- [Vydra](/providers/vydra) — Vydra image, video, and speech setup
-- [xAI](/providers/xai) — Grok image, video, search, code execution, and TTS setup
-- [Configuration reference](/gateway/config-agents#agent-defaults) — `imageGenerationModel` config
-- [Models](/concepts/models) — model configuration and failover
+- [Tools overview](/tools) - all available agent tools
+- [ComfyUI](/providers/comfy) - local ComfyUI and Comfy Cloud workflow setup
+- [fal](/providers/fal) - fal image and video provider setup
+- [Google (Gemini)](/providers/google) - Gemini image provider setup
+- [MiniMax](/providers/minimax) - MiniMax image provider setup
+- [OpenAI](/providers/openai) - OpenAI Images provider setup
+- [Vydra](/providers/vydra) - Vydra image, video, and speech setup
+- [xAI](/providers/xai) - Grok image, video, search, code execution, and TTS setup
+- [Configuration reference](/gateway/config-agents#agent-defaults) - `imageGenerationModel` config
+- [Models](/concepts/models) - model configuration and failover
--- a/docs/tools/video-generation.md
+++ b/docs/tools/video-generation.md
@@ -22,9 +22,9 @@ provider API key or configure `agents.defaults.videoGenerationModel`.

 OpenClaw treats video generation as three runtime modes:

-- `generate` — text-to-video requests with no reference media.
-- `imageToVideo` — request includes one or more reference images.
-- `videoToVideo` — request includes one or more reference videos.
+- `generate` - text-to-video requests with no reference media.
+- `imageToVideo` - request includes one or more reference images.
+- `videoToVideo` - request includes one or more reference videos.

 Providers can support any subset of those modes. The tool validates the
 active mode before submission and reports supported modes in `action=list`.
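The three runtime modes and the pre-submission check described in the hunk above can be sketched in Python. This is only an illustration under stated assumptions: `runtime_mode` and `ensure_supported` are hypothetical names, and the choice to prefer `videoToVideo` when a request carries both images and videos is an assumption, since the text does not say how mixed inputs are classified.

```python
def runtime_mode(reference_images: list[str], reference_videos: list[str]) -> str:
    """Hypothetical sketch: pick the runtime mode from the reference media present."""
    if reference_videos:
        # Assumption: video references win when both kinds are supplied.
        return "videoToVideo"
    if reference_images:
        return "imageToVideo"
    return "generate"  # text-to-video, no reference media


def ensure_supported(mode: str, supported_modes: set[str]) -> None:
    """Reject a request before submission if the provider lacks the mode."""
    if mode not in supported_modes:
        raise ValueError(
            f"mode {mode!r} is not supported; supported: {sorted(supported_modes)}"
        )


mode = runtime_mode(reference_images=["frame.png"], reference_videos=[])
ensure_supported(mode, {"generate", "imageToVideo"})
```

Failing before submission, rather than after a provider round trip, keeps the error local and lets the message list the modes the provider does accept.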
@@ -109,20 +109,20 @@ generation.
 | Provider | Default model | Text | Image ref | Video ref | Auth |
 | --------------------- | ------------------------------- | :--: | ---------------------------------------------------- | ----------------------------------------------- | ---------------------------------------- |
 | Alibaba | `wan2.6-t2v` | ✓ | Yes (remote URL) | Yes (remote URL) | `MODELSTUDIO_API_KEY` |
-| BytePlus (1.0) | `seedance-1-0-pro-250528` | ✓ | Up to 2 images (I2V models only; first + last frame) | — | `BYTEPLUS_API_KEY` |
-| BytePlus Seedance 1.5 | `seedance-1-5-pro-251215` | ✓ | Up to 2 images (first + last frame via role) | — | `BYTEPLUS_API_KEY` |
+| BytePlus (1.0) | `seedance-1-0-pro-250528` | ✓ | Up to 2 images (I2V models only; first + last frame) | - | `BYTEPLUS_API_KEY` |
+| BytePlus Seedance 1.5 | `seedance-1-5-pro-251215` | ✓ | Up to 2 images (first + last frame via role) | - | `BYTEPLUS_API_KEY` |
 | BytePlus Seedance 2.0 | `dreamina-seedance-2-0-260128` | ✓ | Up to 9 reference images | Up to 3 videos | `BYTEPLUS_API_KEY` |
-| ComfyUI | `workflow` | ✓ | 1 image | — | `COMFY_API_KEY` or `COMFY_CLOUD_API_KEY` |
-| DeepInfra | `Pixverse/Pixverse-T2V` | ✓ | — | — | `DEEPINFRA_API_KEY` |
+| ComfyUI | `workflow` | ✓ | 1 image | - | `COMFY_API_KEY` or `COMFY_CLOUD_API_KEY` |
+| DeepInfra | `Pixverse/Pixverse-T2V` | ✓ | - | - | `DEEPINFRA_API_KEY` |
 | fal | `fal-ai/minimax/video-01-live` | ✓ | 1 image; up to 9 with Seedance reference-to-video | Up to 3 videos with Seedance reference-to-video | `FAL_KEY` |
 | Google | `veo-3.1-fast-generate-preview` | ✓ | 1 image | 1 video | `GEMINI_API_KEY` |
-| MiniMax | `MiniMax-Hailuo-2.3` | ✓ | 1 image | — | `MINIMAX_API_KEY` or MiniMax OAuth |
+| MiniMax | `MiniMax-Hailuo-2.3` | ✓ | 1 image | - | `MINIMAX_API_KEY` or MiniMax OAuth |
 | OpenAI | `sora-2` | ✓ | 1 image | 1 video | `OPENAI_API_KEY` |
-| OpenRouter | `google/veo-3.1-fast` | ✓ | Up to 4 images (first/last frame or references) | — | `OPENROUTER_API_KEY` |
+| OpenRouter | `google/veo-3.1-fast` | ✓ | Up to 4 images (first/last frame or references) | - | `OPENROUTER_API_KEY` |
 | Qwen | `wan2.6-t2v` | ✓ | Yes (remote URL) | Yes (remote URL) | `QWEN_API_KEY` |
 | Runway | `gen4.5` | ✓ | 1 image | 1 video | `RUNWAYML_API_SECRET` |
-| Together | `Wan-AI/Wan2.2-T2V-A14B` | ✓ | 1 image | — | `TOGETHER_API_KEY` |
-| Vydra | `veo3` | ✓ | 1 image (`kling`) | — | `VYDRA_API_KEY` |
+| Together | `Wan-AI/Wan2.2-T2V-A14B` | ✓ | 1 image | - | `TOGETHER_API_KEY` |
+| Vydra | `veo3` | ✓ | 1 image (`kling`) | - | `VYDRA_API_KEY` |
 | xAI | `grok-imagine-video` | ✓ | 1 first-frame image or up to 7 `reference_image`s | 1 video | `XAI_API_KEY` |

 Some providers accept additional or alternate API key env vars. See
@@ -139,18 +139,18 @@ the shared live sweep:
 | Provider | `generate` | `imageToVideo` | `videoToVideo` | Shared live lanes today |
 | ---------- | :--------: | :------------: | :------------: | ---------------------------------------------------------------------------------------------------------------------------------------- |
 | Alibaba | ✓ | ✓ | ✓ | `generate`, `imageToVideo`; `videoToVideo` skipped because this provider needs remote `http(s)` video URLs |
-| BytePlus | ✓ | ✓ | — | `generate`, `imageToVideo` |
-| ComfyUI | ✓ | ✓ | — | Not in the shared sweep; workflow-specific coverage lives with Comfy tests |
-| DeepInfra | ✓ | — | — | `generate`; native DeepInfra video schemas are text-to-video in the bundled contract |
+| BytePlus | ✓ | ✓ | - | `generate`, `imageToVideo` |
+| ComfyUI | ✓ | ✓ | - | Not in the shared sweep; workflow-specific coverage lives with Comfy tests |
+| DeepInfra | ✓ | - | - | `generate`; native DeepInfra video schemas are text-to-video in the bundled contract |
 | fal | ✓ | ✓ | ✓ | `generate`, `imageToVideo`; `videoToVideo` only when using Seedance reference-to-video |
 | Google | ✓ | ✓ | ✓ | `generate`, `imageToVideo`; shared `videoToVideo` skipped because the current buffer-backed Gemini/Veo sweep does not accept that input |
-| MiniMax | ✓ | ✓ | — | `generate`, `imageToVideo` |
+| MiniMax | ✓ | ✓ | - | `generate`, `imageToVideo` |
 | OpenAI | ✓ | ✓ | ✓ | `generate`, `imageToVideo`; shared `videoToVideo` skipped because this org/input path currently needs provider-side inpaint/remix access |
-| OpenRouter | ✓ | ✓ | — | `generate`, `imageToVideo` |
+| OpenRouter | ✓ | ✓ | - | `generate`, `imageToVideo` |
 | Qwen | ✓ | ✓ | ✓ | `generate`, `imageToVideo`; `videoToVideo` skipped because this provider needs remote `http(s)` video URLs |
 | Runway | ✓ | ✓ | ✓ | `generate`, `imageToVideo`; `videoToVideo` runs only when the selected model is `runway/gen4_aleph` |
-| Together | ✓ | ✓ | — | `generate`, `imageToVideo` |
-| Vydra | ✓ | ✓ | — | `generate`; shared `imageToVideo` skipped because bundled `veo3` is text-only and bundled `kling` requires a remote image URL |
+| Together | ✓ | ✓ | - | `generate`, `imageToVideo` |
+| Vydra | ✓ | ✓ | - | `generate`; shared `imageToVideo` skipped because bundled `veo3` is text-only and bundled `kling` requires a remote image URL |
 | xAI | ✓ | ✓ | ✓ | `generate`, `imageToVideo`; `videoToVideo` skipped because this provider currently needs a remote MP4 URL |

 ## Tool parameters
@@ -290,10 +290,10 @@ aggregated error includes the skip reason for each.

 OpenClaw resolves the model in this order:

-1. **`model` tool parameter** — if the agent specifies one in the call.
+1. **`model` tool parameter** - if the agent specifies one in the call.
 2. **`videoGenerationModel.primary`** from config.
 3. **`videoGenerationModel.fallbacks`** in order.
-4. **Auto-detection** — providers that have valid auth, starting with the
+4. **Auto-detection** - providers that have valid auth, starting with the
 current default provider, then remaining providers in alphabetical
 order.

@@ -336,7 +336,7 @@ only the explicit `model`, `primary`, and `fallbacks` entries.
 T2V model IDs are automatically switched to the corresponding I2V
 variant when an image is provided.

-Supported `providerOptions` keys: `seed` (number), `draft` (boolean —
+Supported `providerOptions` keys: `seed` (number), `draft` (boolean -
 forces 480p), `camera_fixed` (boolean).

 </Accordion>
@@ -363,7 +363,7 @@ only the explicit `model`, `primary`, and `fallbacks` entries.

 Uses the unified `content[]` API. Supports up to 9 reference images,
 3 reference videos, and 3 reference audios. All inputs must be remote
-`https://` URLs. Set `role` on each asset — supported values:
+`https://` URLs. Set `role` on each asset - supported values:
 `"first_frame"`, `"last_frame"`, `"reference_image"`,
 `"reference_video"`, `"reference_audio"`.

@@ -536,7 +536,7 @@ openclaw config set agents.defaults.videoGenerationModel.primary "qwen/wan2.6-t2
 ## Related

 - [Alibaba Model Studio](/providers/alibaba)
-- [Background tasks](/automation/tasks) — task tracking for async video generation
+- [Background tasks](/automation/tasks) - task tracking for async video generation
 - [BytePlus](/concepts/model-providers#byteplus-international)
 - [ComfyUI](/providers/comfy)
 - [Configuration reference](/gateway/config-agents#agent-defaults)