mirror of
https://github.com/openclaw/openclaw.git
synced 2026-04-23 15:11:42 +00:00
136 lines
7.3 KiB
Markdown
136 lines
7.3 KiB
Markdown
---
|
|
summary: "Generate videos using configured providers such as Alibaba, OpenAI, Google, Qwen, and MiniMax"
|
|
read_when:
|
|
- Generating videos via the agent
|
|
- Configuring video generation providers and models
|
|
- Understanding the video_generate tool parameters
|
|
title: "Video Generation"
|
|
---
|
|
|
|
# Video Generation
|
|
|
|
The `video_generate` tool lets the agent create videos using your configured providers. Generated videos are delivered automatically as media attachments in the agent's reply.
|
|
|
|
<Note>
|
|
The tool only appears when at least one video-generation provider is available. If you don't see `video_generate` in your agent's tools, configure `agents.defaults.videoGenerationModel` or set up a provider API key.
|
|
</Note>
|
|
|
|
## Quick start
|
|
|
|
1. Set an API key for at least one provider (for example `OPENAI_API_KEY`, `GEMINI_API_KEY`, `MODELSTUDIO_API_KEY`, or `QWEN_API_KEY`).
|
|
2. Optionally set your preferred model:
|
|
|
|
```json5
|
|
{
|
|
agents: {
|
|
defaults: {
|
|
videoGenerationModel: {
|
|
primary: "qwen/wan2.6-t2v",
|
|
},
|
|
},
|
|
},
|
|
}
|
|
```
|
|
|
|
3. Ask the agent: _"Generate a 5-second cinematic video of a friendly lobster surfing at sunset."_
|
|
|
|
The agent calls `video_generate` automatically. No tool allow-listing needed — it's enabled by default when a provider is available.
|
|
|
|
## Supported providers
|
|
|
|
| Provider | Default model | Reference inputs | API key |
|
|
| -------- | ------------------------------- | ------------------ | ---------------------------------------------------------- |
|
|
| Alibaba | `wan2.6-t2v` | Yes, remote URLs | `MODELSTUDIO_API_KEY`, `DASHSCOPE_API_KEY`, `QWEN_API_KEY` |
|
|
| BytePlus | `seedance-1-0-lite-t2v-250428` | 1 image | `BYTEPLUS_API_KEY` |
|
|
| fal | `fal-ai/minimax/video-01-live` | 1 image | `FAL_KEY` |
|
|
| Google | `veo-3.1-fast-generate-preview` | 1 image or 1 video | `GEMINI_API_KEY`, `GOOGLE_API_KEY` |
|
|
| MiniMax | `MiniMax-Hailuo-2.3` | 1 image | `MINIMAX_API_KEY` |
|
|
| OpenAI | `sora-2` | 1 image or 1 video | `OPENAI_API_KEY` |
|
|
| Qwen | `wan2.6-t2v` | Yes, remote URLs | `QWEN_API_KEY`, `MODELSTUDIO_API_KEY`, `DASHSCOPE_API_KEY` |
|
|
| Together | `Wan-AI/Wan2.2-T2V-A14B` | 1 image | `TOGETHER_API_KEY` |
|
|
| xAI | `grok-imagine-video` | 1 image or 1 video | `XAI_API_KEY` |
|
|
|
|
Use `action: "list"` to inspect available providers and models at runtime:
|
|
|
|
```
|
|
/tool video_generate action=list
|
|
```
|
|
|
|
## Tool parameters
|
|
|
|
| Parameter | Type | Description |
|
|
| ----------------- | -------- | ------------------------------------------------------------------------------------- |
|
|
| `prompt` | string | Video generation prompt (required for `action: "generate"`) |
|
|
| `action` | string | `"generate"` (default) or `"list"` to inspect providers |
|
|
| `model` | string | Provider/model override, e.g. `qwen/wan2.6-t2v` |
|
|
| `image` | string | Single reference image path or URL |
|
|
| `images` | string[] | Multiple reference images (up to 5) |
|
|
| `video` | string | Single reference video path or URL |
|
|
| `videos` | string[] | Multiple reference videos (up to 4) |
|
|
| `size` | string | Size hint when the provider supports it |
|
|
| `aspectRatio` | string | Aspect ratio: `1:1`, `2:3`, `3:2`, `3:4`, `4:3`, `4:5`, `5:4`, `9:16`, `16:9`, `21:9` |
|
|
| `resolution` | string | Resolution hint: `480P`, `720P`, or `1080P` |
|
|
| `durationSeconds` | number | Target duration in seconds |
|
|
| `audio` | boolean | Enable generated audio when the provider supports it |
|
|
| `watermark` | boolean | Toggle provider watermarking when supported |
|
|
| `filename` | string | Output filename hint |
|
|
|
|
Not all providers support all parameters. The tool validates provider capability limits before it submits the request.
|
|
|
|
## Configuration
|
|
|
|
### Model selection
|
|
|
|
```json5
|
|
{
|
|
agents: {
|
|
defaults: {
|
|
videoGenerationModel: {
|
|
primary: "qwen/wan2.6-t2v",
|
|
fallbacks: ["qwen/wan2.6-r2v-flash"],
|
|
},
|
|
},
|
|
},
|
|
}
|
|
```
|
|
|
|
### Provider selection order
|
|
|
|
When generating a video, OpenClaw tries providers in this order:
|
|
|
|
1. **`model` parameter** from the tool call (if the agent specifies one)
|
|
2. **`videoGenerationModel.primary`** from config
|
|
3. **`videoGenerationModel.fallbacks`** in order
|
|
4. **Auto-detection** — uses auth-backed provider defaults only:
|
|
- current default provider first
|
|
- remaining registered video-generation providers in provider-id order
|
|
|
|
If a provider fails, the next candidate is tried automatically. If all fail, the error includes details from each attempt.
|
|
|
|
## Provider notes
|
|
|
|
- Alibaba uses the DashScope / Model Studio async video endpoint and currently requires remote `http(s)` URLs for reference assets.
|
|
- Google uses Gemini/Veo and supports a single image or video reference input.
|
|
- MiniMax, Together, BytePlus, and fal currently support a single image reference input.
|
|
- OpenAI uses the native video endpoint and currently defaults to `sora-2`.
|
|
- Qwen supports image/video references, but the upstream DashScope video endpoint currently requires remote `http(s)` URLs for those references.
|
|
- xAI uses the native xAI video API and supports text-to-video, image-to-video, and remote video edit/extend flows.
|
|
|
|
## Qwen reference inputs
|
|
|
|
The bundled Qwen provider supports text-to-video plus image/video reference modes, but the upstream DashScope video endpoint currently requires **remote http(s) URLs** for reference inputs. Local file paths and uploaded buffers are rejected up front instead of being silently ignored.
|
|
|
|
## Related
|
|
|
|
- [Tools Overview](/tools) — all available agent tools
|
|
- [Alibaba Model Studio](/providers/alibaba) — direct Wan provider setup
|
|
- [Google (Gemini)](/providers/google) — Veo provider setup
|
|
- [MiniMax](/providers/minimax) — Hailuo provider setup
|
|
- [OpenAI](/providers/openai) — Sora provider setup
|
|
- [Qwen](/providers/qwen) — Qwen-specific setup and limits
|
|
- [Qwen / Model Studio](/providers/qwen_modelstudio) — endpoint-level DashScope detail
|
|
- [Together AI](/providers/together) — Together Wan provider setup
|
|
- [xAI](/providers/xai) — Grok video provider setup
|
|
- [Configuration Reference](/gateway/configuration-reference#agent-defaults) — `videoGenerationModel` config
|
|
- [Models](/concepts/models) — model configuration and failover
|