| summary | read_when | title |
|---|---|---|
| Generate videos using configured providers such as Alibaba, OpenAI, Google, Qwen, and MiniMax | | Video Generation |
Video Generation
The video_generate tool lets the agent create videos using your configured providers. Generated videos are delivered automatically as media attachments in the agent's reply.
Quick start
- Set an API key for at least one provider (for example `OPENAI_API_KEY`, `GEMINI_API_KEY`, `MODELSTUDIO_API_KEY`, or `QWEN_API_KEY`).
- Optionally set your preferred model:
```json5
{
  agents: {
    defaults: {
      videoGenerationModel: {
        primary: "qwen/wan2.6-t2v",
      },
    },
  },
}
```
- Ask the agent: "Generate a 5-second cinematic video of a friendly lobster surfing at sunset."
The agent calls video_generate automatically. No tool allow-listing needed — it's enabled by default when a provider is available.
Supported providers
| Provider | Default model | Reference inputs | API key |
|---|---|---|---|
| Alibaba | `wan2.6-t2v` | Yes, remote URLs | `MODELSTUDIO_API_KEY`, `DASHSCOPE_API_KEY`, `QWEN_API_KEY` |
| BytePlus | `seedance-1-0-lite-t2v-250428` | 1 image | `BYTEPLUS_API_KEY` |
| fal | `fal-ai/minimax/video-01-live` | 1 image | `FAL_KEY` |
| Google | `veo-3.1-fast-generate-preview` | 1 image or 1 video | `GEMINI_API_KEY`, `GOOGLE_API_KEY` |
| MiniMax | `MiniMax-Hailuo-2.3` | 1 image | `MINIMAX_API_KEY` |
| OpenAI | `sora-2` | 1 image or 1 video | `OPENAI_API_KEY` |
| Qwen | `wan2.6-t2v` | Yes, remote URLs | `QWEN_API_KEY`, `MODELSTUDIO_API_KEY`, `DASHSCOPE_API_KEY` |
| Together | `Wan-AI/Wan2.2-T2V-A14B` | 1 image | `TOGETHER_API_KEY` |
| xAI | `grok-imagine-video` | 1 image or 1 video | `XAI_API_KEY` |
Use `action: "list"` to inspect available providers and models at runtime:

```
/tool video_generate action=list
```
Tool parameters
| Parameter | Type | Description |
|---|---|---|
| `prompt` | string | Video generation prompt (required for `action: "generate"`) |
| `action` | string | `"generate"` (default) or `"list"` to inspect providers |
| `model` | string | Provider/model override, e.g. `qwen/wan2.6-t2v` |
| `image` | string | Single reference image path or URL |
| `images` | string[] | Multiple reference images (up to 5) |
| `video` | string | Single reference video path or URL |
| `videos` | string[] | Multiple reference videos (up to 4) |
| `size` | string | Size hint when the provider supports it |
| `aspectRatio` | string | Aspect ratio: `1:1`, `2:3`, `3:2`, `3:4`, `4:3`, `4:5`, `5:4`, `9:16`, `16:9`, `21:9` |
| `resolution` | string | Resolution hint: `480P`, `720P`, or `1080P` |
| `durationSeconds` | number | Target duration in seconds |
| `audio` | boolean | Enable generated audio when the provider supports it |
| `watermark` | boolean | Toggle provider watermarking when supported |
| `filename` | string | Output filename hint |
Not all providers support all parameters. The tool validates provider capability limits before it submits the request.
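As an illustration, a direct invocation with explicit parameters might look like the following. This assumes the same `/tool` syntax shown above for `action=list`; the exact quoting rules depend on your chat surface, and the parameter values here are only examples:

```
/tool video_generate prompt="A friendly lobster surfing at sunset" aspectRatio=16:9 durationSeconds=5
```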
Configuration
Model selection
```json5
{
  agents: {
    defaults: {
      videoGenerationModel: {
        primary: "qwen/wan2.6-t2v",
        fallbacks: ["qwen/wan2.6-r2v-flash"],
      },
    },
  },
}
```
Provider selection order
When generating a video, OpenClaw tries providers in this order:
1. `model` parameter from the tool call (if the agent specifies one)
2. `videoGenerationModel.primary` from config
3. `videoGenerationModel.fallbacks` in order
4. Auto-detection, which uses auth-backed provider defaults only:
   - current default provider first
   - remaining registered video-generation providers in provider-id order
If a provider fails, the next candidate is tried automatically. If all fail, the error includes details from each attempt.
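The ordering above can be sketched in TypeScript. The names here (`VideoGenConfig`, `pickCandidates`) are hypothetical, not OpenClaw's actual API; the sketch only mirrors the steps listed:

```typescript
// Hypothetical sketch of the provider selection order described above.
type VideoGenConfig = { primary?: string; fallbacks?: string[] };

function pickCandidates(
  toolModel: string | undefined,          // `model` from the tool call
  config: VideoGenConfig,                 // videoGenerationModel config
  defaultProvider: string | undefined,    // current default provider id
  registered: Map<string, string>,        // providerId -> default video model
): string[] {
  const out: string[] = [];
  const push = (m?: string) => {
    if (m && !out.includes(m)) out.push(m); // de-duplicate, keep first position
  };
  push(toolModel);                          // 1. explicit tool-call model
  push(config.primary);                     // 2. configured primary
  for (const f of config.fallbacks ?? []) push(f); // 3. fallbacks in order
  // 4. auto-detection: current default provider first, then the rest by id
  if (defaultProvider && registered.has(defaultProvider)) {
    push(`${defaultProvider}/${registered.get(defaultProvider)}`);
  }
  for (const id of [...registered.keys()].sort()) {
    push(`${id}/${registered.get(id)}`);
  }
  return out;
}
```

Each candidate is tried in turn, which is why a duplicate entry (say, a fallback that is also a provider default) only appears once in the list.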
Provider notes
- Alibaba uses the DashScope / Model Studio async video endpoint and currently requires remote `http(s)` URLs for reference assets.
- Google uses Gemini/Veo and supports a single image or video reference input.
- MiniMax, Together, BytePlus, and fal currently support a single image reference input.
- OpenAI uses the native video endpoint and currently defaults to `sora-2`.
- Qwen supports image/video references, but the upstream DashScope video endpoint currently requires remote `http(s)` URLs for those references.
- xAI uses the native xAI video API and supports text-to-video, image-to-video, and remote video edit/extend flows.
Qwen reference inputs
The bundled Qwen provider supports text-to-video plus image/video reference modes, but the upstream DashScope video endpoint currently requires remote http(s) URLs for reference inputs. Local file paths and uploaded buffers are rejected up front instead of being silently ignored.
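An up-front check like that can be sketched as follows. This is illustrative only, assuming a simple URL-protocol test rather than OpenClaw's actual validation code; `assertRemoteReference` is a hypothetical name:

```typescript
// Illustrative early rejection of non-remote Qwen reference inputs.
function assertRemoteReference(ref: string): void {
  let url: URL;
  try {
    url = new URL(ref);
  } catch {
    // Local paths like /tmp/frame.png fail URL parsing and are rejected early.
    throw new Error(`Qwen video references must be remote http(s) URLs, got: ${ref}`);
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    // file:// or data: references are rejected rather than silently ignored.
    throw new Error(`Unsupported reference protocol for Qwen: ${url.protocol}`);
  }
}
```

Rejecting before submission gives the agent an actionable error instead of a silent upstream failure.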
Related
- Tools Overview — all available agent tools
- Alibaba Model Studio — direct Wan provider setup
- Google (Gemini) — Veo provider setup
- MiniMax — Hailuo provider setup
- OpenAI — Sora provider setup
- Qwen — Qwen-specific setup and limits
- Qwen / Model Studio — endpoint-level DashScope detail
- Together AI — Together Wan provider setup
- xAI — Grok video provider setup
- Configuration Reference — `videoGenerationModel` config
- Models — model configuration and failover