diff --git a/docs/providers/runway.md b/docs/providers/runway.md
index 97d82b4120f..3061a497f7d 100644
--- a/docs/providers/runway.md
+++ b/docs/providers/runway.md
@@ -11,9 +11,9 @@ read_when:
OpenClaw ships a bundled `runway` provider for hosted video generation.
-- Provider: `runway`
-- Auth: `RUNWAYML_API_SECRET` (canonical; `RUNWAY_API_KEY` also works)
-- API: Runway task-based video generation API
+- Provider id: `runway`
+- Auth: `RUNWAYML_API_SECRET` (canonical) or `RUNWAY_API_KEY`
+- API: Runway task-based video generation (`GET /v1/tasks/{id}` polling)
## Quick start
@@ -23,33 +23,27 @@ OpenClaw ships a bundled `runway` provider for hosted video generation.
openclaw onboard --auth-choice runway-api-key
```
-2. Set a default video model:
+2. Set Runway as the default video provider:
-```json5
-{
- agents: {
- defaults: {
- videoGenerationModel: {
- primary: "runway/gen4.5",
- },
- },
- },
-}
+```bash
+openclaw config set agents.defaults.videoGenerationModel.primary "runway/gen4.5"
```
-## Video generation
+3. Ask the agent to generate a video. Runway will be used automatically.
-The bundled `runway` video-generation provider defaults to `runway/gen4.5`.
+## Supported modes
-- Modes: text-to-video, single-image image-to-video, and single-video video-to-video
-- Runtime: async task submit + poll via `GET /v1/tasks/{id}`
-- Agent sessions: `video_generate` starts a background task, and later calls in the same session now return active-task status instead of spawning a duplicate run
-- Status lookup: `video_generate action=status`
-- Local image/video references: supported via data URIs
-- Current video-to-video caveat: OpenClaw currently requires `runway/gen4_aleph` for video inputs
-- Current text-to-video caveat: OpenClaw currently exposes `16:9` and `9:16` for text-only runs
+| Mode | Model | Reference input |
+| -------------- | ------------------ | ----------------------- |
+| Text-to-video | `gen4.5` (default) | None |
+| Image-to-video | `gen4.5` | 1 local or remote image |
+| Video-to-video | `gen4_aleph` | 1 local or remote video |
-To use Runway as the default video provider:
+- Local image and video references are supported via data URIs.
+- Video-to-video currently requires `runway/gen4_aleph` specifically.
+- Text-only runs currently expose `16:9` and `9:16` aspect ratios.
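+
+For example, a video-to-video request could be expressed as a tool call like the following (a hypothetical call shape assembled from the documented parameters; the prompt and file path are placeholders):
+
+```json5
+{
+  action: "generate",
+  model: "runway/gen4_aleph", // required for video inputs
+  prompt: "Restyle this clip as a watercolor animation",
+  video: "/path/to/input.mp4", // local files are sent as data URIs
+}
+```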
+
+## Configuration
```json5
{
@@ -65,5 +59,5 @@ To use Runway as the default video provider:
## Related
-- [Video Generation](/tools/video-generation)
+- [Video Generation](/tools/video-generation) -- shared tool parameters, provider selection, and async behavior
- [Configuration Reference](/gateway/configuration-reference#agent-defaults)
diff --git a/docs/tools/video-generation.md b/docs/tools/video-generation.md
index d5b1b4e43ce..6e5ae578b00 100644
--- a/docs/tools/video-generation.md
+++ b/docs/tools/video-generation.md
@@ -1,5 +1,5 @@
---
-summary: "Generate videos using configured providers such as Alibaba, OpenAI, Google, Qwen, MiniMax, and Runway"
+summary: "Generate videos from text, images, or existing videos using 10 provider backends"
read_when:
- Generating videos via the agent
- Configuring video generation providers and models
@@ -9,94 +9,150 @@ title: "Video Generation"
# Video Generation
-The `video_generate` tool lets the agent create videos using your configured providers. In agent sessions, OpenClaw starts video generation as a background task, tracks it in the task ledger, then wakes the agent again when the clip is ready so the agent can post the finished video back into the original channel.
+OpenClaw agents can generate videos from text prompts, reference images, or existing videos. Ten provider backends are supported, each with different model options, input modes, and feature sets. The agent picks the right provider automatically based on your configuration and available API keys.
-The tool only appears when at least one video-generation provider is available. If you don't see `video_generate` in your agent's tools, configure `agents.defaults.videoGenerationModel` or set up a provider API key.
-
-
-
-In agent sessions, `video_generate` returns immediately with a task id/run id. The actual provider job continues in the background. When it finishes, OpenClaw wakes the same session with an internal completion event so the agent can send a normal follow-up plus the generated video attachment.
+The `video_generate` tool only appears when at least one video-generation provider is available. If you do not see it in your agent tools, set a provider API key or configure `agents.defaults.videoGenerationModel`.
## Quick start
-1. Set an API key for at least one provider (for example `OPENAI_API_KEY`, `GEMINI_API_KEY`, `MODELSTUDIO_API_KEY`, `QWEN_API_KEY`, or `RUNWAYML_API_SECRET`).
-2. Optionally set your preferred model:
+1. Set an API key for any supported provider:
+
+```bash
+export GEMINI_API_KEY="your-key"
+```
+
+2. Optionally pin a default model:
+
+```bash
+openclaw config set agents.defaults.videoGenerationModel.primary "google/veo-3.1-fast-generate-preview"
+```
+
+3. Ask the agent:
+
+> Generate a 5-second cinematic video of a friendly lobster surfing at sunset.
+
+The agent calls `video_generate` automatically. No tool allowlisting is needed.
+
+## What happens when you generate a video
+
+Video generation is asynchronous. When the agent calls `video_generate` in a session:
+
+1. OpenClaw submits the request to the provider and immediately returns a task ID.
+2. The provider processes the job in the background (typically 30 seconds to 5 minutes depending on the provider and resolution).
+3. When the video is ready, OpenClaw wakes the same session with an internal completion event.
+4. The agent posts the finished video back into the original conversation.
+
+While a job is in flight, duplicate `video_generate` calls in the same session return the current task status instead of starting another generation. Use `openclaw tasks list` or `openclaw tasks show <id>` to check progress from the CLI.
+
+Outside of session-backed agent runs (for example, direct tool invocations), the tool falls back to inline generation and returns the final media path in the same turn.
+
+## Supported providers
+
+| Provider | Default model | Text | Image ref | Video ref | API key |
+| -------- | ------------------------------- | ---- | ---------------- | ---------------- | --------------------- |
+| Alibaba | `wan2.6-t2v` | Yes | Yes (remote URL) | Yes (remote URL) | `MODELSTUDIO_API_KEY` |
+| BytePlus | `seedance-1-0-lite-t2v-250428` | Yes | 1 image | No | `BYTEPLUS_API_KEY` |
+| fal | `fal-ai/minimax/video-01-live` | Yes | 1 image | No | `FAL_KEY` |
+| Google | `veo-3.1-fast-generate-preview` | Yes | 1 image | 1 video | `GEMINI_API_KEY` |
+| MiniMax | `MiniMax-Hailuo-2.3` | Yes | 1 image | No | `MINIMAX_API_KEY` |
+| OpenAI | `sora-2` | Yes | 1 image | 1 video | `OPENAI_API_KEY` |
+| Qwen | `wan2.6-t2v` | Yes | Yes (remote URL) | Yes (remote URL) | `QWEN_API_KEY` |
+| Runway | `gen4.5` | Yes | 1 image | 1 video | `RUNWAYML_API_SECRET` |
+| Together | `Wan-AI/Wan2.2-T2V-A14B` | Yes | 1 image | No | `TOGETHER_API_KEY` |
+| xAI | `grok-imagine-video` | Yes | 1 image | 1 video | `XAI_API_KEY` |
+
+Some providers accept additional or alternate API key env vars. See individual [provider pages](#related) for details.
+
+Run `video_generate action=list` to inspect available providers and models at runtime.
+
+## Tool parameters
+
+### Required
+
+| Parameter | Type | Description |
+| --------- | ------ | ----------------------------------------------------------------------------- |
+| `prompt` | string | Text description of the video to generate (required for `action: "generate"`) |
+
+### Content inputs
+
+| Parameter | Type | Description |
+| --------- | -------- | ------------------------------------ |
+| `image` | string | Single reference image (path or URL) |
+| `images` | string[] | Multiple reference images (up to 5) |
+| `video` | string | Single reference video (path or URL) |
+| `videos` | string[] | Multiple reference videos (up to 4) |
+
+### Style controls
+
+| Parameter | Type | Description |
+| ----------------- | ------- | ------------------------------------------------------------------------ |
+| `aspectRatio` | string | `1:1`, `2:3`, `3:2`, `3:4`, `4:3`, `4:5`, `5:4`, `9:16`, `16:9`, `21:9` |
+| `resolution` | string | `480P`, `720P`, or `1080P` |
+| `durationSeconds` | number | Target duration in seconds (rounded to nearest provider-supported value) |
+| `size` | string | Size hint when the provider supports it |
+| `audio` | boolean | Enable generated audio when supported |
+| `watermark` | boolean | Toggle provider watermarking when supported |
+
+### Advanced
+
+| Parameter | Type | Description |
+| ---------- | ------ | ----------------------------------------------- |
+| `action` | string | `"generate"` (default), `"status"`, or `"list"` |
+| `model` | string | Provider/model override (e.g. `runway/gen4.5`) |
+| `filename` | string | Output filename hint |
+
+Not all providers support all parameters. Unsupported overrides are ignored on a best-effort basis and reported as warnings in the tool result. Hard capability limits (such as too many reference inputs) fail before submission.
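+
+Putting several of these together, a single image-to-video call might look like this (a hypothetical call shape using only the documented parameters; the prompt and image URL are placeholders):
+
+```json5
+{
+  action: "generate",
+  prompt: "A friendly lobster surfing at sunset, cinematic lighting",
+  image: "https://example.com/lobster.png",
+  aspectRatio: "16:9",
+  resolution: "1080P",
+  durationSeconds: 5,
+  audio: true, // ignored with a warning on providers without audio support
+}
+```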
+
+## Actions
+
+- **generate** (default) -- create a video from the given prompt and optional reference inputs.
+- **status** -- check the state of the in-flight video task for the current session without starting a new one.
+- **list** -- show available providers, models, and their capabilities.
+
+## Model selection
+
+When generating a video, OpenClaw resolves the model in this order:
+
+1. **`model` tool parameter** -- if the agent specifies one in the call.
+2. **`videoGenerationModel.primary`** -- from config.
+3. **`videoGenerationModel.fallbacks`** -- tried in order.
+4. **Auto-detection** -- uses providers that have valid auth, starting with the current default provider, then remaining providers in alphabetical order.
+
+If a provider fails, the next candidate is tried automatically. If all candidates fail, the error includes details from each attempt.
```json5
{
agents: {
defaults: {
videoGenerationModel: {
- primary: "qwen/wan2.6-t2v",
+ primary: "google/veo-3.1-fast-generate-preview",
+ fallbacks: ["runway/gen4.5", "qwen/wan2.6-t2v"],
},
},
},
}
```
-3. Ask the agent: _"Generate a 5-second cinematic video of a friendly lobster surfing at sunset."_
+## Provider notes
-The agent calls `video_generate` automatically. No tool allow-listing needed — it's enabled by default when a provider is available.
-
-For direct synchronous contexts without a session-backed agent run, the tool still falls back to inline generation and returns the final media path in the tool result.
-
-## Supported providers
-
-| Provider | Default model | Reference inputs | API key |
-| -------- | ------------------------------- | ------------------ | ---------------------------------------------------------- |
-| Alibaba | `wan2.6-t2v` | Yes, remote URLs | `MODELSTUDIO_API_KEY`, `DASHSCOPE_API_KEY`, `QWEN_API_KEY` |
-| BytePlus | `seedance-1-0-lite-t2v-250428` | 1 image | `BYTEPLUS_API_KEY` |
-| fal | `fal-ai/minimax/video-01-live` | 1 image | `FAL_KEY` |
-| Google | `veo-3.1-fast-generate-preview` | 1 image or 1 video | `GEMINI_API_KEY`, `GOOGLE_API_KEY` |
-| MiniMax | `MiniMax-Hailuo-2.3` | 1 image | `MINIMAX_API_KEY` |
-| OpenAI | `sora-2` | 1 image or 1 video | `OPENAI_API_KEY` |
-| Qwen | `wan2.6-t2v` | Yes, remote URLs | `QWEN_API_KEY`, `MODELSTUDIO_API_KEY`, `DASHSCOPE_API_KEY` |
-| Runway | `gen4.5` | 1 image or 1 video | `RUNWAYML_API_SECRET`, `RUNWAY_API_KEY` |
-| Together | `Wan-AI/Wan2.2-T2V-A14B` | 1 image | `TOGETHER_API_KEY` |
-| xAI | `grok-imagine-video` | 1 image or 1 video | `XAI_API_KEY` |
-
-Use `action: "list"` to inspect available providers and models at runtime:
-
-```
-/tool video_generate action=list
-```
-
-## Tool parameters
-
-| Parameter | Type | Description |
-| ----------------- | -------- | ------------------------------------------------------------------------------------------------- |
-| `prompt` | string | Video generation prompt (required for `action: "generate"`) |
-| `action` | string | `"generate"` (default), `"status"` for the current session task, or `"list"` to inspect providers |
-| `model` | string | Provider/model override, e.g. `qwen/wan2.6-t2v` |
-| `image` | string | Single reference image path or URL |
-| `images` | string[] | Multiple reference images (up to 5) |
-| `video` | string | Single reference video path or URL |
-| `videos` | string[] | Multiple reference videos (up to 4) |
-| `size` | string | Size hint when the provider supports it |
-| `aspectRatio` | string | Aspect ratio: `1:1`, `2:3`, `3:2`, `3:4`, `4:3`, `4:5`, `5:4`, `9:16`, `16:9`, `21:9` |
-| `resolution` | string | Resolution hint: `480P`, `720P`, or `1080P` |
-| `durationSeconds` | number | Target duration in seconds. OpenClaw may round to the nearest provider-supported value |
-| `audio` | boolean | Enable generated audio when the provider supports it |
-| `watermark` | boolean | Toggle provider watermarking when supported |
-| `filename` | string | Output filename hint |
-
-Not all providers support all parameters. Unsupported optional overrides are ignored on a best-effort basis and reported back in the tool result as a warning. Hard capability limits such as too many reference inputs still fail before submission. When a provider or model only supports a discrete set of video lengths, OpenClaw rounds `durationSeconds` to the nearest supported value and reports the normalized duration in the tool result.
-
-## Async behavior
-
-- Session-backed agent runs: `video_generate` creates a background task, returns a started/task response immediately, and posts the finished video later in a follow-up agent message.
-- Duplicate prevention: while that background task is still `queued` or `running`, later `video_generate` calls in the same session return task status instead of starting another generation.
-- Status lookup: use `action: "status"` to inspect the active session-backed video task without starting a new one.
-- Task tracking: use `openclaw tasks list` / `openclaw tasks show ` to inspect queued, running, and terminal status for the generation.
-- Completion wake: OpenClaw injects an internal completion event back into the same session so the model can write the user-facing follow-up itself.
-- Prompt hint: later user/manual turns in the same session get a small runtime hint when a video task is already in flight so the model does not blindly call `video_generate` again.
-- No-session fallback: direct/local contexts without a real agent session still run inline and return the final video result in the same turn.
+| Provider | Notes |
+| -------- | ---------------------------------------------------------------------------------------------------------------------------------------- |
+| Alibaba | Uses DashScope/Model Studio async endpoint. Reference images and videos must be remote `http(s)` URLs. |
+| BytePlus | Single image reference only. |
+| fal | Uses queue-backed flow for long-running jobs. Single image reference only. |
+| Google | Uses Gemini/Veo. Supports one image or one video reference. |
+| MiniMax | Single image reference only. |
+| OpenAI | Only `size` override is forwarded. Other style overrides (`aspectRatio`, `resolution`, `audio`, `watermark`) are ignored with a warning. |
+| Qwen | Same DashScope backend as Alibaba. Reference inputs must be remote `http(s)` URLs; local files are rejected upfront. |
+| Runway | Supports local files via data URIs. Video-to-video requires `runway/gen4_aleph`. Text-only runs expose `16:9` and `9:16` aspect ratios. |
+| Together | Single image reference only. |
+| xAI | Supports text-to-video, image-to-video, and remote video edit/extend flows. |
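+
+For example, for the DashScope-backed providers (Alibaba and Qwen), reference inputs must already be hosted somewhere reachable. A call sketched with the documented parameters and placeholder values:
+
+```json5
+{
+  action: "generate",
+  model: "qwen/wan2.6-t2v",
+  prompt: "Animate this storyboard frame into a slow dolly shot",
+  image: "https://example.com/storyboard-frame.png", // local paths are rejected upfront
+}
+```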
## Configuration
-### Model selection
+Set the default video generation model in your OpenClaw config:
```json5
{
@@ -111,45 +167,25 @@ Not all providers support all parameters. Unsupported optional overrides are ign
}
```
-### Provider selection order
+Or via the CLI:
-When generating a video, OpenClaw tries providers in this order:
-
-1. **`model` parameter** from the tool call (if the agent specifies one)
-2. **`videoGenerationModel.primary`** from config
-3. **`videoGenerationModel.fallbacks`** in order
-4. **Auto-detection** — uses auth-backed provider defaults only:
- - current default provider first
- - remaining registered video-generation providers in provider-id order
-
-If a provider fails, the next candidate is tried automatically. If all fail, the error includes details from each attempt.
-
-## Provider notes
-
-- Alibaba uses the DashScope / Model Studio async video endpoint and currently requires remote `http(s)` URLs for reference assets.
-- Google uses Gemini/Veo and supports a single image or video reference input.
-- MiniMax, Together, BytePlus, and fal currently support a single image reference input.
-- OpenAI uses the native video endpoint and currently defaults to `sora-2`.
-- Qwen supports image/video references, but the upstream DashScope video endpoint currently requires remote `http(s)` URLs for those references.
-- Runway uses the native async task API with `GET /v1/tasks/{id}` polling and currently defaults to `gen4.5`.
-- xAI uses the native xAI video API and supports text-to-video, image-to-video, and remote video edit/extend flows.
-- fal uses the queue-backed fal video flow for long-running jobs instead of a single blocking inference request.
-
-## Qwen reference inputs
-
-The bundled Qwen provider supports text-to-video plus image/video reference modes, but the upstream DashScope video endpoint currently requires **remote http(s) URLs** for reference inputs. Local file paths and uploaded buffers are rejected up front instead of being silently ignored.
+```bash
+openclaw config set agents.defaults.videoGenerationModel.primary "qwen/wan2.6-t2v"
+```
## Related
-- [Tools Overview](/tools) — all available agent tools
-- [Background Tasks](/automation/tasks) — task tracking for detached `video_generate` runs
-- [Alibaba Model Studio](/providers/alibaba) — direct Wan provider setup
-- [Google (Gemini)](/providers/google) — Veo provider setup
-- [MiniMax](/providers/minimax) — Hailuo provider setup
-- [OpenAI](/providers/openai) — Sora provider setup
-- [Qwen](/providers/qwen) — Qwen-specific setup and limits
-- [Runway](/providers/runway) — Runway setup and current model/input notes
-- [Together AI](/providers/together) — Together Wan provider setup
-- [xAI](/providers/xai) — Grok video provider setup
-- [Configuration Reference](/gateway/configuration-reference#agent-defaults) — `videoGenerationModel` config
-- [Models](/concepts/models) — model configuration and failover
+- [Tools Overview](/tools)
+- [Background Tasks](/automation/tasks) -- task tracking for async video generation
+- [Alibaba Model Studio](/providers/alibaba)
+- [BytePlus](/providers/byteplus)
+- [fal](/providers/fal)
+- [Google (Gemini)](/providers/google)
+- [MiniMax](/providers/minimax)
+- [OpenAI](/providers/openai)
+- [Qwen](/providers/qwen)
+- [Runway](/providers/runway)
+- [Together AI](/providers/together)
+- [xAI](/providers/xai)
+- [Configuration Reference](/gateway/configuration-reference#agent-defaults)
+- [Models](/concepts/models)