feat(openrouter): add video generation provider (#72700)

Adds OpenRouter video generation via video_generate, with hardened async polling/download handling, docs, and regression coverage.

Validation:
- pnpm test src/plugins/plugin-lookup-table.test.ts src/secrets/target-registry.fast-path.test.ts src/gateway/server-startup-post-attach.test.ts extensions/openrouter/video-generation-provider.test.ts src/video-generation/live-test-helpers.test.ts src/media-generation/provider-capabilities.contract.test.ts src/agents/pi-embedded-helpers/failover-matches.test.ts src/plugins/manifest-metadata-scan.test.ts src/agents/openai-transport-stream.test.ts src/media-understanding/openai-compatible-audio.test.ts src/agents/schema-normalization-runtime-contract.test.ts src/agents/provider-request-config.test.ts src/plugin-sdk/provider-stream.test.ts src/agents/pi-embedded-runner/run/attempt.spawn-workspace.websocket.test.ts -- --reporter=verbose
- OPENCLAW_LIVE_TEST=1 OPENCLAW_LIVE_TEST_QUIET=0 OPENCLAW_LIVE_VIDEO_GENERATION_MODELS=openrouter/google/veo-3.1-fast pnpm test:live src/video-generation/video-generation.live.test.ts -- --runInBand

Co-authored-by: notamicrodose <gabrielkripalani@me.com>
This commit is contained in:
Gabriel Kripalani
2026-04-28 11:57:31 +02:00
committed by GitHub
parent 5915489631
commit 17ef9ef895
19 changed files with 957 additions and 21 deletions

View File

@@ -61,6 +61,7 @@ provider is configured.
| MiniMax | ✓ | ✓ | ✓ | ✓ | | | |
| Mistral | | | | | ✓ | | |
| OpenAI | ✓ | ✓ | | ✓ | ✓ | ✓ | ✓ |
| OpenRouter | ✓ | ✓ | | ✓ | | | ✓ |
| Qwen | | ✓ | | | | | |
| Runway | | ✓ | | | | | |
| SenseAudio | | | | | ✓ | | |

View File

@@ -1,5 +1,5 @@
---
summary: "Generate videos via video_generate from text, image, or video references across 14 provider backends"
summary: "Generate videos via video_generate from text, image, or video references across 16 provider backends"
read_when:
- Generating videos via the agent
- Configuring video-generation providers and models
@@ -9,7 +9,7 @@ sidebarTitle: "Video generation"
---
OpenClaw agents can generate videos from text prompts, reference images, or
existing videos. Fifteen provider backends are supported, each with
existing videos. Sixteen provider backends are supported, each with
different model options, input modes, and feature sets. The agent picks the
right provider automatically based on your configuration and available API
keys.
@@ -116,6 +116,7 @@ generation.
| Google | `veo-3.1-fast-generate-preview` | ✓ | 1 image | 1 video | `GEMINI_API_KEY` |
| MiniMax | `MiniMax-Hailuo-2.3` | ✓ | 1 image | — | `MINIMAX_API_KEY` or MiniMax OAuth |
| OpenAI | `sora-2` | ✓ | 1 image | 1 video | `OPENAI_API_KEY` |
| OpenRouter | `google/veo-3.1-fast` | ✓ | Up to 4 images (first/last frame or references) | — | `OPENROUTER_API_KEY` |
| Qwen | `wan2.6-t2v` | ✓ | Yes (remote URL) | Yes (remote URL) | `QWEN_API_KEY` |
| Runway | `gen4.5` | ✓ | 1 image | 1 video | `RUNWAYML_API_SECRET` |
| Together | `Wan-AI/Wan2.2-T2V-A14B` | ✓ | 1 image | — | `TOGETHER_API_KEY` |
@@ -133,21 +134,22 @@ runtime modes at runtime.
The explicit mode contract used by `video_generate`, contract tests, and
the shared live sweep:
| Provider | `generate` | `imageToVideo` | `videoToVideo` | Shared live lanes today |
| --------- | :--------: | :------------: | :------------: | ---------------------------------------------------------------------------------------------------------------------------------------- |
| Alibaba | ✓ | ✓ | ✓ | `generate`, `imageToVideo`; `videoToVideo` skipped because this provider needs remote `http(s)` video URLs |
| BytePlus | ✓ | ✓ | — | `generate`, `imageToVideo` |
| ComfyUI | ✓ | ✓ | — | Not in the shared sweep; workflow-specific coverage lives with Comfy tests |
| DeepInfra | ✓ | — | — | `generate`; native DeepInfra video schemas are text-to-video in the bundled contract |
| fal | ✓ | ✓ | ✓ | `generate`, `imageToVideo`; `videoToVideo` only when using Seedance reference-to-video |
| Google | ✓ | ✓ | ✓ | `generate`, `imageToVideo`; shared `videoToVideo` skipped because the current buffer-backed Gemini/Veo sweep does not accept that input |
| MiniMax | ✓ | ✓ | — | `generate`, `imageToVideo` |
| OpenAI | ✓ | ✓ | ✓ | `generate`, `imageToVideo`; shared `videoToVideo` skipped because this org/input path currently needs provider-side inpaint/remix access |
| Qwen | ✓ | ✓ | | `generate`, `imageToVideo`; `videoToVideo` skipped because this provider needs remote `http(s)` video URLs |
| Runway | ✓ | ✓ | ✓ | `generate`, `imageToVideo`; `videoToVideo` runs only when the selected model is `runway/gen4_aleph` |
| Together | ✓ | ✓ | | `generate`, `imageToVideo` |
| Vydra | ✓ | ✓ | — | `generate`; shared `imageToVideo` skipped because bundled `veo3` is text-only and bundled `kling` requires a remote image URL |
| xAI | ✓ | ✓ | | `generate`, `imageToVideo`; `videoToVideo` skipped because this provider currently needs a remote MP4 URL |
| Provider | `generate` | `imageToVideo` | `videoToVideo` | Shared live lanes today |
| ---------- | :--------: | :------------: | :------------: | ---------------------------------------------------------------------------------------------------------------------------------------- |
| Alibaba | ✓ | ✓ | ✓ | `generate`, `imageToVideo`; `videoToVideo` skipped because this provider needs remote `http(s)` video URLs |
| BytePlus | ✓ | ✓ | — | `generate`, `imageToVideo` |
| ComfyUI | ✓ | ✓ | — | Not in the shared sweep; workflow-specific coverage lives with Comfy tests |
| DeepInfra | ✓ | — | — | `generate`; native DeepInfra video schemas are text-to-video in the bundled contract |
| fal | ✓ | ✓ | ✓ | `generate`, `imageToVideo`; `videoToVideo` only when using Seedance reference-to-video |
| Google | ✓ | ✓ | ✓ | `generate`, `imageToVideo`; shared `videoToVideo` skipped because the current buffer-backed Gemini/Veo sweep does not accept that input |
| MiniMax | ✓ | ✓ | — | `generate`, `imageToVideo` |
| OpenAI | ✓ | ✓ | ✓ | `generate`, `imageToVideo`; shared `videoToVideo` skipped because this org/input path currently needs provider-side inpaint/remix access |
| OpenRouter | ✓ | ✓ | | `generate`, `imageToVideo` |
| Qwen | ✓ | ✓ | ✓ | `generate`, `imageToVideo`; `videoToVideo` skipped because this provider needs remote `http(s)` video URLs |
| Runway | ✓ | ✓ | | `generate`, `imageToVideo`; `videoToVideo` runs only when the selected model is `runway/gen4_aleph` |
| Together | ✓ | ✓ | — | `generate`, `imageToVideo` |
| Vydra | ✓ | ✓ | | `generate`; shared `imageToVideo` skipped because bundled `veo3` is text-only and bundled `kling` requires a remote image URL |
| xAI | ✓ | ✓ | ✓ | `generate`, `imageToVideo`; `videoToVideo` skipped because this provider currently needs a remote MP4 URL |
## Tool parameters
@@ -389,6 +391,13 @@ only the explicit `model`, `primary`, and `fallbacks` entries.
(`aspectRatio`, `resolution`, `audio`, `watermark`) are ignored with
a warning.
</Accordion>
<Accordion title="OpenRouter">
Uses OpenRouter's asynchronous `/videos` API. OpenClaw submits the
job, polls `polling_url`, and downloads either `unsigned_urls` or the
documented job content endpoint. The bundled `google/veo-3.1-fast` default
advertises 4/6/8 second durations, `720P`/`1080P` resolutions, and
`16:9`/`9:16` aspect ratios.
</Accordion>
<Accordion title="Qwen">
Same DashScope backend as Alibaba. Reference inputs must be remote
`http(s)` URLs; local files are rejected upfront.