feat(providers): add DeepInfra provider plugin (#73038)

* feat(providers): add DeepInfra provider plugin

* feat(deepinfra): add media provider surfaces

* fix(deepinfra): satisfy provider boundary checks

* docs: add gitcrawl maintainer skill

* test: include deepinfra in live media sweeps

* fix: remove stale tts contract import
Author: Peter Steinberger (committed by GitHub)
Date: 2026-04-28 01:12:54 +01:00
parent 1fde7dbc0e
commit 0294aebe6f
54 changed files with 2830 additions and 179 deletions

View File

@@ -95,6 +95,10 @@
"source": "Chutes",
"target": "Chutes"
},
{
"source": "DeepInfra",
"target": "DeepInfra"
},
{
"source": "Qwen",
"target": "Qwen"

View File

@@ -19,7 +19,7 @@ a per-agent SQLite database and needs no extra dependencies to get started.
## Getting started
If you have an API key for OpenAI, Gemini, Voyage, Mistral, or DeepInfra, the builtin
engine auto-detects it and enables vector search. No config needed.
To set a provider explicitly:
@@ -60,14 +60,15 @@ at a GGUF file:
## Supported embedding providers
| Provider | ID | Auto-detected | Notes |
| --------- | ----------- | ------------- | ----------------------------------- |
| OpenAI | `openai` | Yes | Default: `text-embedding-3-small` |
| Gemini | `gemini` | Yes | Supports multimodal (image + audio) |
| Voyage | `voyage` | Yes | |
| Mistral | `mistral` | Yes | |
| DeepInfra | `deepinfra` | Yes | Default: `BAAI/bge-m3` |
| Ollama | `ollama` | No | Local, set explicitly |
| Local | `local` | Yes (first) | Optional `node-llama-cpp` runtime |
Auto-detection picks the first provider whose API key can be resolved, in the
order shown. Set `memorySearch.provider` to override.
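The pick-first rule can be sketched in shell (env-var names follow each provider's standard key; this is a simplified sketch — the real resolver also honors config-file keys and the local engine's special first slot):

```bash
#!/usr/bin/env bash
# Simplified sketch of embedding auto-detection: return the first provider
# whose API key env var is set, in the documented order.
pick_embedding_provider() {
  for entry in "openai:OPENAI_API_KEY" "gemini:GEMINI_API_KEY" \
               "voyage:VOYAGE_API_KEY" "mistral:MISTRAL_API_KEY" \
               "deepinfra:DEEPINFRA_API_KEY"; do
    local provider="${entry%%:*}" var="${entry##*:}"
    if [ -n "${!var:-}" ]; then
      echo "$provider"
      return 0
    fi
  done
  echo "local" # nothing resolved: fall back to the bundled local engine
}
```

With only `DEEPINFRA_API_KEY` set, the sketch prints `deepinfra`; with no keys set, it falls through to `local`.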

View File

@@ -280,6 +280,7 @@ See [/providers/kilocode](/providers/kilocode) for setup details.
| BytePlus | `byteplus` / `byteplus-plan` | `BYTEPLUS_API_KEY` | `byteplus-plan/ark-code-latest` |
| Cerebras | `cerebras` | `CEREBRAS_API_KEY` | `cerebras/zai-glm-4.7` |
| Cloudflare AI Gateway | `cloudflare-ai-gateway` | `CLOUDFLARE_AI_GATEWAY_API_KEY` | — |
| DeepInfra | `deepinfra` | `DEEPINFRA_API_KEY` | `deepinfra/deepseek-ai/DeepSeek-V3.2` |
| DeepSeek | `deepseek` | `DEEPSEEK_API_KEY` | `deepseek/deepseek-v4-flash` |
| GitHub Copilot | `github-copilot` | `COPILOT_GITHUB_TOKEN` / `GH_TOKEN` / `GITHUB_TOKEN` | — |
| Groq | `groq` | `GROQ_API_KEY` | — |

View File

@@ -1331,6 +1331,7 @@
"providers/cloudflare-ai-gateway",
"providers/comfy",
"providers/deepgram",
"providers/deepinfra",
"providers/deepseek",
"providers/elevenlabs",
"providers/fal",

View File

@@ -468,6 +468,7 @@ If you want to rely on env keys (e.g. exported in your `~/.profile`), run local
- `<provider>:generate`
- `<provider>:edit` when the provider declares edit support
- Current bundled providers covered:
- `deepinfra`
- `fal`
- `google`
- `minimax`
@@ -477,6 +478,7 @@ If you want to rely on env keys (e.g. exported in your `~/.profile`), run local
- `xai`
- Optional narrowing:
- `OPENCLAW_LIVE_IMAGE_GENERATION_PROVIDERS="openai,google,openrouter,xai"`
- `OPENCLAW_LIVE_IMAGE_GENERATION_PROVIDERS="deepinfra"`
- `OPENCLAW_LIVE_IMAGE_GENERATION_MODELS="openai/gpt-image-2,google/gemini-3.1-flash-image-preview,openrouter/google/gemini-3.1-flash-image-preview,xai/grok-imagine-image"`
- `OPENCLAW_LIVE_IMAGE_GENERATION_CASES="google:flash-generate,google:pro-edit,openrouter:generate,xai:default-generate,xai:default-edit"`
- Optional auth behavior:
@@ -551,7 +553,7 @@ image-generation runtime, and the live provider request.
- `google` because the current shared Gemini/Veo lane uses local buffer-backed input and that path is not accepted in the shared sweep
- `openai` because the current shared lane lacks org-specific video inpaint/remix access guarantees
- Optional narrowing:
- `OPENCLAW_LIVE_VIDEO_GENERATION_PROVIDERS="deepinfra,google,openai,runway"`
- `OPENCLAW_LIVE_VIDEO_GENERATION_MODELS="google/veo-3.1-fast-generate-preview,openai/sora-2,runway/gen4_aleph"`
- `OPENCLAW_LIVE_VIDEO_GENERATION_SKIP_PROVIDERS=""` to include every provider in the default sweep, including FAL
- `OPENCLAW_LIVE_VIDEO_GENERATION_TIMEOUT_MS=60000` to reduce each provider operation cap for an aggressive smoke run

View File

@@ -0,0 +1,83 @@
---
summary: "Use DeepInfra's unified API to access the most popular open source and frontier models in OpenClaw"
read_when:
- You want a single API key for the top open source LLMs
- You want to run models via DeepInfra's API in OpenClaw
---
# DeepInfra
DeepInfra provides a **unified API** that routes requests to the most popular open-source and frontier models behind a single
endpoint and API key. It is OpenAI-compatible, so most OpenAI SDKs work by switching the base URL.
## Getting an API key
1. Go to [https://deepinfra.com/](https://deepinfra.com/)
2. Sign in or create an account
3. Navigate to Dashboard / Keys and generate a new API key, or use the auto-created one
## CLI setup
```bash
openclaw onboard --deepinfra-api-key <key>
```
Or set the environment variable:
```bash
export DEEPINFRA_API_KEY="<your-deepinfra-api-key>" # pragma: allowlist secret
```
## Config snippet
```json5
{
env: { DEEPINFRA_API_KEY: "<your-deepinfra-api-key>" }, // pragma: allowlist secret
agents: {
defaults: {
model: { primary: "deepinfra/deepseek-ai/DeepSeek-V3.2" },
},
},
}
```
## Supported OpenClaw surfaces
The bundled plugin registers all DeepInfra surfaces that match current
OpenClaw provider contracts:
| Surface | Default model | OpenClaw config/tool |
| ------------------------ | ---------------------------------- | -------------------------------------------------------- |
| Chat / model provider | `deepseek-ai/DeepSeek-V3.2` | `agents.defaults.model` |
| Image generation/editing | `black-forest-labs/FLUX-1-schnell` | `image_generate`, `agents.defaults.imageGenerationModel` |
| Media understanding | `moonshotai/Kimi-K2.5` for images | inbound image understanding |
| Speech-to-text | `openai/whisper-large-v3-turbo` | inbound audio transcription |
| Text-to-speech | `hexgrad/Kokoro-82M` | `messages.tts.provider: "deepinfra"` |
| Video generation | `Pixverse/Pixverse-T2V` | `video_generate`, `agents.defaults.videoGenerationModel` |
| Memory embeddings | `BAAI/bge-m3` | `agents.defaults.memorySearch.provider: "deepinfra"` |
DeepInfra also exposes reranking, classification, object-detection, and other
native model types. OpenClaw does not currently have first-class provider
contracts for those categories, so this plugin does not register them yet.
## Available models
OpenClaw discovers the available DeepInfra models dynamically at startup. Use
`/models deepinfra` to see the full list.
Any model available on [DeepInfra.com](https://deepinfra.com/) can be used with the `deepinfra/` prefix:
```
deepinfra/MiniMaxAI/MiniMax-M2.5
deepinfra/deepseek-ai/DeepSeek-V3.2
deepinfra/moonshotai/Kimi-K2.5
deepinfra/zai-org/GLM-5.1
...and many more
```
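Note that these refs carry two path levels: the `deepinfra/` routing prefix, then the upstream `<org>/<model>` id. A shell sketch of how such a ref splits:

```bash
# Split an OpenClaw model ref into routing prefix and upstream model id.
ref="deepinfra/deepseek-ai/DeepSeek-V3.2"
provider="${ref%%/*}" # routing prefix: deepinfra
model="${ref#*/}"     # upstream id kept intact: deepseek-ai/DeepSeek-V3.2
echo "$provider -> $model"
```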
## Notes
- Model refs are `deepinfra/<provider>/<model>` (e.g., `deepinfra/Qwen/Qwen3-Max`).
- Default model: `deepinfra/deepseek-ai/DeepSeek-V3.2`.
- Base URL: `https://api.deepinfra.com/v1/openai`.
- Native video generation uses `https://api.deepinfra.com/v1/inference/<model>`.
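Because the endpoint is OpenAI-compatible, the base URL above can be exercised directly. A minimal sketch, assuming the standard chat-completions request shape (using the unprefixed model id for direct API calls is an assumption):

```bash
#!/usr/bin/env bash
# Sketch of a direct chat request; base URL and model come from this page,
# the request shape is assumed from OpenAI compatibility.
BASE_URL="https://api.deepinfra.com/v1/openai"
MODEL="deepseek-ai/DeepSeek-V3.2" # no "deepinfra/" prefix outside OpenClaw

payload=$(printf '{"model": "%s", "messages": [{"role": "user", "content": "Say hello."}]}' "$MODEL")

# Only send the request when a key is available.
if [ -n "${DEEPINFRA_API_KEY:-}" ]; then
  curl -sS "$BASE_URL/chat/completions" \
    -H "Authorization: Bearer $DEEPINFRA_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$payload"
fi
```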

View File

@@ -31,6 +31,7 @@ model as `provider/model`.
- [Chutes](/providers/chutes)
- [ComfyUI](/providers/comfy)
- [Cloudflare AI Gateway](/providers/cloudflare-ai-gateway)
- [DeepInfra](/providers/deepinfra)
- [fal](/providers/fal)
- [Fireworks](/providers/fireworks)
- [GLM models](/providers/glm)

View File

@@ -84,8 +84,8 @@ See [Models](/providers/models) for pricing config and [Token use & costs](/refe
Inbound media can be summarized/transcribed before the reply runs. This uses model/provider APIs.
- Audio: OpenAI / Groq / Deepgram / DeepInfra / Google / Mistral.
- Image: OpenAI / OpenRouter / Anthropic / DeepInfra / Google / MiniMax / Moonshot / Qwen / Z.AI.
- Video: Google / Qwen / Moonshot.
See [Media understanding](/nodes/media-understanding).
@@ -94,8 +94,8 @@ See [Media understanding](/nodes/media-understanding).
Shared generation capabilities can also spend provider keys:
- Image generation: OpenAI / Google / DeepInfra / fal / MiniMax
- Video generation: DeepInfra / Qwen
Image generation can infer an auth-backed provider default when
`agents.defaults.imageGenerationModel` is unset. Video generation currently
@@ -113,6 +113,7 @@ Semantic memory search uses **embedding APIs** when configured for remote provid
- `memorySearch.provider = "gemini"` → Gemini embeddings
- `memorySearch.provider = "voyage"` → Voyage embeddings
- `memorySearch.provider = "mistral"` → Mistral embeddings
- `memorySearch.provider = "deepinfra"` → DeepInfra embeddings
- `memorySearch.provider = "lmstudio"` → LM Studio embeddings (local/self-hosted)
- `memorySearch.provider = "ollama"` → Ollama embeddings (local/self-hosted; typically no hosted API billing)
- Optional fallback to a remote provider if local embeddings fail

View File

@@ -46,12 +46,12 @@ See [Active Memory](/concepts/active-memory) for the activation model, plugin-ow
## Provider selection
| Key | Type | Default | Description |
| ---------- | --------- | ---------------- | -------------------------------------------------------------------------------------------------------------------------- |
| `provider` | `string` | auto-detected | Embedding adapter ID: `bedrock`, `deepinfra`, `gemini`, `github-copilot`, `local`, `mistral`, `ollama`, `openai`, `voyage` |
| `model` | `string` | provider default | Embedding model name |
| `fallback` | `string` | `"none"` | Fallback adapter ID when the primary fails |
| `enabled` | `boolean` | `true` | Enable or disable memory search |
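A sketch combining these keys (the `deepinfra` adapter ID and `BAAI/bge-m3` default come from this page; the `local` fallback is illustrative):

```json5
{
  agents: {
    defaults: {
      memorySearch: {
        provider: "deepinfra", // embedding adapter ID
        model: "BAAI/bge-m3",  // provider default, made explicit
        fallback: "local",     // illustrative fallback adapter
        enabled: true,
      },
    },
  },
}
```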
### Auto-detection order
@@ -76,6 +76,9 @@ When `provider` is not set, OpenClaw selects the first available:
<Step title="mistral">
Selected if a Mistral key can be resolved.
</Step>
<Step title="deepinfra">
Selected if a DeepInfra key can be resolved.
</Step>
<Step title="bedrock">
Selected if the AWS SDK credential chain resolves (instance role, access keys, profile, SSO, web identity, or shared config).
</Step>
@@ -87,15 +90,16 @@ When `provider` is not set, OpenClaw selects the first available:
Remote embeddings require an API key. Bedrock uses the AWS SDK default credential chain instead (instance roles, SSO, access keys).
| Provider | Env var | Config key |
| -------------- | -------------------------------------------------- | ----------------------------------- |
| Bedrock | AWS credential chain | No API key needed |
| DeepInfra | `DEEPINFRA_API_KEY` | `models.providers.deepinfra.apiKey` |
| Gemini | `GEMINI_API_KEY` | `models.providers.google.apiKey` |
| GitHub Copilot | `COPILOT_GITHUB_TOKEN`, `GH_TOKEN`, `GITHUB_TOKEN` | Auth profile via device login |
| Mistral | `MISTRAL_API_KEY` | `models.providers.mistral.apiKey` |
| Ollama | `OLLAMA_API_KEY` (placeholder) | -- |
| OpenAI | `OPENAI_API_KEY` | `models.providers.openai.apiKey` |
| Voyage | `VOYAGE_API_KEY` | `models.providers.voyage.apiKey` |
<Note>
Codex OAuth covers chat/completions only and does not satisfy embedding requests.

View File

@@ -1,5 +1,5 @@
---
summary: "Generate and edit images via image_generate across OpenAI, Google, fal, MiniMax, ComfyUI, DeepInfra, OpenRouter, LiteLLM, xAI, Vydra"
read_when:
- Generating or editing images via the agent
- Configuring image-generation providers and models
@@ -71,6 +71,7 @@ internal image endpoints remain blocked by default.
| OpenAI image generation with API billing | `openai/gpt-image-2` | `OPENAI_API_KEY` |
| OpenAI image generation with Codex subscription auth | `openai/gpt-image-2` | OpenAI Codex OAuth |
| OpenAI transparent-background PNG/WebP | `openai/gpt-image-1.5` | `OPENAI_API_KEY` or OpenAI Codex OAuth |
| DeepInfra image generation | `deepinfra/black-forest-labs/FLUX-1-schnell` | `DEEPINFRA_API_KEY` |
| OpenRouter image generation | `openrouter/google/gemini-3.1-flash-image-preview` | `OPENROUTER_API_KEY` |
| LiteLLM image generation | `litellm/gpt-image-2` | `LITELLM_API_KEY` |
| Google Gemini image generation | `google/gemini-3.1-flash-image-preview` | `GEMINI_API_KEY` or `GOOGLE_API_KEY` |
@@ -88,6 +89,7 @@ backend emits it.
| Provider | Default model | Edit support | Auth |
| ---------- | --------------------------------------- | ---------------------------------- | ----------------------------------------------------- |
| ComfyUI | `workflow` | Yes (1 image, workflow-configured) | `COMFY_API_KEY` or `COMFY_CLOUD_API_KEY` for cloud |
| DeepInfra | `black-forest-labs/FLUX-1-schnell` | Yes (1 image) | `DEEPINFRA_API_KEY` |
| fal | `fal-ai/flux/dev` | Yes | `FAL_KEY` |
| Google | `gemini-3.1-flash-image-preview` | Yes | `GEMINI_API_KEY` or `GOOGLE_API_KEY` |
| LiteLLM | `gpt-image-2` | Yes (up to 5 input images) | `LITELLM_API_KEY` |
@@ -105,13 +107,13 @@ Use `action: "list"` to inspect available providers and models at runtime:
## Provider capabilities
| Capability | ComfyUI | DeepInfra | fal | Google | MiniMax | OpenAI | Vydra | xAI |
| --------------------- | ------------------ | --------- | ----------------- | -------------- | --------------------- | -------------- | ----- | -------------- |
| Generate (max count) | Workflow-defined | 4 | 4 | 4 | 9 | 4 | 1 | 4 |
| Edit / reference | 1 image (workflow) | 1 image | 1 image | Up to 5 images | 1 image (subject ref) | Up to 5 images | — | Up to 5 images |
| Size control | — | ✓ | ✓ | ✓ | — | Up to 4K | — | — |
| Aspect ratio | — | — | ✓ (generate only) | ✓ | ✓ | — | — | ✓ |
| Resolution (1K/2K/4K) | — | — | ✓ | ✓ | — | — | — | 1K, 2K |
## Tool parameters
@@ -226,7 +228,7 @@ from each attempt.
### Image editing
OpenAI, OpenRouter, Google, DeepInfra, fal, MiniMax, ComfyUI, and xAI support editing
reference images. Pass a reference image path or URL:
```text

View File

@@ -50,6 +50,7 @@ provider is configured.
| Alibaba | | ✓ | | | | | |
| BytePlus | | ✓ | | | | | |
| ComfyUI | ✓ | ✓ | ✓ | | | | |
| DeepInfra | ✓ | ✓ | | ✓ | ✓ | | ✓ |
| Deepgram | | | | | ✓ | ✓ | |
| ElevenLabs | | | | ✓ | ✓ | | |
| fal | ✓ | ✓ | | | | | |
@@ -94,7 +95,7 @@ original channel.
## Speech-to-text and Voice Call
Deepgram, DeepInfra, ElevenLabs, Mistral, OpenAI, SenseAudio, and xAI can all transcribe
inbound audio through the batch `tools.media.audio` path when configured.
Channel plugins that preflight a voice note for mention gating or command
parsing mark the transcribed attachment on the inbound context, so the shared
@@ -116,6 +117,13 @@ vendor without waiting for a completed recording.
Image, video, batch TTS, batch STT, Voice Call streaming STT, backend
realtime voice, and memory-embedding surfaces.
</Accordion>
<Accordion title="DeepInfra">
Chat/model routing, image generation/editing, text-to-video, batch TTS,
batch STT, image media understanding, and memory-embedding surfaces.
DeepInfra-native rerank/classification/object-detection models are not
registered until OpenClaw has dedicated provider contracts for those
categories.
</Accordion>
<Accordion title="xAI">
Image, video, search, code-execution, batch TTS, batch STT, and Voice
Call streaming STT. xAI Realtime voice is an upstream capability but is

View File

@@ -8,7 +8,7 @@ title: "Text-to-speech"
sidebarTitle: "Text to speech (TTS)"
---
OpenClaw can convert outbound replies into audio across **14 speech providers**
and deliver native voice messages on Feishu, Matrix, Telegram, and WhatsApp,
audio attachments everywhere else, and PCM/Ulaw streams for telephony and Talk.
@@ -55,6 +55,7 @@ OpenClaw picks the first configured provider in registry auto-select order.
| Provider | Auth | Notes |
| ----------------- | ---------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------- |
| **Azure Speech** | `AZURE_SPEECH_KEY` + `AZURE_SPEECH_REGION` (also `AZURE_SPEECH_API_KEY`, `SPEECH_KEY`, `SPEECH_REGION`) | Native Ogg/Opus voice-note output and telephony. |
| **DeepInfra** | `DEEPINFRA_API_KEY` | OpenAI-compatible TTS. Defaults to `hexgrad/Kokoro-82M`. |
| **ElevenLabs** | `ELEVENLABS_API_KEY` or `XI_API_KEY` | Voice cloning, multilingual, deterministic via `seed`. |
| **Google Gemini** | `GEMINI_API_KEY` or `GOOGLE_API_KEY` | Gemini API TTS; persona-aware via `promptTemplate: "audio-profile-v1"`. |
| **Gradium** | `GRADIUM_API_KEY` | Voice-note and telephony output. |

View File

@@ -9,7 +9,7 @@ sidebarTitle: "Video generation"
---
OpenClaw agents can generate videos from text prompts, reference images, or
existing videos. Fifteen provider backends are supported, each with
different model options, input modes, and feature sets. The agent picks the
right provider automatically based on your configuration and available API
keys.
@@ -111,6 +111,7 @@ generation.
| BytePlus Seedance 1.5 | `seedance-1-5-pro-251215` | ✓ | Up to 2 images (first + last frame via role) | — | `BYTEPLUS_API_KEY` |
| BytePlus Seedance 2.0 | `dreamina-seedance-2-0-260128` | ✓ | Up to 9 reference images | Up to 3 videos | `BYTEPLUS_API_KEY` |
| ComfyUI | `workflow` | ✓ | 1 image | — | `COMFY_API_KEY` or `COMFY_CLOUD_API_KEY` |
| DeepInfra | `Pixverse/Pixverse-T2V` | ✓ | — | — | `DEEPINFRA_API_KEY` |
| fal | `fal-ai/minimax/video-01-live` | ✓ | 1 image; up to 9 with Seedance reference-to-video | Up to 3 videos with Seedance reference-to-video | `FAL_KEY` |
| Google | `veo-3.1-fast-generate-preview` | ✓ | 1 image | 1 video | `GEMINI_API_KEY` |
| MiniMax | `MiniMax-Hailuo-2.3` | ✓ | 1 image | — | `MINIMAX_API_KEY` or MiniMax OAuth |
@@ -132,20 +133,21 @@ runtime modes at runtime.
The explicit mode contract used by `video_generate`, contract tests, and
the shared live sweep:
| Provider  | `generate` | `imageToVideo` | `videoToVideo` | Shared live lanes today |
| --------- | :--------: | :------------: | :------------: | ---------------------------------------------------------------------------------------------------------------------------------------- |
| Alibaba   | ✓          | ✓              | ✓              | `generate`, `imageToVideo`; `videoToVideo` skipped because this provider needs remote `http(s)` video URLs |
| BytePlus  | ✓          | ✓              | —              | `generate`, `imageToVideo` |
| ComfyUI   | ✓          | ✓              | —              | Not in the shared sweep; workflow-specific coverage lives with Comfy tests |
| DeepInfra | ✓          | —              | —              | `generate`; native DeepInfra video schemas are text-to-video in the bundled contract |
| fal       | ✓          | ✓              | ✓              | `generate`, `imageToVideo`; `videoToVideo` only when using Seedance reference-to-video |
| Google    | ✓          | ✓              | ✓              | `generate`, `imageToVideo`; shared `videoToVideo` skipped because the current buffer-backed Gemini/Veo sweep does not accept that input |
| MiniMax   | ✓          | ✓              | —              | `generate`, `imageToVideo` |
| OpenAI    | ✓          | ✓              | ✓              | `generate`, `imageToVideo`; shared `videoToVideo` skipped because this org/input path currently needs provider-side inpaint/remix access |
| Qwen      | ✓          | ✓              | ✓              | `generate`, `imageToVideo`; `videoToVideo` skipped because this provider needs remote `http(s)` video URLs |
| Runway    | ✓          | ✓              | ✓              | `generate`, `imageToVideo`; `videoToVideo` runs only when the selected model is `runway/gen4_aleph` |
| Together  | ✓          | ✓              | —              | `generate`, `imageToVideo` |
| Vydra     | ✓          | ✓              | —              | `generate`; shared `imageToVideo` skipped because bundled `veo3` is text-only and bundled `kling` requires a remote image URL |
| xAI       | ✓          | ✓              | ✓              | `generate`, `imageToVideo`; `videoToVideo` skipped because this provider currently needs a remote MP4 URL |
## Tool parameters