mirror of
https://github.com/openclaw/openclaw.git
synced 2026-05-18 17:34:45 +00:00
feat: add fal and OpenRouter music generation (#82789)
* feat: add fal and OpenRouter music generation
* fix: repair music generation CI gates
* chore: refresh proof gate
Committed by: GitHub
Parent: 562d460d75
Commit: f453904165
@@ -1,2 +1,2 @@
-1b2d60a1ce15bdac9db5259df0480a6073646faf1de81d88bf53dc6e43ae2949 plugin-sdk-api-baseline.json
-d76b67aa2618604da379147f44ac0746850bc5f5174404c979dc82ec6c45e05d plugin-sdk-api-baseline.jsonl
+2c665b045d30f690c5fd6adb89481a003d5cc55ab4eed1a0456ef47136f6b684 plugin-sdk-api-baseline.json
+f4b6c016576cd19409356ef23d18da0e54cb6c5904f864049461ace921e1f72c plugin-sdk-api-baseline.jsonl

@@ -328,24 +328,24 @@ OpenClaw reads this before provider runtime loads.
 Provider setup lists use these manifest choices, descriptor-derived setup
 choices, and install-catalog metadata without loading provider runtime.

-| Field | Required | Type | What it means |
-| --------------------- | -------- | ----------------------------------------------- | -------------------------------------------------------------------------------------------------------- |
-| `provider` | Yes | `string` | Provider id this choice belongs to. |
-| `method` | Yes | `string` | Auth method id to dispatch to. |
-| `choiceId` | Yes | `string` | Stable auth-choice id used by onboarding and CLI flows. |
-| `choiceLabel` | No | `string` | User-facing label. If omitted, OpenClaw falls back to `choiceId`. |
-| `choiceHint` | No | `string` | Short helper text for the picker. |
-| `assistantPriority` | No | `number` | Lower values sort earlier in assistant-driven interactive pickers. |
-| `assistantVisibility` | No | `"visible"` \| `"manual-only"` | Hide the choice from assistant pickers while still allowing manual CLI selection. |
-| `deprecatedChoiceIds` | No | `string[]` | Legacy choice ids that should redirect users to this replacement choice. |
-| `groupId` | No | `string` | Optional group id for grouping related choices. |
-| `groupLabel` | No | `string` | User-facing label for that group. |
-| `groupHint` | No | `string` | Short helper text for the group. |
-| `optionKey` | No | `string` | Internal option key for simple one-flag auth flows. |
-| `cliFlag` | No | `string` | CLI flag name, such as `--openrouter-api-key`. |
-| `cliOption` | No | `string` | Full CLI option shape, such as `--openrouter-api-key <key>`. |
-| `cliDescription` | No | `string` | Description used in CLI help. |
-| `onboardingScopes` | No | `Array<"text-inference" \| "image-generation">` | Which onboarding surfaces this choice should appear in. If omitted, it defaults to `["text-inference"]`. |
+| Field | Required | Type | What it means |
+| --------------------- | -------- | --------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------- |
+| `provider` | Yes | `string` | Provider id this choice belongs to. |
+| `method` | Yes | `string` | Auth method id to dispatch to. |
+| `choiceId` | Yes | `string` | Stable auth-choice id used by onboarding and CLI flows. |
+| `choiceLabel` | No | `string` | User-facing label. If omitted, OpenClaw falls back to `choiceId`. |
+| `choiceHint` | No | `string` | Short helper text for the picker. |
+| `assistantPriority` | No | `number` | Lower values sort earlier in assistant-driven interactive pickers. |
+| `assistantVisibility` | No | `"visible"` \| `"manual-only"` | Hide the choice from assistant pickers while still allowing manual CLI selection. |
+| `deprecatedChoiceIds` | No | `string[]` | Legacy choice ids that should redirect users to this replacement choice. |
+| `groupId` | No | `string` | Optional group id for grouping related choices. |
+| `groupLabel` | No | `string` | User-facing label for that group. |
+| `groupHint` | No | `string` | Short helper text for the group. |
+| `optionKey` | No | `string` | Internal option key for simple one-flag auth flows. |
+| `cliFlag` | No | `string` | CLI flag name, such as `--openrouter-api-key`. |
+| `cliOption` | No | `string` | Full CLI option shape, such as `--openrouter-api-key <key>`. |
+| `cliDescription` | No | `string` | Description used in CLI help. |
+| `onboardingScopes` | No | `Array<"text-inference" \| "image-generation" \| "music-generation">` | Which onboarding surfaces this choice should appear in. If omitted, it defaults to `["text-inference"]`. |

 ## commandAliases reference

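To illustrate the `onboardingScopes` and `choiceLabel` defaults described in the table in the hunk above, here is a minimal TypeScript sketch. The `ProviderAuthChoice` type, the two helper functions, and the `method` and `choiceId` values are hypothetical names made up for this example, not OpenClaw APIs. Only the field names, the three scope literals, and the documented defaults come from the table.

```typescript
// Hypothetical types for illustration; field names mirror the manifest table.
type OnboardingScope = "text-inference" | "image-generation" | "music-generation";

interface ProviderAuthChoice {
  provider: string;
  method: string;
  choiceId: string;
  choiceLabel?: string;
  onboardingScopes?: OnboardingScope[];
}

// Assumed helper: applies the documented default when the field is omitted.
function resolveOnboardingScopes(choice: ProviderAuthChoice): OnboardingScope[] {
  return choice.onboardingScopes ?? ["text-inference"];
}

// Assumed helper: falls back to choiceId when no label is given.
function resolveChoiceLabel(choice: ProviderAuthChoice): string {
  return choice.choiceLabel ?? choice.choiceId;
}

const falChoice: ProviderAuthChoice = {
  provider: "fal",
  method: "api-key", // hypothetical method id
  choiceId: "fal-api-key", // hypothetical choice id
  onboardingScopes: ["image-generation", "music-generation"],
};

const bareChoice: ProviderAuthChoice = {
  provider: "openrouter",
  method: "api-key", // hypothetical method id
  choiceId: "openrouter-api-key", // hypothetical choice id
};
```

With these assumptions, `bareChoice` would appear only on the text-inference surface, while `falChoice` opts in to the image and music surfaces explicitly.
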
@@ -16,7 +16,7 @@ Adds fal model provider support to OpenClaw.

 ## Surface

-providers: fal; contracts: imageGenerationProviders, videoGenerationProviders
+providers: fal; contracts: imageGenerationProviders, musicGenerationProviders, videoGenerationProviders

 ## Related docs


@@ -16,7 +16,7 @@ Adds OpenRouter model provider support to OpenClaw.

 ## Surface

-providers: openrouter; contracts: imageGenerationProviders, mediaUnderstandingProviders, speechProviders, videoGenerationProviders
+providers: openrouter; contracts: imageGenerationProviders, mediaUnderstandingProviders, musicGenerationProviders, speechProviders, videoGenerationProviders

 ## Related docs


@@ -1,13 +1,14 @@
 ---
-summary: "fal image and video generation setup in OpenClaw"
+summary: "fal image, video, and music generation setup in OpenClaw"
 title: "Fal"
 read_when:
 - You want to use fal image generation in OpenClaw
 - You need the FAL_KEY auth flow
-- You want fal defaults for image_generate or video_generate
+- You want fal defaults for image_generate, video_generate, or music_generate
 ---

-OpenClaw ships a bundled `fal` provider for hosted image and video generation.
+OpenClaw ships a bundled `fal` provider for hosted image, video, and music
+generation.

 | Property | Value |
 | -------- | ------------------------------------------------------------- |
@@ -151,6 +152,35 @@ The bundled `fal` video-generation provider defaults to
 </Accordion>
 </AccordionGroup>

+## Music generation
+
+The bundled `fal` plugin also registers a music-generation provider for the
+shared `music_generate` tool.
+
+| Capability | Value |
+| ------------- | ------------------------------------------------------------------------------------------------------ |
+| Default model | `fal/fal-ai/minimax-music/v2.6` |
+| Models | `fal-ai/minimax-music/v2.6`, `fal-ai/ace-step/prompt-to-audio`, `fal-ai/stable-audio-25/text-to-audio` |
+| Runtime | Synchronous request plus generated audio download |
+
+Use fal as the default music provider:
+
+```json5
+{
+  agents: {
+    defaults: {
+      musicGenerationModel: {
+        primary: "fal/fal-ai/minimax-music/v2.6",
+      },
+    },
+  },
+}
+```
+
+`fal-ai/minimax-music/v2.6` supports explicit lyrics and instrumental mode.
+ACE-Step and Stable Audio are prompt-to-audio endpoints; choose them with the
+`model` override when you want those model families.
+
 <Tip>
 Use `openclaw models list --provider fal` to see the full list of available fal
 models, including any recently added entries.
@@ -165,7 +195,10 @@ models, including any recently added entries.
 <Card title="Video generation" href="/tools/video-generation" icon="video">
 Shared video tool parameters and provider selection.
 </Card>
+<Card title="Music generation" href="/tools/music-generation" icon="music">
+Shared music tool parameters and provider selection.
+</Card>
 <Card title="Configuration reference" href="/gateway/config-agents#agent-defaults" icon="gear">
-Agent defaults including image and video model selection.
+Agent defaults including image, video, and music model selection.
 </Card>
 </CardGroup>

@@ -4,6 +4,7 @@ read_when:
 - You want a single API key for many LLMs
 - You want to run models via OpenRouter in OpenClaw
 - You want to use OpenRouter for image generation
+- You want to use OpenRouter for music generation
 - You want to use OpenRouter for video generation
 title: "OpenRouter"
 ---
@@ -107,6 +108,34 @@ second durations, `720P`/`1080P` resolutions, and `16:9`/`9:16` aspect
 ratios. Video-to-video is not registered for OpenRouter because the upstream
 video generation API currently accepts text and image references.

+## Music generation
+
+OpenRouter can also back the `music_generate` tool through chat completions
+audio output. Use an OpenRouter audio model under
+`agents.defaults.musicGenerationModel`:
+
+```json5
+{
+  env: { OPENROUTER_API_KEY: "sk-or-..." },
+  agents: {
+    defaults: {
+      musicGenerationModel: {
+        primary: "openrouter/google/lyria-3-pro-preview",
+        timeoutMs: 180_000,
+      },
+    },
+  },
+}
+```
+
+The bundled OpenRouter music provider defaults to
+`google/lyria-3-pro-preview` and also exposes
+`google/lyria-3-clip-preview`. OpenClaw sends `modalities: ["text",
+"audio"]`, enables streaming, collects the streamed audio chunks, and saves
+the result as generated media for channel delivery. Reference images are
+accepted for Lyria models through the shared `music_generate image=...`
+parameter.
+
 ## Text-to-speech

 OpenRouter can also be used as a TTS provider through its OpenAI-compatible

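The OpenRouter music section added in the hunk above says OpenClaw sends `modalities: ["text", "audio"]`, enables streaming, and collects the streamed audio chunks. The following is a rough TypeScript sketch of that collection step, not the bundled implementation. The chunk shape (`choices[].delta.audio.data` holding base64 fragments), the `stream: true` request field, and the prompt text are assumptions modeled on OpenAI-style streaming responses; the diff does not confirm them. The sketch also assumes the fragments are slices of one continuous base64 stream.

```typescript
// Assumed request body; only `model` and `modalities` are taken from the docs above.
const requestBody = {
  model: "google/lyria-3-pro-preview",
  modalities: ["text", "audio"],
  stream: true, // assumption: standard chat-completions streaming flag
  messages: [{ role: "user", content: "An energetic chiptune loop" }],
};

// Assumed shape of one streamed chunk (not confirmed by the source).
interface AudioStreamChunk {
  choices: Array<{ delta: { audio?: { data?: string } } }>;
}

// Joins base64 fragments in arrival order. Decoding is left to the caller and
// should happen once on the joined string, because individual fragments need
// not align on base64 4-character boundaries.
function collectAudioBase64(chunks: Iterable<AudioStreamChunk>): string {
  const parts: string[] = [];
  for (const chunk of chunks) {
    for (const choice of chunk.choices) {
      const data = choice.delta.audio?.data;
      if (data) parts.push(data);
    }
  }
  return parts.join("");
}
```

Chunks that carry only text deltas are skipped, so the helper can be fed the whole stream without pre-filtering.
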
@@ -60,7 +60,7 @@ telephony, meetings, browser realtime, and native push-to-talk clients.
 | DeepInfra | ✓ | ✓ | | ✓ | ✓ | | ✓ |
 | Deepgram | | | | | ✓ | ✓ | |
 | ElevenLabs | | | | ✓ | ✓ | | |
-| fal | ✓ | ✓ | | | | | |
+| fal | ✓ | ✓ | ✓ | | | | |
 | Google | ✓ | ✓ | ✓ | ✓ | | ✓ | ✓ |
 | Gradium | | | | ✓ | | | |
 | Local CLI | | | | ✓ | | | |
@@ -68,7 +68,7 @@ telephony, meetings, browser realtime, and native push-to-talk clients.
 | MiniMax | ✓ | ✓ | ✓ | ✓ | | | |
 | Mistral | | | | | ✓ | | |
 | OpenAI | ✓ | ✓ | | ✓ | ✓ | ✓ | ✓ |
-| OpenRouter | ✓ | ✓ | | ✓ | ✓ | | ✓ |
+| OpenRouter | ✓ | ✓ | ✓ | ✓ | ✓ | | ✓ |
 | Qwen | | ✓ | | | | | |
 | Runway | | ✓ | | | | | |
 | SenseAudio | | | | | ✓ | | |

@@ -1,5 +1,5 @@
 ---
-summary: "Generate music via music_generate across Google Lyria, MiniMax, and ComfyUI workflows"
+summary: "Generate music via music_generate across ComfyUI, fal, Google Lyria, MiniMax, and OpenRouter workflows"
 read_when:
 - Generating music or audio via the agent
 - Configuring music-generation providers and models
@@ -9,8 +9,8 @@ sidebarTitle: "Music generation"
 ---

 The `music_generate` tool lets the agent create music or audio through the
-shared music-generation capability with configured providers — Google,
-MiniMax, and workflow-configured ComfyUI today.
+shared music-generation capability with configured providers — ComfyUI,
+fal, Google, MiniMax, and OpenRouter today.

 For session-backed agent runs, OpenClaw starts music generation as a
 background task, tracks it in the task ledger, then wakes the agent again
@@ -94,22 +94,26 @@ Generate an energetic chiptune loop about launching a rocket at sunrise.

 ## Supported providers

-| Provider | Default model | Reference inputs | Supported controls | Auth |
-| -------- | ---------------------- | ---------------- | --------------------------------------------------------- | -------------------------------------- |
-| ComfyUI | `workflow` | Up to 1 image | Workflow-defined music or audio | `COMFY_API_KEY`, `COMFY_CLOUD_API_KEY` |
-| Google | `lyria-3-clip-preview` | Up to 10 images | `lyrics`, `instrumental`, `format` | `GEMINI_API_KEY`, `GOOGLE_API_KEY` |
-| MiniMax | `music-2.6` | None | `lyrics`, `instrumental`, `durationSeconds`, `format=mp3` | `MINIMAX_API_KEY` or MiniMax OAuth |
+| Provider | Default model | Reference inputs | Supported controls | Auth |
+| ---------- | ---------------------------- | ---------------- | --------------------------------------------------------- | -------------------------------------- |
+| ComfyUI | `workflow` | Up to 1 image | Workflow-defined music or audio | `COMFY_API_KEY`, `COMFY_CLOUD_API_KEY` |
+| fal | `fal-ai/minimax-music/v2.6` | None | `lyrics`, `instrumental`, `durationSeconds`, `format` | `FAL_KEY` or `FAL_API_KEY` |
+| Google | `lyria-3-clip-preview` | Up to 10 images | `lyrics`, `instrumental`, `format` | `GEMINI_API_KEY`, `GOOGLE_API_KEY` |
+| MiniMax | `music-2.6` | None | `lyrics`, `instrumental`, `durationSeconds`, `format=mp3` | `MINIMAX_API_KEY` or MiniMax OAuth |
+| OpenRouter | `google/lyria-3-pro-preview` | Up to 1 image | `lyrics`, `instrumental`, `durationSeconds`, `format` | `OPENROUTER_API_KEY` |

 ### Capability matrix

 The explicit mode contract used by `music_generate`, contract tests, and the
 shared live sweep:

-| Provider | `generate` | `edit` | Edit limit | Shared live lanes |
-| -------- | :--------: | :----: | ---------- | ------------------------------------------------------------------------- |
-| ComfyUI | ✓ | ✓ | 1 image | Not in the shared sweep; covered by `extensions/comfy/comfy.live.test.ts` |
-| Google | ✓ | ✓ | 10 images | `generate`, `edit` |
-| MiniMax | ✓ | — | None | `generate` |
+| Provider | `generate` | `edit` | Edit limit | Shared live lanes |
+| ---------- | :--------: | :----: | ---------- | ------------------------------------------------------------------------- |
+| ComfyUI | ✓ | ✓ | 1 image | Not in the shared sweep; covered by `extensions/comfy/comfy.live.test.ts` |
+| fal | ✓ | — | None | `generate` |
+| Google | ✓ | ✓ | 10 images | `generate`, `edit` |
+| MiniMax | ✓ | — | None | `generate` |
+| OpenRouter | ✓ | ✓ | 1 image | `generate`, `edit` |

 Use `action: "list"` to inspect available shared providers and models at
 runtime:
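The updated capability matrix in the hunk above can be read as a small lookup table. The TypeScript sketch below encodes the new rows and checks whether a provider could serve a request with a given number of reference images. The `capabilities` record, the provider keys, and the `canHandle` helper are hypothetical names for this example. It also assumes that supplying reference images selects `edit` mode, which the diff implies but does not state outright.

```typescript
type MusicMode = "generate" | "edit";

interface MusicCapability {
  modes: MusicMode[];
  maxReferenceImages: number;
}

// Values copied from the updated capability matrix; the record itself is hypothetical.
const capabilities: Record<string, MusicCapability> = {
  comfy: { modes: ["generate", "edit"], maxReferenceImages: 1 },
  fal: { modes: ["generate"], maxReferenceImages: 0 },
  google: { modes: ["generate", "edit"], maxReferenceImages: 10 },
  minimax: { modes: ["generate"], maxReferenceImages: 0 },
  openrouter: { modes: ["generate", "edit"], maxReferenceImages: 1 },
};

// Assumption: a request with reference images needs `edit`; one without needs `generate`.
function canHandle(provider: string, referenceImages: number): boolean {
  const cap = capabilities[provider];
  if (!cap) return false;
  if (referenceImages === 0) return cap.modes.includes("generate");
  return cap.modes.includes("edit") && referenceImages <= cap.maxReferenceImages;
}
```

Under these assumptions a request with one reference image could go to ComfyUI, Google, or OpenRouter, but not to fal or MiniMax.
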
@@ -225,7 +229,7 @@ openclaw tasks cancel <taskId>
     defaults: {
       musicGenerationModel: {
         primary: "google/lyria-3-clip-preview",
-        fallbacks: ["minimax/music-2.6"],
+        fallbacks: ["fal/fal-ai/minimax-music/v2.6", "minimax/music-2.6"],
       },
     },
   },
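The config change in the hunk above adds a fal entry ahead of MiniMax in `fallbacks`. As a sketch of how such a list might be consumed, the TypeScript below splits a `provider/model` reference and builds an ordered candidate list. Both helpers are hypothetical and are not OpenClaw's resolver. Two things are assumptions here: that the provider id is everything before the first `/`, and that candidates are tried in the order explicit override, `primary`, then `fallbacks`.

```typescript
interface MusicGenerationModelConfig {
  primary: string;
  fallbacks?: string[];
}

// Assumption: the provider id is everything before the first "/", so fal model
// ids that contain slashes stay intact.
function splitModelRef(ref: string): { provider: string; model: string } {
  const slash = ref.indexOf("/");
  if (slash < 0) throw new Error(`model ref "${ref}" has no provider prefix`);
  return { provider: ref.slice(0, slash), model: ref.slice(slash + 1) };
}

// Assumption: candidates are tried as explicit override, then primary, then
// fallbacks, with duplicates dropped while keeping first occurrence.
function candidateRefs(config: MusicGenerationModelConfig, override?: string): string[] {
  const ordered = [override, config.primary, ...(config.fallbacks ?? [])];
  const present = ordered.filter((ref): ref is string => typeof ref === "string");
  return Array.from(new Set(present));
}

const musicConfig: MusicGenerationModelConfig = {
  primary: "google/lyria-3-clip-preview",
  fallbacks: ["fal/fal-ai/minimax-music/v2.6", "minimax/music-2.6"],
};
```

Splitting on the first slash only is what lets `fal/fal-ai/minimax-music/v2.6` map to provider `fal` with model `fal-ai/minimax-music/v2.6`, matching the default model listed in the provider table.
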
@@ -258,6 +262,12 @@ explicit `model`, `primary`, and `fallbacks` entries.
 shared `music_generate` tool through the music-generation provider
 registry.
 </Accordion>
+<Accordion title="fal">
+Uses fal model endpoints through the shared provider auth path. The
+bundled provider defaults to `fal-ai/minimax-music/v2.6` and also exposes
+`fal-ai/ace-step/prompt-to-audio` and
+`fal-ai/stable-audio-25/text-to-audio` for prompt-to-audio requests.
+</Accordion>
 <Accordion title="Google (Lyria 3)">
 Uses Lyria 3 batch generation. The current bundled flow supports
 prompt, optional lyrics text, and optional reference images.
@@ -267,6 +277,11 @@ explicit `model`, `primary`, and `fallbacks` entries.
 lyrics, instrumental mode, duration steering, and mp3 output through
 either `minimax` API-key auth or `minimax-portal` OAuth.
 </Accordion>
+<Accordion title="OpenRouter">
+Uses OpenRouter chat completions audio output with streaming enabled. The
+bundled provider defaults to `google/lyria-3-pro-preview` and also exposes
+`openrouter/google/lyria-3-clip-preview`.
+</Accordion>
 </AccordionGroup>

 ## Choosing the right path
@@ -278,8 +293,8 @@ explicit `model`, `primary`, and `fallbacks` entries.

 If you are debugging ComfyUI-specific behavior, see
 [ComfyUI](/providers/comfy). If you are debugging shared provider
-behavior, start with [Google (Gemini)](/providers/google) or
-[MiniMax](/providers/minimax).
+behavior, start with [fal](/providers/fal), [Google (Gemini)](/providers/google),
+[MiniMax](/providers/minimax), or [OpenRouter](/providers/openrouter).

 ## Provider capability modes

@@ -331,7 +346,9 @@ profiles by default, and runs both `generate` and declared `edit` coverage when
 the provider enables edit mode. Coverage today:

 - `google`: `generate` plus `edit`
+- `fal`: `generate` only
 - `minimax`: `generate` only
+- `openrouter`: `generate` plus `edit`
 - `comfy`: separate Comfy live coverage, not the shared provider sweep

 Opt-in live coverage for the bundled ComfyUI music path: