feat: add xai media providers

Add xAI image generation and text-to-speech provider support with docs, live tests, and guarded provider HTTP handling.\n\nThanks @KateWilkins.
This commit is contained in:
KateWilkins
2026-04-23 00:07:39 +01:00
committed by GitHub
parent 386a0884d7
commit f342da5fcc
21 changed files with 1369 additions and 26 deletions

View File

@@ -1,5 +1,5 @@
---
summary: "Generate and edit images using configured providers (OpenAI, Google Gemini, fal, MiniMax, ComfyUI, Vydra)"
summary: "Generate and edit images using configured providers (OpenAI, Google Gemini, fal, MiniMax, ComfyUI, Vydra, xAI)"
read_when:
- Generating images via the agent
- Configuring image generation providers and models
@@ -46,6 +46,7 @@ The agent calls `image_generate` automatically. No tool allow-listing needed —
| MiniMax | `image-01` | Yes (subject reference) | `MINIMAX_API_KEY` or MiniMax OAuth (`minimax-portal`) |
| ComfyUI | `workflow` | Yes (1 image, workflow-configured) | `COMFY_API_KEY` or `COMFY_CLOUD_API_KEY` for cloud |
| Vydra | `grok-imagine` | No | `VYDRA_API_KEY` |
| xAI | `grok-imagine-image` | Yes (up to 5 images) | `XAI_API_KEY` |
Use `action: "list"` to inspect available providers and models at runtime:
@@ -115,13 +116,13 @@ Notes:
### Image editing
OpenAI, Google, fal, MiniMax, and ComfyUI support editing reference images. Pass a reference image path or URL:
OpenAI, Google, fal, MiniMax, ComfyUI, and xAI support editing reference images. Pass a reference image path or URL:
```
"Generate a watercolor version of this photo" + image: "/path/to/photo.jpg"
```
OpenAI and Google support up to 5 reference images via the `images` parameter. fal, MiniMax, and ComfyUI support 1.
OpenAI, Google, and xAI support up to 5 reference images via the `images` parameter. fal, MiniMax, and ComfyUI support 1.
### OpenAI `gpt-image-2`
@@ -166,13 +167,29 @@ MiniMax image generation is available through both bundled MiniMax auth paths:
## Provider capabilities
| Capability | OpenAI | Google | fal | MiniMax | ComfyUI | Vydra |
| --------------------- | -------------------- | -------------------- | ------------------- | -------------------------- | ---------------------------------- | ------- |
| Generate | Yes (up to 4) | Yes (up to 4) | Yes (up to 4) | Yes (up to 9) | Yes (workflow-defined outputs) | Yes (1) |
| Edit/reference | Yes (up to 5 images) | Yes (up to 5 images) | Yes (1 image) | Yes (1 image, subject ref) | Yes (1 image, workflow-configured) | No |
| Size control | Yes (up to 4K) | Yes | Yes | No | No | No |
| Aspect ratio | No | Yes | Yes (generate only) | Yes | No | No |
| Resolution (1K/2K/4K) | No | Yes | Yes | No | No | No |
| Capability | OpenAI | Google | fal | MiniMax | ComfyUI | Vydra | xAI |
| --------------------- | -------------------- | -------------------- | ------------------- | -------------------------- | ---------------------------------- | ------- | -------------------- |
| Generate | Yes (up to 4) | Yes (up to 4) | Yes (up to 4) | Yes (up to 9) | Yes (workflow-defined outputs) | Yes (1) | Yes (up to 4) |
| Edit/reference | Yes (up to 5 images) | Yes (up to 5 images) | Yes (1 image) | Yes (1 image, subject ref) | Yes (1 image, workflow-configured) | No | Yes (up to 5 images) |
| Size control | Yes (up to 4K) | Yes | Yes | No | No | No | No |
| Aspect ratio | No | Yes | Yes (generate only) | Yes | No | No | Yes |
| Resolution (1K/2K/4K) | No | Yes | Yes | No | No | No | Yes (1K/2K) |
### xAI `grok-imagine-image`
The bundled xAI provider uses `/v1/images/generations` for prompt-only requests
and `/v1/images/edits` when `image` or `images` is present.
- Models: `xai/grok-imagine-image`, `xai/grok-imagine-image-pro`
- Count: up to 4
- References: one `image` or up to five `images`
- Aspect ratios: `1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `2:3`, `3:2`
- Resolutions: `1K`, `2K`
- Outputs: returned as OpenClaw-managed image attachments
OpenClaw intentionally does not expose xAI-native `quality`, `mask`, `user`, or
extra native-only aspect ratios until those controls exist in the shared
cross-provider `image_generate` contract.
## Related
@@ -183,5 +200,6 @@ MiniMax image generation is available through both bundled MiniMax auth paths:
- [MiniMax](/providers/minimax) — MiniMax image provider setup
- [OpenAI](/providers/openai) — OpenAI Images provider setup
- [Vydra](/providers/vydra) — Vydra image, video, and speech setup
- [xAI](/providers/xai) — Grok image, video, search, code execution, and TTS setup
- [Configuration Reference](/gateway/configuration-reference#agent-defaults) — `imageGenerationModel` config
- [Models](/concepts/models) — model configuration and failover