feat: add xai media providers

Add xAI image generation and text-to-speech provider support with docs, live tests, and guarded provider HTTP handling.

Thanks @KateWilkins.
2026-05-06 14:10:51 +00:00 · 2026-04-23 00:07:39 +01:00
parent 386a0884d7
commit f342da5fcc
21 changed files with 1369 additions and 26 deletions
--- a/docs/tools/image-generation.md
+++ b/docs/tools/image-generation.md
@@ -1,5 +1,5 @@
 ---
-summary: "Generate and edit images using configured providers (OpenAI, Google Gemini, fal, MiniMax, ComfyUI, Vydra)"
+summary: "Generate and edit images using configured providers (OpenAI, Google Gemini, fal, MiniMax, ComfyUI, Vydra, xAI)"
 read_when:
  - Generating images via the agent
  - Configuring image generation providers and models
@@ -46,6 +46,7 @@ The agent calls `image_generate` automatically. No tool allow-listing needed —
 | MiniMax  | `image-01`                       | Yes (subject reference)            | `MINIMAX_API_KEY` or MiniMax OAuth (`minimax-portal`) |
 | ComfyUI  | `workflow`                       | Yes (1 image, workflow-configured) | `COMFY_API_KEY` or `COMFY_CLOUD_API_KEY` for cloud    |
 | Vydra    | `grok-imagine`                   | No                                 | `VYDRA_API_KEY`                                       |
+| xAI      | `grok-imagine-image`             | Yes (up to 5 images)               | `XAI_API_KEY`                                         |

 Use `action: "list"` to inspect available providers and models at runtime:

@@ -115,13 +116,13 @@ Notes:

 ### Image editing

-OpenAI, Google, fal, MiniMax, and ComfyUI support editing reference images. Pass a reference image path or URL:
+OpenAI, Google, fal, MiniMax, ComfyUI, and xAI support editing reference images. Pass a reference image path or URL:

 ```
 "Generate a watercolor version of this photo" + image: "/path/to/photo.jpg"
 ```

-OpenAI and Google support up to 5 reference images via the `images` parameter. fal, MiniMax, and ComfyUI support 1.
+OpenAI, Google, and xAI support up to 5 reference images via the `images` parameter. fal, MiniMax, and ComfyUI support 1.

 ### OpenAI `gpt-image-2`

@@ -166,13 +167,29 @@ MiniMax image generation is available through both bundled MiniMax auth paths:

 ## Provider capabilities

-| Capability            | OpenAI               | Google               | fal                 | MiniMax                    | ComfyUI                            | Vydra   |
-| --------------------- | -------------------- | -------------------- | ------------------- | -------------------------- | ---------------------------------- | ------- |
-| Generate              | Yes (up to 4)        | Yes (up to 4)        | Yes (up to 4)       | Yes (up to 9)              | Yes (workflow-defined outputs)     | Yes (1) |
-| Edit/reference        | Yes (up to 5 images) | Yes (up to 5 images) | Yes (1 image)       | Yes (1 image, subject ref) | Yes (1 image, workflow-configured) | No      |
-| Size control          | Yes (up to 4K)       | Yes                  | Yes                 | No                         | No                                 | No      |
-| Aspect ratio          | No                   | Yes                  | Yes (generate only) | Yes                        | No                                 | No      |
-| Resolution (1K/2K/4K) | No                   | Yes                  | Yes                 | No                         | No                                 | No      |
+| Capability            | OpenAI               | Google               | fal                 | MiniMax                    | ComfyUI                            | Vydra   | xAI                  |
+| --------------------- | -------------------- | -------------------- | ------------------- | -------------------------- | ---------------------------------- | ------- | -------------------- |
+| Generate              | Yes (up to 4)        | Yes (up to 4)        | Yes (up to 4)       | Yes (up to 9)              | Yes (workflow-defined outputs)     | Yes (1) | Yes (up to 4)        |
+| Edit/reference        | Yes (up to 5 images) | Yes (up to 5 images) | Yes (1 image)       | Yes (1 image, subject ref) | Yes (1 image, workflow-configured) | No      | Yes (up to 5 images) |
+| Size control          | Yes (up to 4K)       | Yes                  | Yes                 | No                         | No                                 | No      | No                   |
+| Aspect ratio          | No                   | Yes                  | Yes (generate only) | Yes                        | No                                 | No      | Yes                  |
+| Resolution (1K/2K/4K) | No                   | Yes                  | Yes                 | No                         | No                                 | No      | Yes (1K/2K)          |
+
+### xAI `grok-imagine-image`
+
+The bundled xAI provider uses `/v1/images/generations` for prompt-only requests
+and `/v1/images/edits` when `image` or `images` is present.
+
+- Models: `xai/grok-imagine-image`, `xai/grok-imagine-image-pro`
+- Count: up to 4
+- References: one `image` or up to five `images`
+- Aspect ratios: `1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `2:3`, `3:2`
+- Resolutions: `1K`, `2K`
+- Outputs: returned as OpenClaw-managed image attachments
+
+OpenClaw intentionally does not expose xAI-native `quality`, `mask`, `user`, or
+extra native-only aspect ratios until those controls exist in the shared
+cross-provider `image_generate` contract.

 ## Related

@@ -183,5 +200,6 @@ MiniMax image generation is available through both bundled MiniMax auth paths:
 - [MiniMax](/providers/minimax) — MiniMax image provider setup
 - [OpenAI](/providers/openai) — OpenAI Images provider setup
 - [Vydra](/providers/vydra) — Vydra image, video, and speech setup
+- [xAI](/providers/xai) — Grok image, video, search, code execution, and TTS setup
 - [Configuration Reference](/gateway/configuration-reference#agent-defaults) — `imageGenerationModel` config
 - [Models](/concepts/models) — model configuration and failover
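
The xAI section documented in this diff amounts to a small routing and validation rule: prompt-only requests go to `/v1/images/generations`, requests with references go to `/v1/images/edits`, and count, aspect ratio, and resolution are limited to the listed values. A minimal TypeScript sketch of that rule follows. It is not the code from this commit; the `XaiImageRequest` shape, the function names, and the choice to count `image` and `images` together against the five-reference limit are all assumptions made for illustration.

```typescript
// Hypothetical request shape. Field names are assumptions, not the
// provider's real types.
interface XaiImageRequest {
  prompt: string;
  image?: string;
  images?: string[];
  count?: number;
  aspectRatio?: string;
  resolution?: string;
}

// Limits copied from the documented xAI section.
const ASPECT_RATIOS: string[] = ["1:1", "16:9", "9:16", "4:3", "3:4", "2:3", "3:2"];
const RESOLUTIONS: string[] = ["1K", "2K"];
const MAX_COUNT = 4;
const MAX_REFERENCES = 5;

// Gather every reference image supplied through `image` or `images`.
function collectReferences(req: XaiImageRequest): string[] {
  const single = req.image ? [req.image] : [];
  return [...single, ...(req.images ?? [])];
}

// Prompt-only requests use the generations endpoint; any reference
// image switches the request to the edits endpoint.
function selectXaiEndpoint(req: XaiImageRequest): string {
  return collectReferences(req).length > 0
    ? "/v1/images/edits"
    : "/v1/images/generations";
}

// Return a list of human-readable problems; an empty list means the
// request fits the documented limits.
function validateXaiRequest(req: XaiImageRequest): string[] {
  const errors: string[] = [];
  const count = req.count ?? 1;
  if (!Number.isInteger(count) || count < 1 || count > MAX_COUNT) {
    errors.push(`count must be an integer from 1 to ${MAX_COUNT}`);
  }
  // Assumption: `image` and `images` share one five-reference budget.
  if (collectReferences(req).length > MAX_REFERENCES) {
    errors.push(`at most ${MAX_REFERENCES} reference images are supported`);
  }
  if (req.aspectRatio !== undefined && !ASPECT_RATIOS.includes(req.aspectRatio)) {
    errors.push(`unsupported aspect ratio: ${req.aspectRatio}`);
  }
  if (req.resolution !== undefined && !RESOLUTIONS.includes(req.resolution)) {
    errors.push(`unsupported resolution: ${req.resolution}`);
  }
  return errors;
}
```

Returning a list of problems rather than throwing on the first one lets a caller report every unsupported option in a single message, which may be friendlier when an agent assembles the request.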