docs(image-generation): rewrite around Steps, Tabs, and A-Z providers

The image-generation page was 395 lines with a 3-step quick-start written as
plain numbered prose, a sprawling 'OpenAI gpt-image-2' section that mixed
routing/legacy/OpenAI options with five inline slash-command examples, and
provider tables that mixed alphabetic and recency order.

Restructure for scan-first reading without losing technical content:

- Wrap Quick start in a Steps component (auth -> default model -> ask the
  agent), pulling the Codex OAuth note inline with the model step where it
  belongs and surfacing the LAN/SSRF caveat as a Warning callout.
- Alphabetize the Supported providers table (ComfyUI, fal, Google, LiteLLM,
  MiniMax, OpenAI, OpenRouter, Vydra, xAI) and the Provider capabilities
  table (same order across both). Convert the Yes/No capability table to
  checkmarks plus exact counts for readability.
- Replace the long inline OpenAI / OpenRouter / MiniMax / xAI prose with a
  'Provider deep dives' AccordionGroup so each backend's routing, legacy URL
  handling, and provider-specific knobs collapse by default.
- Move the four provider-selection-order notes into a small AccordionGroup
  ('Per-call overrides are exact', 'Auto-detection is auth-aware',
  'Timeouts', 'Inspect at runtime').
- Collapse the five flat slash-command examples into a single Tabs component
  (4K landscape / transparent PNG / two-square / edit-one-ref /
  edit-multi-ref) with the matching CLI variant inline on the
  transparent-PNG tab.
- Sentence-case the Related list (Tools overview, Configuration reference)
  and drop the redundant generic introductory wording.
- Add sidebarTitle so the nav reads 'Image generation' explicitly.

Wording, schema fields, defaults, model refs, env vars, and the detailed
OpenAI/OpenRouter/Codex routing rules are unchanged.
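
The "same order across both" claim can be spot-checked with a small script. This is only a sketch: the two lists are copied by hand from the commit message and from the new capabilities table header in the diff, and the helper names are made up for illustration. It assumes "alphabetized" means case-insensitive ordering, since `fal` and `xAI` start in lowercase.

```python
# Provider names as listed in the commit message for the Supported providers table.
SUPPORTED = [
    "ComfyUI", "fal", "Google", "LiteLLM", "MiniMax",
    "OpenAI", "OpenRouter", "Vydra", "xAI",
]

# Column order of the new Provider capabilities table in the diff.
# That table has no LiteLLM or OpenRouter column.
CAPABILITIES = ["ComfyUI", "fal", "Google", "MiniMax", "OpenAI", "Vydra", "xAI"]


def is_alphabetized(names):
    """True when names are already in case-insensitive alphabetical order."""
    return names == sorted(names, key=str.casefold)


def is_subsequence(short, long):
    """True when every item of `short` appears in `long` in the same relative order."""
    remaining = iter(long)
    return all(name in remaining for name in short)


if __name__ == "__main__":
    print(is_alphabetized(SUPPORTED))
    print(is_alphabetized(CAPABILITIES))
    print(is_subsequence(CAPABILITIES, SUPPORTED))
```

The subsequence check matters because the capabilities table covers fewer providers than the Supported providers table, so "same order" here means the shared providers keep the same relative order.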
2026-05-06 10:10:45 +00:00 · 2026-04-25 22:23:00 -07:00
parent 5d3168c343
commit f0ea901a0d
1 changed files with 271 additions and 258 deletions
--- a/docs/tools/image-generation.md
+++ b/docs/tools/image-generation.md
@@ -1,50 +1,68 @@
 ---
-summary: "Generate and edit images using configured providers (OpenAI, OpenAI Codex OAuth, Google Gemini, OpenRouter, LiteLLM, fal, MiniMax, ComfyUI, Vydra, xAI)"
+summary: "Generate and edit images via image_generate across OpenAI, Google, fal, MiniMax, ComfyUI, OpenRouter, LiteLLM, xAI, Vydra"
 read_when:
-  - Generating images via the agent
-  - Configuring image generation providers and models
+  - Generating or editing images via the agent
+  - Configuring image-generation providers and models
  - Understanding the image_generate tool parameters
 title: "Image generation"
+sidebarTitle: "Image generation"
 ---

-The `image_generate` tool lets the agent create and edit images using your configured providers. Generated images are delivered automatically as media attachments in the agent's reply.
+The `image_generate` tool lets the agent create and edit images using your
+configured providers. Generated images are delivered automatically as media
+attachments in the agent's reply.

 <Note>
-The tool only appears when at least one image generation provider is available. If you don't see `image_generate` in your agent's tools, configure `agents.defaults.imageGenerationModel`, set up a provider API key, or sign in with OpenAI Codex OAuth.
+The tool only appears when at least one image-generation provider is
+available. If you do not see `image_generate` in your agent's tools,
+configure `agents.defaults.imageGenerationModel`, set up a provider API key,
+or sign in with OpenAI Codex OAuth.
 </Note>

 ## Quick start

-1. Set an API key for at least one provider (for example `OPENAI_API_KEY`, `GEMINI_API_KEY`, or `OPENROUTER_API_KEY`) or sign in with OpenAI Codex OAuth.
-2. Optionally set your preferred model:
-
-```json5
-{
-  agents: {
-    defaults: {
-      imageGenerationModel: {
-        primary: "openai/gpt-image-2",
-        // Optional default provider request timeout for image_generate.
-        timeoutMs: 180_000,
+<Steps>
+  <Step title="Configure auth">
+    Set an API key for at least one provider (for example `OPENAI_API_KEY`,
+    `GEMINI_API_KEY`, `OPENROUTER_API_KEY`) or sign in with OpenAI Codex OAuth.
+  </Step>
+  <Step title="Pick a default model (optional)">
+    ```json5
+    {
+      agents: {
+        defaults: {
+          imageGenerationModel: {
+            primary: "openai/gpt-image-2",
+            timeoutMs: 180_000,
+          },
+        },
      },
-    },
-  },
-}
-```
+    }
+    ```

-Codex OAuth uses the same `openai/gpt-image-2` model ref. When an
-`openai-codex` OAuth profile is configured, OpenClaw routes image requests
-through that same OAuth profile instead of first trying `OPENAI_API_KEY`.
-Explicit custom `models.providers.openai` image config, such as an API key or
-custom/Azure base URL, opts back into the direct OpenAI Images API route.
+    Codex OAuth uses the same `openai/gpt-image-2` model ref. When an
+    `openai-codex` OAuth profile is configured, OpenClaw routes image
+    requests through that OAuth profile instead of first trying
+    `OPENAI_API_KEY`. Explicit `models.providers.openai` config (API key,
+    custom/Azure base URL) opts back into the direct OpenAI Images API
+    route.
+
+  </Step>
+  <Step title="Ask the agent">
+    _"Generate an image of a friendly robot mascot."_
+
+    The agent calls `image_generate` automatically. No tool allow-listing
+    needed — it is enabled by default when a provider is available.
+
+  </Step>
+</Steps>
+
+<Warning>
 For OpenAI-compatible LAN endpoints such as LocalAI, keep the custom
 `models.providers.openai.baseUrl` and explicitly opt in with
-`browser.ssrfPolicy.dangerouslyAllowPrivateNetwork: true`; private/internal
-image endpoints remain blocked by default.
-
-3. Ask the agent: _"Generate an image of a friendly robot mascot."_
-
-The agent calls `image_generate` automatically. No tool allow-listing needed — it's enabled by default when a provider is available.
+`browser.ssrfPolicy.dangerouslyAllowPrivateNetwork: true`. Private and
+internal image endpoints remain blocked by default.
+</Warning>

 ## Common routes

@@ -61,97 +79,91 @@ The same `image_generate` tool handles text-to-image and reference-image
 editing. Use `image` for one reference or `images` for multiple references.
 Provider-supported output hints such as `quality`, `outputFormat`, and
 `background` are forwarded when available and reported as ignored when a
-provider does not support them. Current bundled transparent-background support
-is OpenAI-specific; other providers may still preserve PNG alpha if their
+provider does not support them. Bundled transparent-background support is
+OpenAI-specific; other providers may still preserve PNG alpha if their
 backend emits it.

 ## Supported providers

 | Provider   | Default model                           | Edit support                       | Auth                                                  |
 | ---------- | --------------------------------------- | ---------------------------------- | ----------------------------------------------------- |
+| ComfyUI    | `workflow`                              | Yes (1 image, workflow-configured) | `COMFY_API_KEY` or `COMFY_CLOUD_API_KEY` for cloud    |
+| fal        | `fal-ai/flux/dev`                       | Yes                                | `FAL_KEY`                                             |
+| Google     | `gemini-3.1-flash-image-preview`        | Yes                                | `GEMINI_API_KEY` or `GOOGLE_API_KEY`                  |
+| LiteLLM    | `gpt-image-2`                           | Yes (up to 5 input images)         | `LITELLM_API_KEY`                                     |
+| MiniMax    | `image-01`                              | Yes (subject reference)            | `MINIMAX_API_KEY` or MiniMax OAuth (`minimax-portal`) |
 | OpenAI     | `gpt-image-2`                           | Yes (up to 4 images)               | `OPENAI_API_KEY` or OpenAI Codex OAuth                |
 | OpenRouter | `google/gemini-3.1-flash-image-preview` | Yes (up to 5 input images)         | `OPENROUTER_API_KEY`                                  |
-| LiteLLM    | `gpt-image-2`                           | Yes (up to 5 input images)         | `LITELLM_API_KEY`                                     |
-| Google     | `gemini-3.1-flash-image-preview`        | Yes                                | `GEMINI_API_KEY` or `GOOGLE_API_KEY`                  |
-| fal        | `fal-ai/flux/dev`                       | Yes                                | `FAL_KEY`                                             |
-| MiniMax    | `image-01`                              | Yes (subject reference)            | `MINIMAX_API_KEY` or MiniMax OAuth (`minimax-portal`) |
-| ComfyUI    | `workflow`                              | Yes (1 image, workflow-configured) | `COMFY_API_KEY` or `COMFY_CLOUD_API_KEY` for cloud    |
 | Vydra      | `grok-imagine`                          | No                                 | `VYDRA_API_KEY`                                       |
 | xAI        | `grok-imagine-image`                    | Yes (up to 5 images)               | `XAI_API_KEY`                                         |

 Use `action: "list"` to inspect available providers and models at runtime:

-```
+```text
 /tool image_generate action=list
 ```

+## Provider capabilities
+
+| Capability            | ComfyUI            | fal               | Google         | MiniMax               | OpenAI         | Vydra | xAI            |
+| --------------------- | ------------------ | ----------------- | -------------- | --------------------- | -------------- | ----- | -------------- |
+| Generate (max count)  | Workflow-defined   | 4                 | 4              | 9                     | 4              | 1     | 4              |
+| Edit / reference      | 1 image (workflow) | 1 image           | Up to 5 images | 1 image (subject ref) | Up to 5 images | —     | Up to 5 images |
+| Size control          | —                  | ✓                 | ✓              | —                     | Up to 4K       | —     | —              |
+| Aspect ratio          | —                  | ✓ (generate only) | ✓              | ✓                     | —              | —     | ✓              |
+| Resolution (1K/2K/4K) | —                  | ✓                 | ✓              | —                     | —              | —     | 1K, 2K         |
+
 ## Tool parameters

 <ParamField path="prompt" type="string" required>
-Image generation prompt. Required for `action: "generate"`.
+  Image generation prompt. Required for `action: "generate"`.
 </ParamField>
-
-<ParamField path="action" type="'generate' | 'list'" default="generate">
-Use `"list"` to inspect available providers and models at runtime.
+<ParamField path="action" type='"generate" | "list"' default="generate">
+  Use `"list"` to inspect available providers and models at runtime.
 </ParamField>
-
 <ParamField path="model" type="string">
-Provider/model override, e.g. `openai/gpt-image-2`; use
-`openai/gpt-image-1.5` for transparent OpenAI backgrounds.
+  Provider/model override (e.g. `openai/gpt-image-2`). Use
+  `openai/gpt-image-1.5` for transparent OpenAI backgrounds.
 </ParamField>
-
 <ParamField path="image" type="string">
-Single reference image path or URL for edit mode.
+  Single reference image path or URL for edit mode.
 </ParamField>
-
 <ParamField path="images" type="string[]">
-Multiple reference images for edit mode (up to 5).
+  Multiple reference images for edit mode (up to 5 on supporting providers).
 </ParamField>
-
 <ParamField path="size" type="string">
-Size hint: `1024x1024`, `1536x1024`, `1024x1536`, `2048x2048`, `3840x2160`.
+  Size hint: `1024x1024`, `1536x1024`, `1024x1536`, `2048x2048`, `3840x2160`.
 </ParamField>
-
 <ParamField path="aspectRatio" type="string">
-Aspect ratio: `1:1`, `2:3`, `3:2`, `3:4`, `4:3`, `4:5`, `5:4`, `9:16`, `16:9`, `21:9`.
+  Aspect ratio: `1:1`, `2:3`, `3:2`, `3:4`, `4:3`, `4:5`, `5:4`, `9:16`, `16:9`, `21:9`.
 </ParamField>
-
-<ParamField path="resolution" type="'1K' | '2K' | '4K'">
-Resolution hint.
+<ParamField path="resolution" type='"1K" | "2K" | "4K"'>Resolution hint.</ParamField>
+<ParamField path="quality" type='"low" | "medium" | "high" | "auto"'>
+  Quality hint when the provider supports it.
 </ParamField>
-
-<ParamField path="quality" type="'low' | 'medium' | 'high' | 'auto'">
-Quality hint when the provider supports it.
+<ParamField path="outputFormat" type='"png" | "jpeg" | "webp"'>
+  Output format hint when the provider supports it.
 </ParamField>
-
-<ParamField path="outputFormat" type="'png' | 'jpeg' | 'webp'">
-Output format hint when the provider supports it.
+<ParamField path="background" type='"transparent" | "opaque" | "auto"'>
+  Background hint when the provider supports it. Use `transparent` with
+  `outputFormat: "png"` or `"webp"` for transparency-capable providers.
 </ParamField>
-
-<ParamField path="background" type="'transparent' | 'opaque' | 'auto'">
-Background hint when the provider supports it. Use `transparent` with
-`outputFormat: "png"` or `"webp"` for transparency-capable providers.
-</ParamField>
-
-<ParamField path="count" type="number">
-Number of images to generate (1–4).
-</ParamField>
-
-<ParamField path="timeoutMs" type="number">
-Optional provider request timeout in milliseconds.
-</ParamField>
-
-<ParamField path="filename" type="string">
-Output filename hint.
-</ParamField>
-
+<ParamField path="count" type="number">Number of images to generate (1–4).</ParamField>
+<ParamField path="timeoutMs" type="number">Optional provider request timeout in milliseconds.</ParamField>
+<ParamField path="filename" type="string">Output filename hint.</ParamField>
 <ParamField path="openai" type="object">
-OpenAI-only hints: `background`, `moderation`, `outputCompression`, and `user`.
+  OpenAI-only hints: `background`, `moderation`, `outputCompression`, and `user`.
 </ParamField>

-Not all providers support all parameters. When a fallback provider supports a nearby geometry option instead of the exact requested one, OpenClaw remaps to the closest supported size, aspect ratio, or resolution before submission. Unsupported output hints such as `quality` or `outputFormat` are dropped for providers that do not declare support and are reported in the tool result.
-
-Tool results report the applied settings. When OpenClaw remaps geometry during provider fallback, the returned `size`, `aspectRatio`, and `resolution` values reflect what was actually sent, and `details.normalization` captures the requested-to-applied translation.
+<Note>
+Not all providers support all parameters. When a fallback provider supports a
+nearby geometry option instead of the exact requested one, OpenClaw remaps to
+the closest supported size, aspect ratio, or resolution before submission.
+Unsupported output hints are dropped for providers that do not declare
+support and reported in the tool result. Tool results report the applied
+settings; `details.normalization` captures any requested-to-applied
+translation.
+</Note>

 ## Configuration

@@ -177,129 +189,177 @@ Tool results report the applied settings. When OpenClaw remaps geometry during p

 ### Provider selection order

-When generating an image, OpenClaw tries providers in this order:
+OpenClaw tries providers in this order:

-1. **`model` parameter** from the tool call (if the agent specifies one)
-2. **`imageGenerationModel.primary`** from config
-3. **`imageGenerationModel.fallbacks`** in order
-4. **Auto-detection** — uses auth-backed provider defaults only:
-   - current default provider first
-   - remaining registered image-generation providers in provider-id order
+1. **`model` parameter** from the tool call (if the agent specifies one).
+2. **`imageGenerationModel.primary`** from config.
+3. **`imageGenerationModel.fallbacks`** in order.
+4. **Auto-detection** — auth-backed provider defaults only:
+   - current default provider first;
+   - remaining registered image-generation providers in provider-id order.

-If a provider fails (auth error, rate limit, etc.), the next configured candidate is tried automatically. If all fail, the error includes details from each attempt.
+If a provider fails (auth error, rate limit, etc.), the next configured
+candidate is tried automatically. If all fail, the error includes details
+from each attempt.

-Notes:
-
- A per-call `model` override is exact: OpenClaw tries only that provider/model
-  and does not continue to configured primary/fallback or auto-detected
-  providers.
- Auto-detection is auth-aware. A provider default only enters the candidate list
-  when OpenClaw can actually authenticate that provider.
- Auto-detection is enabled by default. Set
-  `agents.defaults.mediaGenerationAutoProviderFallback: false` if you want image
-  generation to use only the explicit `model`, `primary`, and `fallbacks`
-  entries.
- Set `agents.defaults.imageGenerationModel.timeoutMs` for slow image backends.
-  A per-call `timeoutMs` tool parameter overrides the configured default.
- Use `action: "list"` to inspect the currently registered providers, their
-  default models, and auth env-var hints.
+<AccordionGroup>
+  <Accordion title="Per-call model overrides are exact">
+    A per-call `model` override tries only that provider/model and does
+    not continue to configured primary/fallback or auto-detected providers.
+  </Accordion>
+  <Accordion title="Auto-detection is auth-aware">
+    A provider default only enters the candidate list when OpenClaw can
+    actually authenticate that provider. Set
+    `agents.defaults.mediaGenerationAutoProviderFallback: false` to use only
+    explicit `model`, `primary`, and `fallbacks` entries.
+  </Accordion>
+  <Accordion title="Timeouts">
+    Set `agents.defaults.imageGenerationModel.timeoutMs` for slow image
+    backends. A per-call `timeoutMs` tool parameter overrides the configured
+    default.
+  </Accordion>
+  <Accordion title="Inspect at runtime">
+    Use `action: "list"` to inspect the currently registered providers,
+    their default models, and auth env-var hints.
+  </Accordion>
+</AccordionGroup>

 ### Image editing

-OpenAI, OpenRouter, Google, fal, MiniMax, ComfyUI, and xAI support editing reference images. Pass a reference image path or URL:
+OpenAI, OpenRouter, Google, fal, MiniMax, ComfyUI, and xAI support editing
+reference images. Pass a reference image path or URL:

-```
+```text
 "Generate a watercolor version of this photo" + image: "/path/to/photo.jpg"
 ```

-OpenAI, OpenRouter, Google, and xAI support up to 5 reference images via the `images` parameter. fal, MiniMax, and ComfyUI support 1.
+OpenAI, OpenRouter, Google, and xAI support up to 5 reference images via the
+`images` parameter. fal, MiniMax, and ComfyUI support 1.

-### OpenRouter image models
+## Provider deep dives

-OpenRouter image generation uses the same `OPENROUTER_API_KEY` and routes through OpenRouter's chat completions image API. Select OpenRouter image models with the `openrouter/` prefix:
+<AccordionGroup>
+  <Accordion title="OpenAI gpt-image-2 (and gpt-image-1.5)">
+    OpenAI image generation defaults to `openai/gpt-image-2`. If an
+    `openai-codex` OAuth profile is configured, OpenClaw reuses the same
+    OAuth profile used by Codex subscription chat models and sends the
+    image request through the Codex Responses backend. Legacy Codex base
+    URLs such as `https://chatgpt.com/backend-api` are canonicalized to
+    `https://chatgpt.com/backend-api/codex` for image requests. OpenClaw
+    does **not** silently fall back to `OPENAI_API_KEY` for that request —
+    to force direct OpenAI Images API routing, configure
+    `models.providers.openai` explicitly with an API key, custom base URL,
+    or Azure endpoint.

-```json5
-{
-  agents: {
-    defaults: {
-      imageGenerationModel: {
-        primary: "openrouter/google/gemini-3.1-flash-image-preview",
+    The `openai/gpt-image-1.5`, `openai/gpt-image-1`, and
+    `openai/gpt-image-1-mini` models can still be selected explicitly. Use
+    `gpt-image-1.5` for transparent-background PNG/WebP output; the current
+    `gpt-image-2` API rejects `background: "transparent"`.
+
+    `gpt-image-2` supports both text-to-image generation and
+    reference-image editing through the same `image_generate` tool.
+    OpenClaw forwards `prompt`, `count`, `size`, `quality`, `outputFormat`,
+    and reference images to OpenAI. OpenAI does **not** receive
+    `aspectRatio` or `resolution` directly; when possible OpenClaw maps
+    those into a supported `size`, otherwise the tool reports them as
+    ignored overrides.
+
+    OpenAI-specific options live under the `openai` object:
+
+    ```json
+    {
+      "quality": "low",
+      "outputFormat": "jpeg",
+      "openai": {
+        "background": "opaque",
+        "moderation": "low",
+        "outputCompression": 60,
+        "user": "end-user-42"
+      }
+    }
+    ```
+
+    `openai.background` accepts `transparent`, `opaque`, or `auto`;
+    transparent outputs require `outputFormat` `png` or `webp` and a
+    transparency-capable OpenAI image model. OpenClaw routes default
+    `gpt-image-2` transparent-background requests to `gpt-image-1.5`.
+    `openai.outputCompression` applies to JPEG/WebP outputs.
+
+    The top-level `background` hint is provider-neutral and currently maps
+    to the same OpenAI `background` request field when the OpenAI provider
+    is selected. Providers that do not declare background support return
+    it in `ignoredOverrides` instead of receiving the unsupported parameter.
+
+    To route OpenAI image generation through an Azure OpenAI deployment
+    instead of `api.openai.com`, see
+    [Azure OpenAI endpoints](/providers/openai#azure-openai-endpoints).
+
+  </Accordion>
+  <Accordion title="OpenRouter image models">
+    OpenRouter image generation uses the same `OPENROUTER_API_KEY` and
+    routes through OpenRouter's chat completions image API. Select
+    OpenRouter image models with the `openrouter/` prefix:
+
+    ```json5
+    {
+      agents: {
+        defaults: {
+          imageGenerationModel: {
+            primary: "openrouter/google/gemini-3.1-flash-image-preview",
+          },
+        },
      },
-    },
-  },
-}
+    }
+    ```
+
+    OpenClaw forwards `prompt`, `count`, reference images, and
+    Gemini-compatible `aspectRatio` / `resolution` hints to OpenRouter.
+    Current built-in OpenRouter image model shortcuts include
+    `google/gemini-3.1-flash-image-preview`,
+    `google/gemini-3-pro-image-preview`, and `openai/gpt-5.4-image-2`. Use
+    `action: "list"` to see what your configured plugin exposes.
+
+  </Accordion>
+  <Accordion title="MiniMax dual-auth">
+    MiniMax image generation is available through both bundled MiniMax
+    auth paths:
+
+    - `minimax/image-01` for API-key setups
+    - `minimax-portal/image-01` for OAuth setups
+
+  </Accordion>
+  <Accordion title="xAI grok-imagine-image">
+    The bundled xAI provider uses `/v1/images/generations` for prompt-only
+    requests and `/v1/images/edits` when `image` or `images` is present.
+
+    - Models: `xai/grok-imagine-image`, `xai/grok-imagine-image-pro`
+    - Count: up to 4
+    - References: one `image` or up to five `images`
+    - Aspect ratios: `1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `2:3`, `3:2`
+    - Resolutions: `1K`, `2K`
+    - Outputs: returned as OpenClaw-managed image attachments
+
+    OpenClaw intentionally does not expose xAI-native `quality`, `mask`,
+    `user`, or extra native-only aspect ratios until those controls exist
+    in the shared cross-provider `image_generate` contract.
+
+  </Accordion>
+</AccordionGroup>
+
+## Examples
+
+<Tabs>
+  <Tab title="Generate (4K landscape)">
+```text
+/tool image_generate action=generate model=openai/gpt-image-2 prompt="A clean editorial poster for OpenClaw image generation" size=3840x2160 count=1
+```
+  </Tab>
+  <Tab title="Generate (transparent PNG)">
+```text
+/tool image_generate action=generate model=openai/gpt-image-1.5 prompt="A simple red circle sticker on a transparent background" outputFormat=png background=transparent
 ```

-OpenClaw forwards `prompt`, `count`, reference images, and Gemini-compatible `aspectRatio` / `resolution` hints to OpenRouter. Current built-in OpenRouter image model shortcuts include `google/gemini-3.1-flash-image-preview`, `google/gemini-3-pro-image-preview`, and `openai/gpt-5.4-image-2`; use `action: "list"` to see what your configured plugin exposes.
-
-### OpenAI `gpt-image-2`
-
-OpenAI image generation defaults to `openai/gpt-image-2`. If an
-`openai-codex` OAuth profile is configured, OpenClaw reuses the same OAuth
-profile used by Codex subscription chat models and sends the image request
-through the Codex Responses backend. Legacy Codex base URLs such as
-`https://chatgpt.com/backend-api` are canonicalized to
-`https://chatgpt.com/backend-api/codex` for image requests. It does not
-silently fall back to `OPENAI_API_KEY` for that request. To force direct OpenAI
-Images API routing, configure `models.providers.openai` explicitly with an API
-key, custom base URL, or Azure endpoint. The `openai/gpt-image-1.5`,
-`openai/gpt-image-1`, and `openai/gpt-image-1-mini` models can still be
-selected explicitly. Use `gpt-image-1.5` for transparent-background PNG/WebP
-output; the current `gpt-image-2` API rejects `background: "transparent"`.
-
-`gpt-image-2` supports both text-to-image generation and reference-image
-editing through the same `image_generate` tool. OpenClaw forwards `prompt`,
-`count`, `size`, `quality`, `outputFormat`, and reference images to OpenAI.
-OpenAI does not receive `aspectRatio` or `resolution` directly; when possible
-OpenClaw maps those into a supported `size`, otherwise the tool reports them as
-ignored overrides.
-
-OpenAI-specific options live under the `openai` object:
-
-```json
-{
-  "quality": "low",
-  "outputFormat": "jpeg",
-  "openai": {
-    "background": "opaque",
-    "moderation": "low",
-    "outputCompression": 60,
-    "user": "end-user-42"
-  }
-}
-```
-
-`openai.background` accepts `transparent`, `opaque`, or `auto`; transparent
-outputs require `outputFormat` `png` or `webp` and a transparency-capable OpenAI
-image model. OpenClaw routes default `gpt-image-2` transparent-background
-requests to `gpt-image-1.5`. `openai.outputCompression` applies to JPEG/WebP
-outputs.
-
-The top-level `background` hint is provider-neutral and currently maps to the
-same OpenAI `background` request field when the OpenAI provider is selected.
-Providers that do not declare background support return it in `ignoredOverrides`
-instead of receiving the unsupported parameter.
-
-When asking an agent for a transparent-background OpenAI image, the expected
-tool call is:
-
-```json
-{
-  "model": "openai/gpt-image-1.5",
-  "prompt": "A simple red circle sticker on a transparent background",
-  "outputFormat": "png",
-  "background": "transparent"
-}
-```
-
-The explicit `openai/gpt-image-1.5` model keeps the request portable across
-tool summaries and harnesses. If the agent instead uses the default
-`openai/gpt-image-2` with `openai.background: "transparent"` on the public
-OpenAI or OpenAI Codex OAuth route, OpenClaw rewrites the provider request to
-`gpt-image-1.5`. Azure and custom OpenAI-compatible endpoints keep their
-configured deployment/model names.
-
-For headless CLI generation, use the equivalent `openclaw infer` flags:
+Equivalent CLI:

 ```bash
 openclaw infer image generate \
@@ -310,86 +370,39 @@ openclaw infer image generate \
  --json
 ```

-The same `--output-format` and `--background` flags are available on
-`openclaw infer image edit`; `--openai-background` remains available as an
-OpenAI-specific alias. Current bundled providers other than OpenAI do not
-declare explicit background control, so `background: "transparent"` is reported
-as ignored for them.
-
-Generate one 4K landscape image:
-
-```
-/tool image_generate action=generate model=openai/gpt-image-2 prompt="A clean editorial poster for OpenClaw image generation" size=3840x2160 count=1
-```
-
-Generate a transparent PNG:
-
-```
-/tool image_generate action=generate model=openai/gpt-image-1.5 prompt="A simple red circle sticker on a transparent background" outputFormat=png background=transparent
-```
-
-Generate two square images:
-
-```
+  </Tab>
+  <Tab title="Generate (two square)">
+```text
 /tool image_generate action=generate model=openai/gpt-image-2 prompt="Two visual directions for a calm productivity app icon" size=1024x1024 count=2
 ```
-
-Edit one local reference image:
-
-```
+  </Tab>
+  <Tab title="Edit (one reference)">
+```text
 /tool image_generate action=generate model=openai/gpt-image-2 prompt="Keep the subject, replace the background with a bright studio setup" image=/path/to/reference.png size=1024x1536
 ```
-
-Edit with multiple references:
-
-```
+  </Tab>
+  <Tab title="Edit (multiple references)">
+```text
 /tool image_generate action=generate model=openai/gpt-image-2 prompt="Combine the character identity from the first image with the color palette from the second" images='["/path/to/character.png","/path/to/palette.jpg"]' size=1536x1024
 ```
+  </Tab>
+</Tabs>

-To route OpenAI image generation through an Azure OpenAI deployment instead
-of `api.openai.com`, see [Azure OpenAI endpoints](/providers/openai#azure-openai-endpoints)
-in the OpenAI provider docs.
-
-MiniMax image generation is available through both bundled MiniMax auth paths:
-
- `minimax/image-01` for API-key setups
- `minimax-portal/image-01` for OAuth setups
-
-## Provider capabilities
-
-| Capability            | OpenAI               | Google               | fal                 | MiniMax                    | ComfyUI                            | Vydra   | xAI                  |
-| --------------------- | -------------------- | -------------------- | ------------------- | -------------------------- | ---------------------------------- | ------- | -------------------- |
-| Generate              | Yes (up to 4)        | Yes (up to 4)        | Yes (up to 4)       | Yes (up to 9)              | Yes (workflow-defined outputs)     | Yes (1) | Yes (up to 4)        |
-| Edit/reference        | Yes (up to 5 images) | Yes (up to 5 images) | Yes (1 image)       | Yes (1 image, subject ref) | Yes (1 image, workflow-configured) | No      | Yes (up to 5 images) |
-| Size control          | Yes (up to 4K)       | Yes                  | Yes                 | No                         | No                                 | No      | No                   |
-| Aspect ratio          | No                   | Yes                  | Yes (generate only) | Yes                        | No                                 | No      | Yes                  |
-| Resolution (1K/2K/4K) | No                   | Yes                  | Yes                 | No                         | No                                 | No      | Yes (1K/2K)          |
-
-### xAI `grok-imagine-image`
-
-The bundled xAI provider uses `/v1/images/generations` for prompt-only requests
-and `/v1/images/edits` when `image` or `images` is present.
-
- Models: `xai/grok-imagine-image`, `xai/grok-imagine-image-pro`
- Count: up to 4
- References: one `image` or up to five `images`
- Aspect ratios: `1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `2:3`, `3:2`
- Resolutions: `1K`, `2K`
- Outputs: returned as OpenClaw-managed image attachments
-
-OpenClaw intentionally does not expose xAI-native `quality`, `mask`, `user`, or
-extra native-only aspect ratios until those controls exist in the shared
-cross-provider `image_generate` contract.
+The same `--output-format` and `--background` flags are available on
+`openclaw infer image edit`; `--openai-background` remains as an
+OpenAI-specific alias. Bundled providers other than OpenAI do not declare
+explicit background control today, so `background: "transparent"` is reported
+as ignored for them.

 ## Related

- [Tools Overview](/tools) — all available agent tools
- [fal](/providers/fal) — fal image and video provider setup
+- [Tools overview](/tools) — all available agent tools
 - [ComfyUI](/providers/comfy) — local ComfyUI and Comfy Cloud workflow setup
+- [fal](/providers/fal) — fal image and video provider setup
 - [Google (Gemini)](/providers/google) — Gemini image provider setup
 - [MiniMax](/providers/minimax) — MiniMax image provider setup
 - [OpenAI](/providers/openai) — OpenAI Images provider setup
 - [Vydra](/providers/vydra) — Vydra image, video, and speech setup
 - [xAI](/providers/xai) — Grok image, video, search, code execution, and TTS setup
- [Configuration Reference](/gateway/config-agents#agent-defaults) — `imageGenerationModel` config
+- [Configuration reference](/gateway/config-agents#agent-defaults) — `imageGenerationModel` config
 - [Models](/concepts/models) — model configuration and failover
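
Reviewer sketch (not part of the commit): the provider selection order documented in this page can be modelled roughly as below. This is a minimal illustration and not OpenClaw's implementation. The function name, its parameters, and the `can_authenticate` callback are hypothetical. It also assumes that "provider-id order" means sorted by provider id and that duplicate candidates are skipped; the page states neither.

```python
from typing import Callable, Optional


def candidate_order(
    call_model: Optional[str],
    primary: Optional[str],
    fallbacks: list[str],
    default_models: dict[str, str],
    default_provider: Optional[str],
    can_authenticate: Callable[[str], bool],
    auto_fallback: bool = True,
) -> list[str]:
    """Return provider/model refs in the order they would be tried (hypothetical helper)."""
    # A per-call model override is exact: nothing else is tried.
    if call_model:
        return [call_model]

    candidates: list[str] = []

    def add(ref: Optional[str]) -> None:
        # Assumption: skip empty refs and duplicates.
        if ref and ref not in candidates:
            candidates.append(ref)

    # Configured primary first, then fallbacks in their configured order.
    add(primary)
    for ref in fallbacks:
        add(ref)

    # Auto-detection: current default provider first, then the rest by provider id,
    # and only providers that can actually authenticate.
    if auto_fallback:
        provider_ids = sorted(default_models)  # assumption: "provider-id order" == sorted
        if default_provider in default_models:
            provider_ids.remove(default_provider)
            provider_ids.insert(0, default_provider)
        for provider_id in provider_ids:
            if can_authenticate(provider_id):
                add(f"{provider_id}/{default_models[provider_id]}")

    return candidates
```

With default models for `openai`, `fal`, and `xai` but credentials only for `openai` and `xai`, a configured primary of `openai/gpt-image-2` would be followed by `xai/grok-imagine-image`. Setting `auto_fallback=False` mirrors `agents.defaults.mediaGenerationAutoProviderFallback: false` and leaves only the explicit entries.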