mirror of
https://github.com/openclaw/openclaw.git
synced 2026-05-06 17:31:06 +00:00
fix: support transparent OpenAI image generation
This commit is contained in:
@@ -342,8 +342,8 @@ Time format in system prompt. Default: `auto` (OS preference).
|
||||
- Also used as fallback routing when the selected/default model cannot accept image input.
|
||||
- `imageGenerationModel`: accepts either a string (`"provider/model"`) or an object (`{ primary, fallbacks }`).
|
||||
- Used by the shared image-generation capability and any future tool/plugin surface that generates images.
|
||||
- Typical values: `google/gemini-3.1-flash-image-preview` for native Gemini image generation, `fal/fal-ai/flux/dev` for fal, or `openai/gpt-image-2` for OpenAI Images.
|
||||
- If you select a provider/model directly, configure matching provider auth too (for example `GEMINI_API_KEY` or `GOOGLE_API_KEY` for `google/*`, `OPENAI_API_KEY` or OpenAI Codex OAuth for `openai/gpt-image-2`, `FAL_KEY` for `fal/*`).
|
||||
- Typical values: `google/gemini-3.1-flash-image-preview` for native Gemini image generation, `fal/fal-ai/flux/dev` for fal, `openai/gpt-image-2` for OpenAI Images, or `openai/gpt-image-1.5` for transparent-background OpenAI PNG/WebP output.
|
||||
- If you select a provider/model directly, configure matching provider auth too (for example `GEMINI_API_KEY` or `GOOGLE_API_KEY` for `google/*`, `OPENAI_API_KEY` or OpenAI Codex OAuth for `openai/gpt-image-2` / `openai/gpt-image-1.5`, `FAL_KEY` for `fal/*`).
|
||||
- If omitted, `image_generate` can still infer an auth-backed provider default. It tries the current default provider first, then the remaining registered image-generation providers in provider-id order.
|
||||
- `musicGenerationModel`: accepts either a string (`"provider/model"`) or an object (`{ primary, fallbacks }`).
|
||||
- Used by the shared music-generation capability and the built-in `music_generate` tool.
|
||||
|
||||
@@ -27,6 +27,7 @@ changing config.
|
||||
| GPT-5.5 with ChatGPT/Codex subscription auth | `openai-codex/gpt-5.5` | Default PI route for Codex OAuth. Best first choice for subscription setups. |
|
||||
| GPT-5.5 with native Codex app-server behavior | `openai/gpt-5.5` plus `embeddedHarness.runtime: "codex"` | Forces the Codex app-server harness for that model ref. |
|
||||
| Image generation or editing | `openai/gpt-image-2` | Works with either `OPENAI_API_KEY` or OpenAI Codex OAuth. |
|
||||
| Transparent-background images | `openai/gpt-image-1.5` | Use `outputFormat=png` or `webp` and `openai.background=transparent`. |
|
||||
|
||||
<Note>
|
||||
GPT-5.5 is available through both direct OpenAI Platform API-key access and
|
||||
@@ -254,8 +255,17 @@ See [Image Generation](/tools/image-generation) for shared tool parameters, prov
|
||||
</Note>
|
||||
|
||||
`gpt-image-2` is the default for both OpenAI text-to-image generation and image
|
||||
editing. `gpt-image-1` remains usable as an explicit model override, but new
|
||||
OpenAI image workflows should use `openai/gpt-image-2`.
|
||||
editing. `gpt-image-1.5`, `gpt-image-1`, and `gpt-image-1-mini` remain usable as
|
||||
explicit model overrides. Use `openai/gpt-image-1.5` for transparent-background
|
||||
PNG/WebP output; the current `gpt-image-2` API rejects
|
||||
`background: "transparent"`.
|
||||
|
||||
For a transparent-background request, agents should call `image_generate` with
|
||||
`model: "openai/gpt-image-1.5"`, `outputFormat: "png"` or `"webp"`, and
|
||||
`openai.background: "transparent"`. OpenClaw also protects the public OpenAI and
|
||||
OpenAI Codex OAuth routes by rewriting default `openai/gpt-image-2` transparent
|
||||
requests to `gpt-image-1.5`; Azure and custom OpenAI-compatible endpoints keep
|
||||
their configured deployment/model names.
|
||||
|
||||
For Codex OAuth installs, keep the same `openai/gpt-image-2` ref. When an
|
||||
`openai-codex` OAuth profile is configured, OpenClaw resolves that stored OAuth
|
||||
@@ -275,6 +285,12 @@ Generate:
|
||||
/tool image_generate model=openai/gpt-image-2 prompt="A polished launch poster for OpenClaw on macOS" size=3840x2160 count=1
|
||||
```
|
||||
|
||||
Generate a transparent PNG:
|
||||
|
||||
```
|
||||
/tool image_generate model=openai/gpt-image-1.5 prompt="A simple red circle sticker on a transparent background" outputFormat=png openai='{"background":"transparent"}'
|
||||
```
|
||||
|
||||
Edit:
|
||||
|
||||
```
|
||||
|
||||
@@ -48,13 +48,14 @@ The agent calls `image_generate` automatically. No tool allow-listing needed —
|
||||
|
||||
## Common routes
|
||||
|
||||
| Goal | Model ref | Auth |
|
||||
| ---------------------------------------------------- | -------------------------------------------------- | ------------------------------------ |
|
||||
| OpenAI image generation with API billing | `openai/gpt-image-2` | `OPENAI_API_KEY` |
|
||||
| OpenAI image generation with Codex subscription auth | `openai/gpt-image-2` | OpenAI Codex OAuth |
|
||||
| OpenRouter image generation | `openrouter/google/gemini-3.1-flash-image-preview` | `OPENROUTER_API_KEY` |
|
||||
| LiteLLM image generation | `litellm/gpt-image-2` | `LITELLM_API_KEY` |
|
||||
| Google Gemini image generation | `google/gemini-3.1-flash-image-preview` | `GEMINI_API_KEY` or `GOOGLE_API_KEY` |
|
||||
| Goal | Model ref | Auth |
|
||||
| ---------------------------------------------------- | -------------------------------------------------- | -------------------------------------- |
|
||||
| OpenAI image generation with API billing | `openai/gpt-image-2` | `OPENAI_API_KEY` |
|
||||
| OpenAI image generation with Codex subscription auth | `openai/gpt-image-2` | OpenAI Codex OAuth |
|
||||
| OpenAI transparent-background PNG/WebP | `openai/gpt-image-1.5` | `OPENAI_API_KEY` or OpenAI Codex OAuth |
|
||||
| OpenRouter image generation | `openrouter/google/gemini-3.1-flash-image-preview` | `OPENROUTER_API_KEY` |
|
||||
| LiteLLM image generation | `litellm/gpt-image-2` | `LITELLM_API_KEY` |
|
||||
| Google Gemini image generation | `google/gemini-3.1-flash-image-preview` | `GEMINI_API_KEY` or `GOOGLE_API_KEY` |
|
||||
|
||||
The same `image_generate` tool handles text-to-image and reference-image
|
||||
editing. Use `image` for one reference or `images` for multiple references.
|
||||
@@ -93,7 +94,8 @@ Use `"list"` to inspect available providers and models at runtime.
|
||||
</ParamField>
|
||||
|
||||
<ParamField path="model" type="string">
|
||||
Provider/model override, e.g. `openai/gpt-image-2`.
|
||||
Provider/model override, e.g. `openai/gpt-image-2`; use
|
||||
`openai/gpt-image-1.5` for transparent OpenAI backgrounds.
|
||||
</ParamField>
|
||||
|
||||
<ParamField path="image" type="string">
|
||||
@@ -233,9 +235,10 @@ through the Codex Responses backend. Legacy Codex base URLs such as
|
||||
`https://chatgpt.com/backend-api/codex` for image requests. It does not
|
||||
silently fall back to `OPENAI_API_KEY` for that request. To force direct OpenAI
|
||||
Images API routing, configure `models.providers.openai` explicitly with an API
|
||||
key, custom base URL, or Azure endpoint. The older
|
||||
`openai/gpt-image-1` model can still be selected explicitly, but new OpenAI
|
||||
image-generation and image-editing requests should use `gpt-image-2`.
|
||||
key, custom base URL, or Azure endpoint. The `openai/gpt-image-1.5`,
|
||||
`openai/gpt-image-1`, and `openai/gpt-image-1-mini` models can still be
|
||||
selected explicitly. Use `gpt-image-1.5` for transparent-background PNG/WebP
|
||||
output; the current `gpt-image-2` API rejects `background: "transparent"`.
|
||||
|
||||
`gpt-image-2` supports both text-to-image generation and reference-image
|
||||
editing through the same `image_generate` tool. OpenClaw forwards `prompt`,
|
||||
@@ -260,8 +263,31 @@ OpenAI-specific options live under the `openai` object:
|
||||
```
|
||||
|
||||
`openai.background` accepts `transparent`, `opaque`, or `auto`; transparent
|
||||
outputs require `outputFormat` `png` or `webp`. `openai.outputCompression`
|
||||
applies to JPEG/WebP outputs.
|
||||
outputs require `outputFormat` `png` or `webp` and a transparency-capable OpenAI
|
||||
image model. OpenClaw routes default `gpt-image-2` transparent-background
|
||||
requests to `gpt-image-1.5`. `openai.outputCompression` applies to JPEG/WebP
|
||||
outputs.
|
||||
|
||||
When asking an agent for a transparent-background OpenAI image, the expected
|
||||
tool call is:
|
||||
|
||||
```json
|
||||
{
|
||||
"model": "openai/gpt-image-1.5",
|
||||
"prompt": "A simple red circle sticker on a transparent background",
|
||||
"outputFormat": "png",
|
||||
"openai": {
|
||||
"background": "transparent"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
The explicit `openai/gpt-image-1.5` model keeps the request portable across
|
||||
tool summaries and harnesses. If the agent instead uses the default
|
||||
`openai/gpt-image-2` with `openai.background: "transparent"` on the public
|
||||
OpenAI or OpenAI Codex OAuth route, OpenClaw rewrites the provider request to
|
||||
`gpt-image-1.5`. Azure and custom OpenAI-compatible endpoints keep their
|
||||
configured deployment/model names.
|
||||
|
||||
Generate one 4K landscape image:
|
||||
|
||||
@@ -269,6 +295,12 @@ Generate one 4K landscape image:
|
||||
/tool image_generate action=generate model=openai/gpt-image-2 prompt="A clean editorial poster for OpenClaw image generation" size=3840x2160 count=1
|
||||
```
|
||||
|
||||
Generate a transparent PNG:
|
||||
|
||||
```
|
||||
/tool image_generate action=generate model=openai/gpt-image-1.5 prompt="A simple red circle sticker on a transparent background" outputFormat=png openai='{"background":"transparent"}'
|
||||
```
|
||||
|
||||
Generate two square images:
|
||||
|
||||
```
|
||||
|
||||
Reference in New Issue
Block a user