mirror of
https://github.com/openclaw/openclaw.git
synced 2026-05-06 10:20:42 +00:00
feat(openai): default images to gpt-image-2
This commit is contained in:
@@ -240,7 +240,7 @@ Infer commands normalize JSON output under a shared envelope:
|
||||
"capability": "image.generate",
|
||||
"transport": "local",
|
||||
"provider": "openai",
|
||||
"model": "gpt-image-1",
|
||||
"model": "gpt-image-2",
|
||||
"attempts": [],
|
||||
"outputs": []
|
||||
}
|
||||
|
||||
@@ -173,7 +173,7 @@ Current bundled examples:
|
||||
normalization (`input` / `output` and `prompt` / `completion` families), the
|
||||
shared `openai-responses-defaults` stream family for native OpenAI/Codex
|
||||
wrappers, provider-family metadata, bundled image-generation provider
|
||||
registration for `gpt-image-1`, and bundled video-generation provider
|
||||
registration for `gpt-image-2`, and bundled video-generation provider
|
||||
registration for `sora-2`
|
||||
- `google` and `google-gemini-cli`: Gemini 3.1 forward-compat fallback,
|
||||
native Gemini replay validation, bootstrap replay sanitation, tagged
|
||||
|
||||
@@ -1177,7 +1177,7 @@ Time format in system prompt. Default: `auto` (OS preference).
|
||||
fallbacks: ["openrouter/google/gemini-2.0-flash-vision:free"],
|
||||
},
|
||||
imageGenerationModel: {
|
||||
primary: "openai/gpt-image-1",
|
||||
primary: "openai/gpt-image-2",
|
||||
fallbacks: ["google/gemini-3.1-flash-image-preview"],
|
||||
},
|
||||
videoGenerationModel: {
|
||||
@@ -1215,7 +1215,7 @@ Time format in system prompt. Default: `auto` (OS preference).
|
||||
- Also used as fallback routing when the selected/default model cannot accept image input.
|
||||
- `imageGenerationModel`: accepts either a string (`"provider/model"`) or an object (`{ primary, fallbacks }`).
|
||||
- Used by the shared image-generation capability and any future tool/plugin surface that generates images.
|
||||
- Typical values: `google/gemini-3.1-flash-image-preview` for native Gemini image generation, `fal/fal-ai/flux/dev` for fal, or `openai/gpt-image-1` for OpenAI Images.
|
||||
- Typical values: `google/gemini-3.1-flash-image-preview` for native Gemini image generation, `fal/fal-ai/flux/dev` for fal, or `openai/gpt-image-2` for OpenAI Images.
|
||||
- If you select a provider/model directly, configure the matching provider auth/API key too (for example `GEMINI_API_KEY` or `GOOGLE_API_KEY` for `google/*`, `OPENAI_API_KEY` for `openai/*`, `FAL_KEY` for `fal/*`).
|
||||
- If omitted, `image_generate` can still infer an auth-backed provider default. It tries the current default provider first, then the remaining registered image-generation providers in provider-id order.
|
||||
- `musicGenerationModel`: accepts either a string (`"provider/model"`) or an object (`{ primary, fallbacks }`).
|
||||
|
||||
@@ -779,7 +779,7 @@ If you want to rely on env keys (e.g. exported in your `~/.profile`), run local
|
||||
- `google`
|
||||
- Optional narrowing:
|
||||
- `OPENCLAW_LIVE_IMAGE_GENERATION_PROVIDERS="openai,google"`
|
||||
- `OPENCLAW_LIVE_IMAGE_GENERATION_MODELS="openai/gpt-image-1,google/gemini-3.1-flash-image-preview"`
|
||||
- `OPENCLAW_LIVE_IMAGE_GENERATION_MODELS="openai/gpt-image-2,google/gemini-3.1-flash-image-preview"`
|
||||
- `OPENCLAW_LIVE_IMAGE_GENERATION_CASES="google:flash-generate,google:pro-edit"`
|
||||
- Optional auth behavior:
|
||||
- `OPENCLAW_LIVE_REQUIRE_PROFILE_KEYS=1` to force profile-store auth and ignore env-only overrides
|
||||
|
||||
@@ -158,17 +158,17 @@ The bundled `openai` plugin registers image generation through the `image_genera
|
||||
|
||||
| Capability | Value |
|
||||
| ------------------------- | ---------------------------------- |
|
||||
| Default model | `openai/gpt-image-1` |
|
||||
| Default model | `openai/gpt-image-2` |
|
||||
| Max images per request | 4 |
|
||||
| Edit mode | Enabled (up to 5 reference images) |
|
||||
| Size overrides | Supported |
|
||||
| Size overrides | Supported, including 2K/4K sizes |
|
||||
| Aspect ratio / resolution | Not forwarded to OpenAI Images API |
|
||||
|
||||
```json5
|
||||
{
|
||||
agents: {
|
||||
defaults: {
|
||||
imageGenerationModel: { primary: "openai/gpt-image-1" },
|
||||
imageGenerationModel: { primary: "openai/gpt-image-2" },
|
||||
},
|
||||
},
|
||||
}
|
||||
@@ -178,6 +178,22 @@ The bundled `openai` plugin registers image generation through the `image_genera
|
||||
See [Image Generation](/tools/image-generation) for shared tool parameters, provider selection, and failover behavior.
|
||||
</Note>
|
||||
|
||||
`gpt-image-2` is the default for both OpenAI text-to-image generation and image
|
||||
editing. `gpt-image-1` remains usable as an explicit model override, but new
|
||||
OpenAI image workflows should use `openai/gpt-image-2`.
|
||||
|
||||
Generate:
|
||||
|
||||
```
|
||||
/tool image_generate model=openai/gpt-image-2 prompt="A polished launch poster for OpenClaw on macOS" size=3840x2160 count=1
|
||||
```
|
||||
|
||||
Edit:
|
||||
|
||||
```
|
||||
/tool image_generate model=openai/gpt-image-2 prompt="Preserve the object shape, change the material to translucent glass" image=/path/to/reference.png size=1024x1536
|
||||
```
|
||||
|
||||
## Video generation
|
||||
|
||||
The bundled `openai` plugin registers video generation through the `video_generate` tool.
|
||||
|
||||
@@ -25,7 +25,7 @@ The tool only appears when at least one image generation provider is available.
|
||||
agents: {
|
||||
defaults: {
|
||||
imageGenerationModel: {
|
||||
primary: "openai/gpt-image-1",
|
||||
primary: "openai/gpt-image-2",
|
||||
},
|
||||
},
|
||||
},
|
||||
@@ -40,7 +40,7 @@ The agent calls `image_generate` automatically. No tool allow-listing needed —
|
||||
|
||||
| Provider | Default model | Edit support | API key |
|
||||
| -------- | -------------------------------- | ---------------------------------- | ----------------------------------------------------- |
|
||||
| OpenAI | `gpt-image-1` | Yes (up to 5 images) | `OPENAI_API_KEY` |
|
||||
| OpenAI | `gpt-image-2` | Yes (up to 5 images) | `OPENAI_API_KEY` |
|
||||
| Google | `gemini-3.1-flash-image-preview` | Yes | `GEMINI_API_KEY` or `GOOGLE_API_KEY` |
|
||||
| fal | `fal-ai/flux/dev` | Yes | `FAL_KEY` |
|
||||
| MiniMax | `image-01` | Yes (subject reference) | `MINIMAX_API_KEY` or MiniMax OAuth (`minimax-portal`) |
|
||||
@@ -59,10 +59,10 @@ Use `action: "list"` to inspect available providers and models at runtime:
|
||||
| ------------- | -------- | ------------------------------------------------------------------------------------- |
|
||||
| `prompt` | string | Image generation prompt (required for `action: "generate"`) |
|
||||
| `action` | string | `"generate"` (default) or `"list"` to inspect providers |
|
||||
| `model` | string | Provider/model override, e.g. `openai/gpt-image-1` |
|
||||
| `model` | string | Provider/model override, e.g. `openai/gpt-image-2` |
|
||||
| `image` | string | Single reference image path or URL for edit mode |
|
||||
| `images` | string[] | Multiple reference images for edit mode (up to 5) |
|
||||
| `size` | string | Size hint: `1024x1024`, `1536x1024`, `1024x1536`, `1024x1792`, `1792x1024` |
|
||||
| `size` | string | Size hint: `1024x1024`, `1536x1024`, `1024x1536`, `2048x2048`, `3840x2160` |
|
||||
| `aspectRatio` | string | Aspect ratio: `1:1`, `2:3`, `3:2`, `3:4`, `4:3`, `4:5`, `5:4`, `9:16`, `16:9`, `21:9` |
|
||||
| `resolution` | string | Resolution hint: `1K`, `2K`, or `4K` |
|
||||
| `count` | number | Number of images to generate (1–4) |
|
||||
@@ -81,7 +81,7 @@ Tool results report the applied settings. When OpenClaw remaps geometry during p
|
||||
agents: {
|
||||
defaults: {
|
||||
imageGenerationModel: {
|
||||
primary: "openai/gpt-image-1",
|
||||
primary: "openai/gpt-image-2",
|
||||
fallbacks: ["google/gemini-3.1-flash-image-preview", "fal/fal-ai/flux/dev"],
|
||||
},
|
||||
},
|
||||
@@ -123,6 +123,42 @@ OpenAI, Google, fal, MiniMax, and ComfyUI support editing reference images. Pass
|
||||
|
||||
OpenAI and Google support up to 5 reference images via the `images` parameter. fal, MiniMax, and ComfyUI support 1.
|
||||
|
||||
### OpenAI `gpt-image-2`
|
||||
|
||||
OpenAI image generation defaults to `openai/gpt-image-2`. The older
|
||||
`openai/gpt-image-1` model can still be selected explicitly, but new OpenAI
|
||||
image-generation and image-editing requests should use `gpt-image-2`.
|
||||
|
||||
`gpt-image-2` supports both text-to-image generation and reference-image
|
||||
editing through the same `image_generate` tool. OpenClaw forwards `prompt`,
|
||||
`count`, `size`, and reference images to OpenAI. OpenAI does not receive
|
||||
`aspectRatio` or `resolution` directly; when possible OpenClaw maps those into a
|
||||
supported `size`, otherwise the tool reports them as ignored overrides.
|
||||
|
||||
Generate one 4K landscape image:
|
||||
|
||||
```
|
||||
/tool image_generate action=generate model=openai/gpt-image-2 prompt="A clean editorial poster for OpenClaw image generation" size=3840x2160 count=1
|
||||
```
|
||||
|
||||
Generate two square images:
|
||||
|
||||
```
|
||||
/tool image_generate action=generate model=openai/gpt-image-2 prompt="Two visual directions for a calm productivity app icon" size=1024x1024 count=2
|
||||
```
|
||||
|
||||
Edit one local reference image:
|
||||
|
||||
```
|
||||
/tool image_generate action=generate model=openai/gpt-image-2 prompt="Keep the subject, replace the background with a bright studio setup" image=/path/to/reference.png size=1024x1536
|
||||
```
|
||||
|
||||
Edit with multiple references:
|
||||
|
||||
```
|
||||
/tool image_generate action=generate model=openai/gpt-image-2 prompt="Combine the character identity from the first image with the color palette from the second" images='["/path/to/character.png","/path/to/palette.jpg"]' size=1536x1024
|
||||
```
|
||||
|
||||
MiniMax image generation is available through both bundled MiniMax auth paths:
|
||||
|
||||
- `minimax/image-01` for API-key setups
|
||||
@@ -134,7 +170,7 @@ MiniMax image generation is available through both bundled MiniMax auth paths:
|
||||
| --------------------- | -------------------- | -------------------- | ------------------- | -------------------------- | ---------------------------------- | ------- |
|
||||
| Generate | Yes (up to 4) | Yes (up to 4) | Yes (up to 4) | Yes (up to 9) | Yes (workflow-defined outputs) | Yes (1) |
|
||||
| Edit/reference | Yes (up to 5 images) | Yes (up to 5 images) | Yes (1 image) | Yes (1 image, subject ref) | Yes (1 image, workflow-configured) | No |
|
||||
| Size control | Yes | Yes | Yes | No | No | No |
|
||||
| Size control | Yes (up to 4K) | Yes | Yes | No | No | No |
|
||||
| Aspect ratio | No | Yes | Yes (generate only) | Yes | No | No |
|
||||
| Resolution (1K/2K/4K) | No | Yes | Yes | No | No | No |
|
||||
|
||||
|
||||
Reference in New Issue
Block a user