diff --git a/docs/concepts/models.md b/docs/concepts/models.md
index ae8370b4f06..7c198b29f1a 100644
--- a/docs/concepts/models.md
+++ b/docs/concepts/models.md
@@ -30,7 +30,7 @@ Related:
   falls back to `agents.defaults.imageModel`, then the resolved session/default model.
 - `agents.defaults.imageGenerationModel` is used by the shared image-generation capability. If omitted, `image_generate` can still infer an auth-backed provider default. It tries the current default provider first, then the remaining registered image-generation providers in provider-id order. If you set a specific provider/model, also configure that provider's auth/API key.
-- `agents.defaults.videoGenerationModel` is used by the shared video-generation capability. If omitted, video-generation providers can still use their own default model selection; if you set a specific provider/model, configure that provider's auth/API key too.
+- `agents.defaults.videoGenerationModel` is used by the shared video-generation capability. Unlike image generation, this does not infer a provider default today. Set an explicit `provider/model` such as `qwen/wan2.6-t2v`, and configure that provider's auth/API key too.
 - Per-agent defaults can override `agents.defaults.model` via `agents.list[].model` plus bindings (see [/concepts/multi-agent](/concepts/multi-agent)).
 
 ## Quick model policy
diff --git a/docs/gateway/configuration-reference.md b/docs/gateway/configuration-reference.md
index 8d12581b9a5..9dc87f01a05 100644
--- a/docs/gateway/configuration-reference.md
+++ b/docs/gateway/configuration-reference.md
@@ -989,6 +989,7 @@ Time format in system prompt. Default: `auto` (OS preference).
 - `videoGenerationModel`: accepts either a string (`"provider/model"`) or an object (`{ primary, fallbacks }`).
   - Used by the shared video-generation capability.
   - Typical values: `qwen/wan2.6-t2v`, `qwen/wan2.6-i2v`, `qwen/wan2.6-r2v`, `qwen/wan2.6-r2v-flash`, or `qwen/wan2.7-r2v`.
+  - Set this explicitly before using shared video generation. Unlike `imageGenerationModel`, the video-generation runtime does not infer a provider default yet.
   - If you select a provider/model directly, configure the matching provider auth/API key too.
   - The bundled Qwen video-generation provider currently supports up to 1 output video, 1 input image, 4 input videos, 10 seconds duration, and provider-level `size`, `aspectRatio`, `resolution`, `audio`, and `watermark` options.
 - `pdfModel`: accepts either a string (`"provider/model"`) or an object (`{ primary, fallbacks }`).
diff --git a/docs/reference/api-usage-costs.md b/docs/reference/api-usage-costs.md
index 89135e4329e..a77f89760e1 100644
--- a/docs/reference/api-usage-costs.md
+++ b/docs/reference/api-usage-costs.md
@@ -89,7 +89,22 @@ Inbound media can be summarized/transcribed before the reply runs. This uses mod
 See [Media understanding](/nodes/media-understanding).
 
-### 3) Memory embeddings + semantic search
+### 3) Image and video generation
+
+Shared generation capabilities can also spend provider keys:
+
+- Image generation: OpenAI / Google / fal / MiniMax
+- Video generation: Qwen
+
+Image generation can infer an auth-backed provider default when
+`agents.defaults.imageGenerationModel` is unset. Video generation currently
+requires an explicit `agents.defaults.videoGenerationModel` such as
+`qwen/wan2.6-t2v`.
+
+See [Image generation](/tools/image-generation), [Qwen Cloud](/providers/qwen),
+and [Models](/concepts/models).
+
+### 4) Memory embeddings + semantic search
 
 Semantic memory search uses **embedding APIs** when configured for remote providers:
 
@@ -104,7 +119,7 @@ You can keep it local with `memorySearch.provider = "local"` (no API usage).
 
 See [Memory](/concepts/memory).
 
-### 4) Web search tool
+### 5) Web search tool
 
 `web_search` may incur usage charges depending on your provider:
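
Review note: the documented behavior for `videoGenerationModel` (string or `{ primary, fallbacks }`, with no inferred provider default) could be sketched roughly as below. This is a minimal illustration, not the project's actual resolver: the `ModelSetting` and `ModelRef` types and the `parseModelRef` and `resolveVideoModels` functions are hypothetical names introduced here, and splitting at the first `/` is an assumption about how `provider/model` strings are read.

```typescript
// Hypothetical types: only the string form and the `{ primary, fallbacks }`
// form are taken from the docs in this diff; everything else is illustrative.
type ModelSetting = string | { primary: string; fallbacks?: string[] };

interface ModelRef {
  provider: string;
  model: string;
}

// Assumption: a reference is "provider/model", split at the first slash.
function parseModelRef(ref: string): ModelRef {
  const slash = ref.indexOf("/");
  if (slash <= 0 || slash === ref.length - 1) {
    throw new Error(`expected "provider/model", got "${ref}"`);
  }
  return { provider: ref.slice(0, slash), model: ref.slice(slash + 1) };
}

// Returns the candidates in order: primary first, then fallbacks.
// An unset value is an error because, per the docs, video generation
// does not infer a provider default.
function resolveVideoModels(setting: ModelSetting | undefined): ModelRef[] {
  if (setting === undefined) {
    throw new Error("agents.defaults.videoGenerationModel must be set explicitly");
  }
  const refs =
    typeof setting === "string"
      ? [setting]
      : [setting.primary, ...(setting.fallbacks ?? [])];
  return refs.map(parseModelRef);
}
```

For example, `resolveVideoModels("qwen/wan2.6-t2v")` yields a single candidate with provider `qwen`, while passing `undefined` throws instead of silently picking a provider, which mirrors the contrast with `imageGenerationModel` described in the docs.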