diff --git a/docs/providers/bedrock.md b/docs/providers/bedrock.md index caf29e47c65..914bf8bfb2a 100644 --- a/docs/providers/bedrock.md +++ b/docs/providers/bedrock.md @@ -8,16 +8,130 @@ title: "Amazon Bedrock" # Amazon Bedrock -OpenClaw can use **Amazon Bedrock** models via pi‑ai’s **Bedrock Converse** +OpenClaw can use **Amazon Bedrock** models via pi-ai's **Bedrock Converse** streaming provider. Bedrock auth uses the **AWS SDK default credential chain**, not an API key. -## What pi-ai supports +| Property | Value | +| -------- | ----------------------------------------------------------- | +| Provider | `amazon-bedrock` | +| API | `bedrock-converse-stream` | +| Auth | AWS credentials (env vars, shared config, or instance role) | +| Region | `AWS_REGION` or `AWS_DEFAULT_REGION` (default: `us-east-1`) | -- Provider: `amazon-bedrock` -- API: `bedrock-converse-stream` -- Auth: AWS credentials (env vars, shared config, or instance role) -- Region: `AWS_REGION` or `AWS_DEFAULT_REGION` (default: `us-east-1`) +## Getting started + +Choose your preferred auth method and follow the setup steps. + + + + **Best for:** developer machines, CI, or hosts where you manage AWS credentials directly. + + + + ```bash + export AWS_ACCESS_KEY_ID="AKIA..." + export AWS_SECRET_ACCESS_KEY="..." + export AWS_REGION="us-east-1" + # Optional: + export AWS_SESSION_TOKEN="..." + export AWS_PROFILE="your-profile" + # Optional (Bedrock API key/bearer token): + export AWS_BEARER_TOKEN_BEDROCK="..." + ``` + + + No `apiKey` is required. Configure the provider with `auth: "aws-sdk"`: + + ```json5 + { + models: { + providers: { + "amazon-bedrock": { + baseUrl: "https://bedrock-runtime.us-east-1.amazonaws.com", + api: "bedrock-converse-stream", + auth: "aws-sdk", + models: [ + { + id: "us.anthropic.claude-opus-4-6-v1:0", + name: "Claude Opus 4.6 (Bedrock)", + reasoning: true, + input: ["text", "image"], + cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 }, + contextWindow: 200000, + maxTokens: 8192, + }, + ], + }, + }, + }, + agents: { + defaults: { + model: { primary: "amazon-bedrock/us.anthropic.claude-opus-4-6-v1:0" }, + }, + }, + } + ``` + + + ```bash + openclaw models list + ``` + + + + + With env-marker auth (`AWS_ACCESS_KEY_ID`, `AWS_PROFILE`, or `AWS_BEARER_TOKEN_BEDROCK`), OpenClaw auto-enables the implicit Bedrock provider for model discovery without extra config. + + + + + + **Best for:** EC2 instances with an IAM role attached, using the instance metadata service for authentication. + + + + When using IMDS, OpenClaw cannot detect AWS auth from env markers alone, so you must opt in: + + ```bash + openclaw config set plugins.entries.amazon-bedrock.config.discovery.enabled true + openclaw config set plugins.entries.amazon-bedrock.config.discovery.region us-east-1 + ``` + + + If you also want the env-marker auto-detection path to work (for example, for `openclaw status` surfaces): + + ```bash + export AWS_PROFILE=default + export AWS_REGION=us-east-1 + ``` + + You do **not** need a fake API key. + + + ```bash + openclaw models list + ``` + + + + + The IAM role attached to your EC2 instance must have the following permissions: + + - `bedrock:InvokeModel` + - `bedrock:InvokeModelWithResponseStream` + - `bedrock:ListFoundationModels` (for automatic discovery) + - `bedrock:ListInferenceProfiles` (for inference profile discovery) + + Or attach the managed policy `AmazonBedrockFullAccess`. + + + + You only need `AWS_PROFILE=default` if you specifically want an env marker for auto mode or status surfaces. 
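To see which identity the AWS SDK default chain resolves to on the host, you can check with the AWS CLI (assuming it is installed):

```bash
# Prints the account and role the default credential chain resolves to
aws sts get-caller-identity
```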
The actual Bedrock runtime auth path uses the AWS SDK default chain, so IMDS instance-role auth works even without env markers. + + + + ## Automatic model discovery @@ -38,127 +152,52 @@ How the implicit provider is enabled: shared config, SSO, and IMDS instance-role auth can work even when discovery needed `enabled: true` to opt in. -Config options live under `plugins.entries.amazon-bedrock.config.discovery`: + +For explicit `models.providers["amazon-bedrock"]` entries, OpenClaw can still resolve Bedrock env-marker auth early from AWS env markers such as `AWS_BEARER_TOKEN_BEDROCK` without forcing full runtime auth loading. The actual model-call auth path still uses the AWS SDK default chain. + -```json5 -{ - plugins: { - entries: { - "amazon-bedrock": { - config: { - discovery: { - enabled: true, - region: "us-east-1", - providerFilter: ["anthropic", "amazon"], - refreshInterval: 3600, - defaultContextWindow: 32000, - defaultMaxTokens: 4096, + + + Config options live under `plugins.entries.amazon-bedrock.config.discovery`: + + ```json5 + { + plugins: { + entries: { + "amazon-bedrock": { + config: { + discovery: { + enabled: true, + region: "us-east-1", + providerFilter: ["anthropic", "amazon"], + refreshInterval: 3600, + defaultContextWindow: 32000, + defaultMaxTokens: 4096, + }, + }, }, }, }, - }, - }, -} -``` + } + ``` -Notes: + | Option | Default | Description | + | ------ | ------- | ----------- | + | `enabled` | auto | In auto mode, OpenClaw only enables the implicit Bedrock provider when it sees a supported AWS env marker. Set `true` to force discovery. | + | `region` | `AWS_REGION` / `AWS_DEFAULT_REGION` / `us-east-1` | AWS region used for discovery API calls. | + | `providerFilter` | (all) | Matches Bedrock provider names (for example `anthropic`, `amazon`). | + | `refreshInterval` | `3600` | Cache duration in seconds. Set to `0` to disable caching. | + | `defaultContextWindow` | `32000` | Context window used for discovered models (override if you know your model limits). | + | `defaultMaxTokens` | `4096` | Max output tokens used for discovered models (override if you know your model limits). | -- `enabled` defaults to auto mode. In auto mode, OpenClaw only enables the - implicit Bedrock provider when it sees a supported AWS env marker. -- `region` defaults to `AWS_REGION` or `AWS_DEFAULT_REGION`, then `us-east-1`. -- `providerFilter` matches Bedrock provider names (for example `anthropic`). -- `refreshInterval` is seconds; set to `0` to disable caching. -- `defaultContextWindow` (default: `32000`) and `defaultMaxTokens` (default: `4096`) - are used for discovered models (override if you know your model limits). -- For explicit `models.providers["amazon-bedrock"]` entries, OpenClaw can still - resolve Bedrock env-marker auth early from AWS env markers such as - `AWS_BEARER_TOKEN_BEDROCK` without forcing full runtime auth loading. The - actual model-call auth path still uses the AWS SDK default chain. - -## Onboarding - -1. Ensure AWS credentials are available on the **gateway host**: - -```bash -export AWS_ACCESS_KEY_ID="AKIA..." -export AWS_SECRET_ACCESS_KEY="..." -export AWS_REGION="us-east-1" -# Optional: -export AWS_SESSION_TOKEN="..." -export AWS_PROFILE="your-profile" -# Optional (Bedrock API key/bearer token): -export AWS_BEARER_TOKEN_BEDROCK="..." -``` - -2. 
Add a Bedrock provider and model to your config (no `apiKey` required): - -```json5 -{ - models: { - providers: { - "amazon-bedrock": { - baseUrl: "https://bedrock-runtime.us-east-1.amazonaws.com", - api: "bedrock-converse-stream", - auth: "aws-sdk", - models: [ - { - id: "us.anthropic.claude-opus-4-6-v1:0", - name: "Claude Opus 4.6 (Bedrock)", - reasoning: true, - input: ["text", "image"], - cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 }, - contextWindow: 200000, - maxTokens: 8192, - }, - ], - }, - }, - }, - agents: { - defaults: { - model: { primary: "amazon-bedrock/us.anthropic.claude-opus-4-6-v1:0" }, - }, - }, -} -``` - -## EC2 Instance Roles - -When running OpenClaw on an EC2 instance with an IAM role attached, the AWS SDK -can use the instance metadata service (IMDS) for authentication. For Bedrock -model discovery, OpenClaw only auto-enables the implicit provider from AWS env -markers unless you explicitly set -`plugins.entries.amazon-bedrock.config.discovery.enabled: true`. - -Recommended setup for IMDS-backed hosts: - -- Set `plugins.entries.amazon-bedrock.config.discovery.enabled` to `true`. -- Set `plugins.entries.amazon-bedrock.config.discovery.region` (or export `AWS_REGION`). -- You do **not** need a fake API key. -- You only need `AWS_PROFILE=default` if you specifically want an env marker - for auto mode or status surfaces. - -```bash -# Recommended: explicit discovery enable + region -openclaw config set plugins.entries.amazon-bedrock.config.discovery.enabled true -openclaw config set plugins.entries.amazon-bedrock.config.discovery.region us-east-1 - -# Optional: add an env marker if you want auto mode without explicit enable -export AWS_PROFILE=default -export AWS_REGION=us-east-1 -``` - -**Required IAM permissions** for the EC2 instance role: - -- `bedrock:InvokeModel` -- `bedrock:InvokeModelWithResponseStream` -- `bedrock:ListFoundationModels` (for automatic discovery) -- `bedrock:ListInferenceProfiles` (for inference profile discovery) - -Or attach the managed policy `AmazonBedrockFullAccess`. + + ## Quick setup (AWS path) +This walkthrough creates an IAM role, attaches Bedrock permissions, associates +the instance profile, and enables OpenClaw discovery on the EC2 host. + ```bash # 1. Create IAM role and instance profile aws iam create-role --role-name EC2-Bedrock-Access \ @@ -197,106 +236,127 @@ source ~/.bashrc openclaw models list ``` -## Inference profiles +## Advanced configuration -OpenClaw discovers **regional and global inference profiles** alongside -foundation models. When a profile maps to a known foundation model, the -profile inherits that model's capabilities (context window, max tokens, -reasoning, vision) and the correct Bedrock request region is injected -automatically. This means cross-region Claude profiles work without manual -provider overrides. + + + OpenClaw discovers **regional and global inference profiles** alongside + foundation models. When a profile maps to a known foundation model, the + profile inherits that model's capabilities (context window, max tokens, + reasoning, vision) and the correct Bedrock request region is injected + automatically. This means cross-region Claude profiles work without manual + provider overrides. -Inference profile IDs look like `us.anthropic.claude-opus-4-6-v1:0` (regional) -or `anthropic.claude-opus-4-6-v1:0` (global). If the backing model is already -in the discovery results, the profile inherits its full capability set; -otherwise safe defaults apply. 
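For example, once discovery has run, a discovered regional profile can be selected directly as a model ref (a sketch, reusing the profile id from the config example earlier on this page):

```bash
# Model ref format: <provider id>/<inference profile id>
openclaw models set amazon-bedrock/us.anthropic.claude-opus-4-6-v1:0
```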
+ Inference profile IDs look like `us.anthropic.claude-opus-4-6-v1:0` (regional) + or `anthropic.claude-opus-4-6-v1:0` (global). If the backing model is already + in the discovery results, the profile inherits its full capability set; + otherwise safe defaults apply. -No extra configuration is needed. As long as discovery is enabled and the IAM -principal has `bedrock:ListInferenceProfiles`, profiles appear alongside -foundation models in `openclaw models list`. + No extra configuration is needed. As long as discovery is enabled and the IAM + principal has `bedrock:ListInferenceProfiles`, profiles appear alongside + foundation models in `openclaw models list`. -## Notes + -- Bedrock requires **model access** enabled in your AWS account/region. -- Automatic discovery needs the `bedrock:ListFoundationModels` and - `bedrock:ListInferenceProfiles` permissions. -- If you rely on auto mode, set one of the supported AWS auth env markers on the - gateway host. If you prefer IMDS/shared-config auth without env markers, set - `plugins.entries.amazon-bedrock.config.discovery.enabled: true`. -- OpenClaw surfaces the credential source in this order: `AWS_BEARER_TOKEN_BEDROCK`, - then `AWS_ACCESS_KEY_ID` + `AWS_SECRET_ACCESS_KEY`, then `AWS_PROFILE`, then the - default AWS SDK chain. -- Reasoning support depends on the model; check the Bedrock model card for - current capabilities. -- If you prefer a managed key flow, you can also place an OpenAI‑compatible - proxy in front of Bedrock and configure it as an OpenAI provider instead. + + You can apply [Amazon Bedrock Guardrails](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html) + to all Bedrock model invocations by adding a `guardrail` object to the + `amazon-bedrock` plugin config. Guardrails let you enforce content filtering, + topic denial, word filters, sensitive information filters, and contextual + grounding checks. -## Guardrails - -You can apply [Amazon Bedrock Guardrails](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html) -to all Bedrock model invocations by adding a `guardrail` object to the -`amazon-bedrock` plugin config. Guardrails let you enforce content filtering, -topic denial, word filters, sensitive information filters, and contextual -grounding checks. - -```json5 -{ - plugins: { - entries: { - "amazon-bedrock": { - config: { - guardrail: { - guardrailIdentifier: "abc123", // guardrail ID or full ARN - guardrailVersion: "1", // version number or "DRAFT" - streamProcessingMode: "sync", // optional: "sync" or "async" - trace: "enabled", // optional: "enabled", "disabled", or "enabled_full" + ```json5 + { + plugins: { + entries: { + "amazon-bedrock": { + config: { + guardrail: { + guardrailIdentifier: "abc123", // guardrail ID or full ARN + guardrailVersion: "1", // version number or "DRAFT" + streamProcessingMode: "sync", // optional: "sync" or "async" + trace: "enabled", // optional: "enabled", "disabled", or "enabled_full" + }, + }, }, }, }, - }, - }, -} -``` + } + ``` -- `guardrailIdentifier` (required) accepts a guardrail ID (e.g. `abc123`) or a - full ARN (e.g. `arn:aws:bedrock:us-east-1:123456789012:guardrail/abc123`). -- `guardrailVersion` (required) specifies which published version to use, or - `"DRAFT"` for the working draft. -- `streamProcessingMode` (optional) controls whether guardrail evaluation runs - synchronously (`"sync"`) or asynchronously (`"async"`) during streaming. If - omitted, Bedrock uses its default behavior. 
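For orientation, these options correspond to the `guardrailConfig` block in Bedrock's Converse API; a rough sketch of the resulting request field (assuming OpenClaw passes the values through unchanged):

```json5
// Request-side sketch, not OpenClaw config
guardrailConfig: {
  guardrailIdentifier: "abc123",
  guardrailVersion: "1",
  trace: "enabled",
}
```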
-- `trace` (optional) enables guardrail trace output in the API response. Set to - `"enabled"` or `"enabled_full"` for debugging; omit or set `"disabled"` for - production. + | Option | Required | Description | + | ------ | -------- | ----------- | + | `guardrailIdentifier` | Yes | Guardrail ID (e.g. `abc123`) or full ARN (e.g. `arn:aws:bedrock:us-east-1:123456789012:guardrail/abc123`). | + | `guardrailVersion` | Yes | Published version number, or `"DRAFT"` for the working draft. | + | `streamProcessingMode` | No | `"sync"` or `"async"` for guardrail evaluation during streaming. If omitted, Bedrock uses its default. | + | `trace` | No | `"enabled"` or `"enabled_full"` for debugging; omit or set `"disabled"` for production. | -The IAM principal used by the gateway must have the `bedrock:ApplyGuardrail` -permission in addition to the standard invoke permissions. + + The IAM principal used by the gateway must have the `bedrock:ApplyGuardrail` permission in addition to the standard invoke permissions. + -## Embeddings for memory search + -Bedrock can also serve as the embedding provider for -[memory search](/concepts/memory-search). This is configured separately from the -inference provider — set `agents.defaults.memorySearch.provider` to `"bedrock"`: + + Bedrock can also serve as the embedding provider for + [memory search](/concepts/memory-search). This is configured separately from the + inference provider -- set `agents.defaults.memorySearch.provider` to `"bedrock"`: -```json5 -{ - agents: { - defaults: { - memorySearch: { - provider: "bedrock", - model: "amazon.titan-embed-text-v2:0", // default + ```json5 + { + agents: { + defaults: { + memorySearch: { + provider: "bedrock", + model: "amazon.titan-embed-text-v2:0", // default + }, + }, }, - }, - }, -} -``` + } + ``` -Bedrock embeddings use the same AWS SDK credential chain as inference (instance -roles, SSO, access keys, shared config, and web identity). No API key is -needed. When `provider` is `"auto"`, Bedrock is auto-detected if that -credential chain resolves successfully. + Bedrock embeddings use the same AWS SDK credential chain as inference (instance + roles, SSO, access keys, shared config, and web identity). No API key is + needed. When `provider` is `"auto"`, Bedrock is auto-detected if that + credential chain resolves successfully. -Supported embedding models include Amazon Titan Embed (v1, v2), Amazon Nova -Embed, Cohere Embed (v3, v4), and TwelveLabs Marengo. See -[Memory configuration reference — Bedrock](/reference/memory-config#bedrock-embedding-config) -for the full model list and dimension options. + Supported embedding models include Amazon Titan Embed (v1, v2), Amazon Nova + Embed, Cohere Embed (v3, v4), and TwelveLabs Marengo. See + [Memory configuration reference -- Bedrock](/reference/memory-config#bedrock-embedding-config) + for the full model list and dimension options. + + + + + - Bedrock requires **model access** enabled in your AWS account/region. + - Automatic discovery needs the `bedrock:ListFoundationModels` and + `bedrock:ListInferenceProfiles` permissions. + - If you rely on auto mode, set one of the supported AWS auth env markers on the + gateway host. If you prefer IMDS/shared-config auth without env markers, set + `plugins.entries.amazon-bedrock.config.discovery.enabled: true`. + - OpenClaw surfaces the credential source in this order: `AWS_BEARER_TOKEN_BEDROCK`, + then `AWS_ACCESS_KEY_ID` + `AWS_SECRET_ACCESS_KEY`, then `AWS_PROFILE`, then the + default AWS SDK chain. 
+ - Reasoning support depends on the model; check the Bedrock model card for + current capabilities. + - If you prefer a managed key flow, you can also place an OpenAI-compatible + proxy in front of Bedrock and configure it as an OpenAI provider instead. + + + +## Related + + + + Choosing providers, model refs, and failover behavior. + + + Bedrock embeddings for memory search configuration. + + + Full Bedrock embedding model list and dimension options. + + + General troubleshooting and FAQ. + + diff --git a/docs/providers/google.md b/docs/providers/google.md index 666fd5ae511..70ee5d16693 100644 --- a/docs/providers/google.md +++ b/docs/providers/google.md @@ -17,74 +17,114 @@ Gemini Grounding. - API: Google Gemini API - Alternative provider: `google-gemini-cli` (OAuth) -## Quick start +## Getting started -1. Set the API key: +Choose your preferred auth method and follow the setup steps. -```bash -openclaw onboard --auth-choice gemini-api-key -``` + + + **Best for:** standard Gemini API access through Google AI Studio. -2. Set a default model: + + + ```bash + openclaw onboard --auth-choice gemini-api-key + ``` -```json5 -{ - agents: { - defaults: { - model: { primary: "google/gemini-3.1-pro-preview" }, - }, - }, -} -``` + Or pass the key directly: -## Non-interactive example + ```bash + openclaw onboard --non-interactive \ + --mode local \ + --auth-choice gemini-api-key \ + --gemini-api-key "$GEMINI_API_KEY" + ``` + + + ```json5 + { + agents: { + defaults: { + model: { primary: "google/gemini-3.1-pro-preview" }, + }, + }, + } + ``` + + + ```bash + openclaw models list --provider google + ``` + + -```bash -openclaw onboard --non-interactive \ - --mode local \ - --auth-choice gemini-api-key \ - --gemini-api-key "$GEMINI_API_KEY" -``` + + The environment variables `GEMINI_API_KEY` and `GOOGLE_API_KEY` are both accepted. Use whichever you already have configured. + -## OAuth (Gemini CLI) + -An alternative provider `google-gemini-cli` uses PKCE OAuth instead of an API -key. This is an unofficial integration; some users report account -restrictions. Use at your own risk. + + **Best for:** reusing an existing Gemini CLI login via PKCE OAuth instead of a separate API key. -- Default model: `google-gemini-cli/gemini-3-flash-preview` -- Alias: `gemini-cli` -- Install prerequisite: local Gemini CLI available as `gemini` - - Homebrew: `brew install gemini-cli` - - npm: `npm install -g @google/gemini-cli` -- Login: + + The `google-gemini-cli` provider is an unofficial integration. Some users + report account restrictions when using OAuth this way. Use at your own risk. + -```bash -openclaw models auth login --provider google-gemini-cli --set-default -``` + + + The local `gemini` command must be available on `PATH`. -Environment variables: + ```bash + # Homebrew + brew install gemini-cli -- `OPENCLAW_GEMINI_OAUTH_CLIENT_ID` -- `OPENCLAW_GEMINI_OAUTH_CLIENT_SECRET` + # or npm + npm install -g @google/gemini-cli + ``` -(Or the `GEMINI_CLI_*` variants.) + OpenClaw supports both Homebrew installs and global npm installs, including + common Windows/npm layouts. + + + ```bash + openclaw models auth login --provider google-gemini-cli --set-default + ``` + + + ```bash + openclaw models list --provider google-gemini-cli + ``` + + -If Gemini CLI OAuth requests fail after login, set -`GOOGLE_CLOUD_PROJECT` or `GOOGLE_CLOUD_PROJECT_ID` on the gateway host and -retry. 
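If you skipped `--set-default` during login, you can point the default at the CLI provider afterwards:

```bash
# Same ref as the provider default listed below
openclaw models set google-gemini-cli/gemini-3-flash-preview
```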
+ - Default model: `google-gemini-cli/gemini-3-flash-preview` + - Alias: `gemini-cli` -If login fails before the browser flow starts, make sure the local `gemini` -command is installed and on `PATH`. OpenClaw supports both Homebrew installs -and global npm installs, including common Windows/npm layouts. + **Environment variables:** -Gemini CLI JSON usage notes: + - `OPENCLAW_GEMINI_OAUTH_CLIENT_ID` + - `OPENCLAW_GEMINI_OAUTH_CLIENT_SECRET` -- Reply text comes from the CLI JSON `response` field. -- Usage falls back to `stats` when the CLI leaves `usage` empty. -- `stats.cached` is normalized into OpenClaw `cacheRead`. -- If `stats.input` is missing, OpenClaw derives input tokens from - `stats.input_tokens - stats.cached`. + (Or the `GEMINI_CLI_*` variants.) + + + If Gemini CLI OAuth requests fail after login, set `GOOGLE_CLOUD_PROJECT` or + `GOOGLE_CLOUD_PROJECT_ID` on the gateway host and retry. + + + + If login fails before the browser flow starts, make sure the local `gemini` + command is installed and on `PATH`. + + + The OAuth-only `google-gemini-cli` provider is a separate text-inference + surface. Image generation, media understanding, and Gemini Grounding stay on + the `google` provider id. + + + ## Capabilities @@ -100,37 +140,12 @@ Gemini CLI JSON usage notes: | Thinking/reasoning | Yes (Gemini 3.1+) | | Gemma 4 models | Yes | -Gemma 4 models (for example `gemma-4-26b-a4b-it`) support thinking mode. OpenClaw rewrites `thinkingBudget` to a supported Google `thinkingLevel` for Gemma 4. Setting thinking to `off` preserves thinking disabled instead of mapping to `MINIMAL`. - -## Direct Gemini cache reuse - -For direct Gemini API runs (`api: "google-generative-ai"`), OpenClaw now -passes a configured `cachedContent` handle through to Gemini requests. - -- Configure per-model or global params with either - `cachedContent` or legacy `cached_content` -- If both are present, `cachedContent` wins -- Example value: `cachedContents/prebuilt-context` -- Gemini cache-hit usage is normalized into OpenClaw `cacheRead` from - upstream `cachedContentTokenCount` - -Example: - -```json5 -{ - agents: { - defaults: { - models: { - "google/gemini-2.5-pro": { - params: { - cachedContent: "cachedContents/prebuilt-context", - }, - }, - }, - }, - }, -} -``` + +Gemma 4 models (for example `gemma-4-26b-a4b-it`) support thinking mode. OpenClaw +rewrites `thinkingBudget` to a supported Google `thinkingLevel` for Gemma 4. +Setting thinking to `off` preserves thinking disabled instead of mapping to +`MINIMAL`. + ## Image generation @@ -142,10 +157,6 @@ The bundled `google` image-generation provider defaults to - Edit mode: enabled, up to 5 input images - Geometry controls: `size`, `aspectRatio`, and `resolution` -The OAuth-only `google-gemini-cli` provider is a separate text-inference -surface. Image generation, media understanding, and Gemini Grounding stay on -the `google` provider id. - To use Google as the default image provider: ```json5 @@ -160,8 +171,9 @@ To use Google as the default image provider: } ``` -See [Image Generation](/tools/image-generation) for the shared tool -parameters, provider selection, and failover behavior. + +See [Image Generation](/tools/image-generation) for shared tool parameters, provider selection, and failover behavior. + ## Video generation @@ -187,8 +199,9 @@ To use Google as the default video provider: } ``` -See [Video Generation](/tools/video-generation) for the shared tool -parameters, provider selection, and failover behavior. 
+ +See [Video Generation](/tools/video-generation) for shared tool parameters, provider selection, and failover behavior. + ## Music generation @@ -216,11 +229,74 @@ To use Google as the default music provider: } ``` -See [Music Generation](/tools/music-generation) for the shared tool -parameters, provider selection, and failover behavior. + +See [Music Generation](/tools/music-generation) for shared tool parameters, provider selection, and failover behavior. + -## Environment note +## Advanced configuration -If the Gateway runs as a daemon (launchd/systemd), make sure `GEMINI_API_KEY` -is available to that process (for example, in `~/.openclaw/.env` or via -`env.shellEnv`). + + + For direct Gemini API runs (`api: "google-generative-ai"`), OpenClaw + passes a configured `cachedContent` handle through to Gemini requests. + + - Configure per-model or global params with either + `cachedContent` or legacy `cached_content` + - If both are present, `cachedContent` wins + - Example value: `cachedContents/prebuilt-context` + - Gemini cache-hit usage is normalized into OpenClaw `cacheRead` from + upstream `cachedContentTokenCount` + + ```json5 + { + agents: { + defaults: { + models: { + "google/gemini-2.5-pro": { + params: { + cachedContent: "cachedContents/prebuilt-context", + }, + }, + }, + }, + }, + } + ``` + + + + + When using the `google-gemini-cli` OAuth provider, OpenClaw normalizes + the CLI JSON output as follows: + + - Reply text comes from the CLI JSON `response` field. + - Usage falls back to `stats` when the CLI leaves `usage` empty. + - `stats.cached` is normalized into OpenClaw `cacheRead`. + - If `stats.input` is missing, OpenClaw derives input tokens from + `stats.input_tokens - stats.cached`. + + + + + If the Gateway runs as a daemon (launchd/systemd), make sure `GEMINI_API_KEY` + is available to that process (for example, in `~/.openclaw/.env` or via + `env.shellEnv`). + + + +## Related + + + + Choosing providers, model refs, and failover behavior. + + + Shared image tool parameters and provider selection. + + + Shared video tool parameters and provider selection. + + + Shared music tool parameters and provider selection. + + diff --git a/docs/providers/minimax.md b/docs/providers/minimax.md index 8772bb14740..22698fc6d87 100644 --- a/docs/providers/minimax.md +++ b/docs/providers/minimax.md @@ -12,31 +12,212 @@ OpenClaw's MiniMax provider defaults to **MiniMax M2.7**. MiniMax also provides: -- bundled speech synthesis via T2A v2 -- bundled image understanding via `MiniMax-VL-01` -- bundled music generation via `music-2.5+` -- bundled `web_search` through the MiniMax Coding Plan search API +- Bundled speech synthesis via T2A v2 +- Bundled image understanding via `MiniMax-VL-01` +- Bundled music generation via `music-2.5+` +- Bundled `web_search` through the MiniMax Coding Plan search API Provider split: -- `minimax`: API-key text provider, plus bundled image generation, image understanding, speech, and web search -- `minimax-portal`: OAuth text provider, plus bundled image generation and image understanding +| Provider ID | Auth | Capabilities | +| ---------------- | ------- | --------------------------------------------------------------- | +| `minimax` | API key | Text, image generation, image understanding, speech, web search | +| `minimax-portal` | OAuth | Text, image generation, image understanding | ## Model lineup -- `MiniMax-M2.7`: default hosted reasoning model. -- `MiniMax-M2.7-highspeed`: faster M2.7 reasoning tier. 
-- `image-01`: image generation model (generate and image-to-image editing). +| Model | Type | Description | +| ------------------------ | ---------------- | ---------------------------------------- | +| `MiniMax-M2.7` | Chat (reasoning) | Default hosted reasoning model | +| `MiniMax-M2.7-highspeed` | Chat (reasoning) | Faster M2.7 reasoning tier | +| `MiniMax-VL-01` | Vision | Image understanding model | +| `image-01` | Image generation | Text-to-image and image-to-image editing | +| `music-2.5+` | Music generation | Default music model | +| `music-2.5` | Music generation | Previous music generation tier | +| `music-2.0` | Music generation | Legacy music generation tier | +| `MiniMax-Hailuo-2.3` | Video generation | Text-to-video and image reference flows | -## Image generation +## Getting started + +Choose your preferred auth method and follow the setup steps. + + + + **Best for:** quick setup with MiniMax Coding Plan via OAuth, no API key required. + + + + + + ```bash + openclaw onboard --auth-choice minimax-global-oauth + ``` + + This authenticates against `api.minimax.io`. + + + ```bash + openclaw models list --provider minimax-portal + ``` + + + + + + + ```bash + openclaw onboard --auth-choice minimax-cn-oauth + ``` + + This authenticates against `api.minimaxi.com`. + + + ```bash + openclaw models list --provider minimax-portal + ``` + + + + + + + OAuth setups use the `minimax-portal` provider id. Model refs follow the form `minimax-portal/MiniMax-M2.7`. + + + + Referral link for MiniMax Coding Plan (10% off): [MiniMax Coding Plan](https://platform.minimax.io/subscribe/coding-plan?code=DbXJTRClnb&source=link) + + + + + + **Best for:** hosted MiniMax with Anthropic-compatible API. + + + + + + ```bash + openclaw onboard --auth-choice minimax-global-api + ``` + + This configures `api.minimax.io` as the base URL. + + + ```bash + openclaw models list --provider minimax + ``` + + + + + + + ```bash + openclaw onboard --auth-choice minimax-cn-api + ``` + + This configures `api.minimaxi.com` as the base URL. + + + ```bash + openclaw models list --provider minimax + ``` + + + + + + ### Config example + + ```json5 + { + env: { MINIMAX_API_KEY: "sk-..." }, + agents: { defaults: { model: { primary: "minimax/MiniMax-M2.7" } } }, + models: { + mode: "merge", + providers: { + minimax: { + baseUrl: "https://api.minimax.io/anthropic", + apiKey: "${MINIMAX_API_KEY}", + api: "anthropic-messages", + models: [ + { + id: "MiniMax-M2.7", + name: "MiniMax M2.7", + reasoning: true, + input: ["text", "image"], + cost: { input: 0.3, output: 1.2, cacheRead: 0.06, cacheWrite: 0.375 }, + contextWindow: 204800, + maxTokens: 131072, + }, + { + id: "MiniMax-M2.7-highspeed", + name: "MiniMax M2.7 Highspeed", + reasoning: true, + input: ["text", "image"], + cost: { input: 0.6, output: 2.4, cacheRead: 0.06, cacheWrite: 0.375 }, + contextWindow: 204800, + maxTokens: 131072, + }, + ], + }, + }, + }, + } + ``` + + + On the Anthropic-compatible streaming path, OpenClaw disables MiniMax thinking by default unless you explicitly set `thinking` yourself. MiniMax's streaming endpoint emits `reasoning_content` in OpenAI-style delta chunks instead of native Anthropic thinking blocks, which can leak internal reasoning into visible output if left enabled implicitly. + + + + API-key setups use the `minimax` provider id. Model refs follow the form `minimax/MiniMax-M2.7`. 
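For example, to switch the active default to the API-key ref:

```bash
openclaw models set minimax/MiniMax-M2.7
```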
+ + + + + +## Configure via `openclaw configure` + +Use the interactive config wizard to set MiniMax without editing JSON: + + + + ```bash + openclaw configure + ``` + + + Choose **Model/auth** from the menu. + + + Pick one of the available MiniMax options: + + | Auth choice | Description | + | --- | --- | + | `minimax-global-oauth` | International OAuth (Coding Plan) | + | `minimax-cn-oauth` | China OAuth (Coding Plan) | + | `minimax-global-api` | International API key | + | `minimax-cn-api` | China API key | + + + + Select your default model when prompted. + + + +## Capabilities + +### Image generation The MiniMax plugin registers the `image-01` model for the `image_generate` tool. It supports: -- **Text-to-image generation** with aspect ratio control. -- **Image-to-image editing** (subject reference) with aspect ratio control. -- Up to **9 output images** per request. -- Up to **1 reference image** per edit request. -- Supported aspect ratios: `1:1`, `16:9`, `4:3`, `3:2`, `2:3`, `3:4`, `9:16`, `21:9`. +- **Text-to-image generation** with aspect ratio control +- **Image-to-image editing** (subject reference) with aspect ratio control +- Up to **9 output images** per request +- Up to **1 reference image** per edit request +- Supported aspect ratios: `1:1`, `16:9`, `4:3`, `3:2`, `2:3`, `3:4`, `9:16`, `21:9` To use MiniMax for image generation, set it as the image generation provider: @@ -64,10 +245,11 @@ The built-in bundled MiniMax text catalog itself stays text-only metadata until that explicit provider config exists. Image understanding is exposed separately through the plugin-owned `MiniMax-VL-01` media provider. -See [Image Generation](/tools/image-generation) for the shared tool -parameters, provider selection, and failover behavior. + +See [Image Generation](/tools/image-generation) for shared tool parameters, provider selection, and failover behavior. + -## Music generation +### Music generation The bundled `minimax` plugin also registers music generation through the shared `music_generate` tool. @@ -92,10 +274,11 @@ To use MiniMax as the default music provider: } ``` -See [Music Generation](/tools/music-generation) for the shared tool -parameters, provider selection, and failover behavior. + +See [Music Generation](/tools/music-generation) for shared tool parameters, provider selection, and failover behavior. + -## Video generation +### Video generation The bundled `minimax` plugin also registers video generation through the shared `video_generate` tool. @@ -118,21 +301,24 @@ To use MiniMax as the default video provider: } ``` -See [Video Generation](/tools/video-generation) for the shared tool -parameters, provider selection, and failover behavior. + +See [Video Generation](/tools/video-generation) for shared tool parameters, provider selection, and failover behavior. + -## Image understanding +### Image understanding The MiniMax plugin registers image understanding separately from the text catalog: -- `minimax`: default image model `MiniMax-VL-01` -- `minimax-portal`: default image model `MiniMax-VL-01` +| Provider ID | Default image model | +| ---------------- | ------------------- | +| `minimax` | `MiniMax-VL-01` | +| `minimax-portal` | `MiniMax-VL-01` | That is why automatic media routing can use MiniMax image understanding even when the bundled text-provider catalog still shows text-only M2.7 chat refs. -## Web search +### Web search The MiniMax plugin also registers `web_search` through the MiniMax Coding Plan search API. @@ -146,136 +332,66 @@ search API. 
- Search stays on provider id `minimax`; OAuth CN/global setup can still steer region indirectly through `models.providers.minimax-portal.baseUrl` Config lives under `plugins.entries.minimax.config.webSearch.*`. -See [MiniMax Search](/tools/minimax-search). -## Choose a setup + +See [MiniMax Search](/tools/minimax-search) for full web search configuration and usage. + -### MiniMax OAuth (Coding Plan) - recommended +## Advanced configuration -**Best for:** quick setup with MiniMax Coding Plan via OAuth, no API key required. + + + | Option | Description | + | --- | --- | + | `models.providers.minimax.baseUrl` | Prefer `https://api.minimax.io/anthropic` (Anthropic-compatible); `https://api.minimax.io/v1` is optional for OpenAI-compatible payloads | + | `models.providers.minimax.api` | Prefer `anthropic-messages`; `openai-completions` is optional for OpenAI-compatible payloads | + | `models.providers.minimax.apiKey` | MiniMax API key (`MINIMAX_API_KEY`) | + | `models.providers.minimax.models` | Define `id`, `name`, `reasoning`, `contextWindow`, `maxTokens`, `cost` | + | `agents.defaults.models` | Alias models you want in the allowlist | + | `models.mode` | Keep `merge` if you want to add MiniMax alongside built-ins | + -Authenticate with the explicit regional OAuth choice: + + On `api: "anthropic-messages"`, OpenClaw injects `thinking: { type: "disabled" }` unless thinking is already explicitly set in params/config. -```bash -openclaw onboard --auth-choice minimax-global-oauth -# or -openclaw onboard --auth-choice minimax-cn-oauth -``` + This prevents MiniMax's streaming endpoint from emitting `reasoning_content` in OpenAI-style delta chunks, which would leak internal reasoning into visible output. -Choice mapping: + -- `minimax-global-oauth`: International users (`api.minimax.io`) -- `minimax-cn-oauth`: Users in China (`api.minimaxi.com`) + + `/fast on` or `params.fastMode: true` rewrites `MiniMax-M2.7` to `MiniMax-M2.7-highspeed` on the Anthropic-compatible stream path. + -See the MiniMax plugin package README in the OpenClaw repo for details. + + **Best for:** keep your strongest latest-generation model as primary, fail over to MiniMax M2.7. Example below uses Opus as a concrete primary; swap to your preferred latest-gen primary model. -### MiniMax M2.7 (API key) - -**Best for:** hosted MiniMax with Anthropic-compatible API. - -Configure via CLI: - -- Interactive onboarding: - -```bash -openclaw onboard --auth-choice minimax-global-api -# or -openclaw onboard --auth-choice minimax-cn-api -``` - -- `minimax-global-api`: International users (`api.minimax.io`) -- `minimax-cn-api`: Users in China (`api.minimaxi.com`) - -```json5 -{ - env: { MINIMAX_API_KEY: "sk-..." }, - agents: { defaults: { model: { primary: "minimax/MiniMax-M2.7" } } }, - models: { - mode: "merge", - providers: { - minimax: { - baseUrl: "https://api.minimax.io/anthropic", - apiKey: "${MINIMAX_API_KEY}", - api: "anthropic-messages", - models: [ - { - id: "MiniMax-M2.7", - name: "MiniMax M2.7", - reasoning: true, - input: ["text", "image"], - cost: { input: 0.3, output: 1.2, cacheRead: 0.06, cacheWrite: 0.375 }, - contextWindow: 204800, - maxTokens: 131072, + ```json5 + { + env: { MINIMAX_API_KEY: "sk-..." 
}, + agents: { + defaults: { + models: { + "anthropic/claude-opus-4-6": { alias: "primary" }, + "minimax/MiniMax-M2.7": { alias: "minimax" }, }, - { - id: "MiniMax-M2.7-highspeed", - name: "MiniMax M2.7 Highspeed", - reasoning: true, - input: ["text", "image"], - cost: { input: 0.6, output: 2.4, cacheRead: 0.06, cacheWrite: 0.375 }, - contextWindow: 204800, - maxTokens: 131072, + model: { + primary: "anthropic/claude-opus-4-6", + fallbacks: ["minimax/MiniMax-M2.7"], }, - ], + }, }, - }, - }, -} -``` + } + ``` -On the Anthropic-compatible streaming path, OpenClaw now disables MiniMax -thinking by default unless you explicitly set `thinking` yourself. MiniMax's -streaming endpoint emits `reasoning_content` in OpenAI-style delta chunks -instead of native Anthropic thinking blocks, which can leak internal reasoning -into visible output if left enabled implicitly. + -### MiniMax M2.7 as fallback (example) - -**Best for:** keep your strongest latest-generation model as primary, fail over to MiniMax M2.7. -Example below uses Opus as a concrete primary; swap to your preferred latest-gen primary model. - -```json5 -{ - env: { MINIMAX_API_KEY: "sk-..." }, - agents: { - defaults: { - models: { - "anthropic/claude-opus-4-6": { alias: "primary" }, - "minimax/MiniMax-M2.7": { alias: "minimax" }, - }, - model: { - primary: "anthropic/claude-opus-4-6", - fallbacks: ["minimax/MiniMax-M2.7"], - }, - }, - }, -} -``` - -## Configure via `openclaw configure` - -Use the interactive config wizard to set MiniMax without editing JSON: - -1. Run `openclaw configure`. -2. Select **Model/auth**. -3. Choose a **MiniMax** auth option. -4. Pick your default model when prompted. - -Current MiniMax auth choices in the wizard/CLI: - -- `minimax-global-oauth` -- `minimax-cn-oauth` -- `minimax-global-api` -- `minimax-cn-api` - -## Configuration options - -- `models.providers.minimax.baseUrl`: prefer `https://api.minimax.io/anthropic` (Anthropic-compatible); `https://api.minimax.io/v1` is optional for OpenAI-compatible payloads. -- `models.providers.minimax.api`: prefer `anthropic-messages`; `openai-completions` is optional for OpenAI-compatible payloads. -- `models.providers.minimax.apiKey`: MiniMax API key (`MINIMAX_API_KEY`). -- `models.providers.minimax.models`: define `id`, `name`, `reasoning`, `contextWindow`, `maxTokens`, `cost`. -- `agents.defaults.models`: alias models you want in the allowlist. -- `models.mode`: keep `merge` if you want to add MiniMax alongside built-ins. + + - Coding Plan usage API: `https://api.minimaxi.com/v1/api/openplatform/coding_plan/remains` (requires a coding plan key). + - OpenClaw normalizes MiniMax coding-plan usage to the same `% left` display used by other providers. MiniMax's raw `usage_percent` / `usagePercent` fields are remaining quota, not consumed quota, so OpenClaw inverts them. Count-based fields win when present. + - When the API returns `model_remains`, OpenClaw prefers the chat-model entry, derives the window label from `start_time` / `end_time` when needed, and includes the selected model name in the plan label so coding-plan windows are easier to distinguish. + - Usage snapshots treat `minimax`, `minimax-cn`, and `minimax-portal` as the same MiniMax quota surface, and prefer stored MiniMax OAuth before falling back to Coding Plan key env vars. 
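A minimal manual check of the raw endpoint (assuming the coding-plan key is sent as a bearer token -- adjust the header if your key setup expects a different scheme):

```bash
# Assumption: bearer auth with the coding-plan key
curl -s "https://api.minimaxi.com/v1/api/openplatform/coding_plan/remains" \
  -H "Authorization: Bearer $MINIMAX_API_KEY"
```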
+ + ## Notes @@ -284,56 +400,67 @@ Current MiniMax auth choices in the wizard/CLI: - OAuth setup: `minimax-portal/` - Default chat model: `MiniMax-M2.7` - Alternate chat model: `MiniMax-M2.7-highspeed` -- On `api: "anthropic-messages"`, OpenClaw injects - `thinking: { type: "disabled" }` unless thinking is already explicitly set in - params/config. -- `/fast on` or `params.fastMode: true` rewrites `MiniMax-M2.7` to - `MiniMax-M2.7-highspeed` on the Anthropic-compatible stream path. -- Onboarding and direct API-key setup write explicit model definitions with - `input: ["text", "image"]` for both M2.7 variants -- The bundled provider catalog currently exposes the chat refs as text-only - metadata until explicit MiniMax provider config exists -- Coding Plan usage API: `https://api.minimaxi.com/v1/api/openplatform/coding_plan/remains` (requires a coding plan key). -- OpenClaw normalizes MiniMax coding-plan usage to the same `% left` display - used by other providers. MiniMax's raw `usage_percent` / `usagePercent` - fields are remaining quota, not consumed quota, so OpenClaw inverts them. - Count-based fields win when present. When the API returns `model_remains`, - OpenClaw prefers the chat-model entry, derives the window label from - `start_time` / `end_time` when needed, and includes the selected model name - in the plan label so coding-plan windows are easier to distinguish. -- Usage snapshots treat `minimax`, `minimax-cn`, and `minimax-portal` as the - same MiniMax quota surface, and prefer stored MiniMax OAuth before falling - back to Coding Plan key env vars. -- Update pricing values in `models.json` if you need exact cost tracking. -- Referral link for MiniMax Coding Plan (10% off): [https://platform.minimax.io/subscribe/coding-plan?code=DbXJTRClnb&source=link](https://platform.minimax.io/subscribe/coding-plan?code=DbXJTRClnb&source=link) -- See [/concepts/model-providers](/concepts/model-providers) for provider rules. -- Use `openclaw models list` to confirm the current provider id, then switch with - `openclaw models set minimax/MiniMax-M2.7` or - `openclaw models set minimax-portal/MiniMax-M2.7`. +- Onboarding and direct API-key setup write explicit model definitions with `input: ["text", "image"]` for both M2.7 variants +- The bundled provider catalog currently exposes the chat refs as text-only metadata until explicit MiniMax provider config exists +- Update pricing values in `models.json` if you need exact cost tracking +- Use `openclaw models list` to confirm the current provider id, then switch with `openclaw models set minimax/MiniMax-M2.7` or `openclaw models set minimax-portal/MiniMax-M2.7` + + +Referral link for MiniMax Coding Plan (10% off): [MiniMax Coding Plan](https://platform.minimax.io/subscribe/coding-plan?code=DbXJTRClnb&source=link) + + + +See [Model providers](/concepts/model-providers) for provider rules. + ## Troubleshooting -### "Unknown model: minimax/MiniMax-M2.7" + + + This usually means the **MiniMax provider is not configured** (no matching provider entry and no MiniMax auth profile/env key found). A fix for this detection is in **2026.1.12**. Fix by: -This usually means the **MiniMax provider isn’t configured** (no matching -provider entry and no MiniMax auth profile/env key found). A fix for this -detection is in **2026.1.12**. Fix by: + - Upgrading to **2026.1.12** (or run from source `main`), then restarting the gateway. 
+ - Running `openclaw configure` and selecting a **MiniMax** auth option, or + - Adding the matching `models.providers.minimax` or `models.providers.minimax-portal` block manually, or + - Setting `MINIMAX_API_KEY`, `MINIMAX_OAUTH_TOKEN`, or a MiniMax auth profile so the matching provider can be injected. -- Upgrading to **2026.1.12** (or run from source `main`), then restarting the gateway. -- Running `openclaw configure` and selecting a **MiniMax** auth option, or -- Adding the matching `models.providers.minimax` or - `models.providers.minimax-portal` block manually, or -- Setting `MINIMAX_API_KEY`, `MINIMAX_OAUTH_TOKEN`, or a MiniMax auth profile - so the matching provider can be injected. + Make sure the model id is **case-sensitive**: -Make sure the model id is **case‑sensitive**: + - API-key path: `minimax/MiniMax-M2.7` or `minimax/MiniMax-M2.7-highspeed` + - OAuth path: `minimax-portal/MiniMax-M2.7` or `minimax-portal/MiniMax-M2.7-highspeed` -- API-key path: `minimax/MiniMax-M2.7` or `minimax/MiniMax-M2.7-highspeed` -- OAuth path: `minimax-portal/MiniMax-M2.7` or - `minimax-portal/MiniMax-M2.7-highspeed` + Then recheck with: -Then recheck with: + ```bash + openclaw models list + ``` -```bash -openclaw models list -``` + + + + +More help: [Troubleshooting](/help/troubleshooting) and [FAQ](/help/faq). + + +## Related + + + + Choosing providers, model refs, and failover behavior. + + + Shared image tool parameters and provider selection. + + + Shared music tool parameters and provider selection. + + + Shared video tool parameters and provider selection. + + + Web search configuration via MiniMax Coding Plan. + + + General troubleshooting and FAQ. + + diff --git a/docs/providers/ollama.md b/docs/providers/ollama.md index c45dfb0fe73..d7ef535ad69 100644 --- a/docs/providers/ollama.md +++ b/docs/providers/ollama.md @@ -14,122 +14,154 @@ Ollama is a local LLM runtime that makes it easy to run open-source models on yo **Remote Ollama users**: Do not use the `/v1` OpenAI-compatible URL (`http://host:11434/v1`) with OpenClaw. This breaks tool calling and models may output raw tool JSON as plain text. Use the native Ollama API URL instead: `baseUrl: "http://host:11434"` (no `/v1`). -## Quick start +## Getting started -### Onboarding (recommended) +Choose your preferred setup method and mode. -The fastest way to set up Ollama is through onboarding: + + + **Best for:** fastest path to a working Ollama setup with automatic model discovery. -```bash -openclaw onboard -``` + + + ```bash + openclaw onboard + ``` -Select **Ollama** from the provider list. Onboarding will: + Select **Ollama** from the provider list. + + + - **Cloud + Local** — cloud-hosted models and local models together + - **Local** — local models only -1. Ask for the Ollama base URL where your instance can be reached (default `http://127.0.0.1:11434`). -2. Let you choose **Cloud + Local** (cloud models and local models) or **Local** (local models only). -3. Open a browser sign-in flow if you choose **Cloud + Local** and are not signed in to ollama.com. -4. Discover available models and suggest defaults. -5. Auto-pull the selected model if it is not available locally. + If you choose **Cloud + Local** and are not signed in to ollama.com, onboarding opens a browser sign-in flow. + + + Onboarding discovers available models and suggests defaults. It auto-pulls the selected model if it is not available locally. 
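If you would rather fetch the suggested model ahead of time, the manual equivalent is:

```bash
# gemma4 is the current suggested local default
ollama pull gemma4
```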
+ + + ```bash + openclaw models list --provider ollama + ``` + + -Non-interactive mode is also supported: + ### Non-interactive mode -```bash -openclaw onboard --non-interactive \ - --auth-choice ollama \ - --accept-risk -``` + ```bash + openclaw onboard --non-interactive \ + --auth-choice ollama \ + --accept-risk + ``` -Optionally specify a custom base URL or model: + Optionally specify a custom base URL or model: -```bash -openclaw onboard --non-interactive \ - --auth-choice ollama \ - --custom-base-url "http://ollama-host:11434" \ - --custom-model-id "qwen3.5:27b" \ - --accept-risk -``` + ```bash + openclaw onboard --non-interactive \ + --auth-choice ollama \ + --custom-base-url "http://ollama-host:11434" \ + --custom-model-id "qwen3.5:27b" \ + --accept-risk + ``` -### Manual setup + -1. Install Ollama: [https://ollama.com/download](https://ollama.com/download) + + **Best for:** full control over installation, model pulls, and config. -2. Pull a local model if you want local inference: + + + Download from [ollama.com/download](https://ollama.com/download). + + + ```bash + ollama pull gemma4 + # or + ollama pull gpt-oss:20b + # or + ollama pull llama3.3 + ``` + + + If you want cloud models too: -```bash -ollama pull gemma4 -# or -ollama pull gpt-oss:20b -# or -ollama pull llama3.3 -``` + ```bash + ollama signin + ``` + + + Set any value for the API key (Ollama does not require a real key): -3. If you want cloud models too, sign in: + ```bash + # Set environment variable + export OLLAMA_API_KEY="ollama-local" -```bash -ollama signin -``` + # Or configure in your config file + openclaw config set models.providers.ollama.apiKey "ollama-local" + ``` + + + ```bash + openclaw models list + openclaw models set ollama/gemma4 + ``` -4. Run onboarding and choose `Ollama`: + Or set the default in config: -```bash -openclaw onboard -``` + ```json5 + { + agents: { + defaults: { + model: { primary: "ollama/gemma4" }, + }, + }, + } + ``` + + -- `Local`: local models only -- `Cloud + Local`: local models plus cloud models -- Cloud models such as `kimi-k2.5:cloud`, `minimax-m2.7:cloud`, and `glm-5.1:cloud` do **not** require a local `ollama pull` + + -OpenClaw currently suggests: +## Cloud models -- local default: `gemma4` -- cloud defaults: `kimi-k2.5:cloud`, `minimax-m2.7:cloud`, `glm-5.1:cloud` + + + Cloud models let you run cloud-hosted models alongside your local models. Examples include `kimi-k2.5:cloud`, `minimax-m2.7:cloud`, and `glm-5.1:cloud` -- these do **not** require a local `ollama pull`. -5. If you prefer manual setup, enable Ollama for OpenClaw directly (any value works; Ollama doesn't require a real key): + Select **Cloud + Local** mode during setup. The wizard checks whether you are signed in and opens a browser sign-in flow when needed. If authentication cannot be verified, the wizard falls back to local model defaults. -```bash -# Set environment variable -export OLLAMA_API_KEY="ollama-local" + You can also sign in directly at [ollama.com/signin](https://ollama.com/signin). -# Or configure in your config file -openclaw config set models.providers.ollama.apiKey "ollama-local" -``` + OpenClaw currently suggests these cloud defaults: `kimi-k2.5:cloud`, `minimax-m2.7:cloud`, `glm-5.1:cloud`. -6. Inspect or switch models: + -```bash -openclaw models list -openclaw models set ollama/gemma4 -``` + + In local-only mode, OpenClaw discovers models from the local Ollama instance. No cloud sign-in is needed. -7. 
Or set the default in config: + OpenClaw currently suggests `gemma4` as the local default. -```json5 -{ - agents: { - defaults: { - model: { primary: "ollama/gemma4" }, - }, - }, -} -``` + + ## Model discovery (implicit provider) -When you set `OLLAMA_API_KEY` (or an auth profile) and **do not** define `models.providers.ollama`, OpenClaw discovers models from the local Ollama instance at `http://127.0.0.1:11434`: +When you set `OLLAMA_API_KEY` (or an auth profile) and **do not** define `models.providers.ollama`, OpenClaw discovers models from the local Ollama instance at `http://127.0.0.1:11434`. -- Queries `/api/tags` -- Uses best-effort `/api/show` lookups to read `contextWindow` and detect capabilities (including vision) when available -- Models with a `vision` capability reported by `/api/show` are marked as image-capable (`input: ["text", "image"]`), so OpenClaw auto-injects images into the prompt for those models -- Marks `reasoning` with a model-name heuristic (`r1`, `reasoning`, `think`) -- Sets `maxTokens` to the default Ollama max-token cap used by OpenClaw -- Sets all costs to `0` +| Behavior | Detail | +| -------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| Catalog query | Queries `/api/tags` | +| Capability detection | Uses best-effort `/api/show` lookups to read `contextWindow` and detect capabilities (including vision) | +| Vision models | Models with a `vision` capability reported by `/api/show` are marked as image-capable (`input: ["text", "image"]`), so OpenClaw auto-injects images into the prompt | +| Reasoning detection | Marks `reasoning` with a model-name heuristic (`r1`, `reasoning`, `think`) | +| Token limits | Sets `maxTokens` to the default Ollama max-token cap used by OpenClaw | +| Costs | Sets all costs to `0` | This avoids manual model entries while keeping the catalog aligned with the local Ollama instance. -To see what models are available: - ```bash +# See what models are available ollama list openclaw models list ``` @@ -142,74 +174,79 @@ ollama pull mistral The new model will be automatically discovered and available to use. -If you set `models.providers.ollama` explicitly, auto-discovery is skipped and you must define models manually (see below). + +If you set `models.providers.ollama` explicitly, auto-discovery is skipped and you must define models manually. See the explicit config section below. + ## Configuration -### Basic setup (implicit discovery) + + + The simplest way to enable Ollama is via environment variable: -The simplest way to enable Ollama is via environment variable: + ```bash + export OLLAMA_API_KEY="ollama-local" + ``` -```bash -export OLLAMA_API_KEY="ollama-local" -``` + + If `OLLAMA_API_KEY` is set, you can omit `apiKey` in the provider entry and OpenClaw will fill it for availability checks. + -### Explicit setup (manual models) + -Use explicit config when: + + Use explicit config when Ollama runs on another host/port, you want to force specific context windows or model lists, or you want fully manual model definitions. -- Ollama runs on another host/port. -- You want to force specific context windows or model lists. -- You want fully manual model definitions. 
- -```json5 -{ - models: { - providers: { - ollama: { - baseUrl: "http://ollama-host:11434", - apiKey: "ollama-local", - api: "ollama", - models: [ - { - id: "gpt-oss:20b", - name: "GPT-OSS 20B", - reasoning: false, - input: ["text"], - cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 }, - contextWindow: 8192, - maxTokens: 8192 * 10 + ```json5 + { + models: { + providers: { + ollama: { + baseUrl: "http://ollama-host:11434", + apiKey: "ollama-local", + api: "ollama", + models: [ + { + id: "gpt-oss:20b", + name: "GPT-OSS 20B", + reasoning: false, + input: ["text"], + cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 }, + contextWindow: 8192, + maxTokens: 8192 * 10 + } + ] } - ] + } } } - } -} -``` + ``` -If `OLLAMA_API_KEY` is set, you can omit `apiKey` in the provider entry and OpenClaw will fill it for availability checks. + -### Custom base URL (explicit config) + + If Ollama is running on a different host or port (explicit config disables auto-discovery, so define models manually): -If Ollama is running on a different host or port (explicit config disables auto-discovery, so define models manually): - -```json5 -{ - models: { - providers: { - ollama: { - apiKey: "ollama-local", - baseUrl: "http://ollama-host:11434", // No /v1 - use native Ollama API URL - api: "ollama", // Set explicitly to guarantee native tool-calling behavior + ```json5 + { + models: { + providers: { + ollama: { + apiKey: "ollama-local", + baseUrl: "http://ollama-host:11434", // No /v1 - use native Ollama API URL + api: "ollama", // Set explicitly to guarantee native tool-calling behavior + }, + }, }, - }, - }, -} -``` + } + ``` - -Do not add `/v1` to the URL. The `/v1` path uses OpenAI-compatible mode, where tool calling is not reliable. Use the base Ollama URL without a path suffix. - + + Do not add `/v1` to the URL. The `/v1` path uses OpenAI-compatible mode, where tool calling is not reliable. Use the base Ollama URL without a path suffix. + + + + ### Model selection @@ -228,26 +265,17 @@ Once configured, all your Ollama models are available: } ``` -## Cloud models - -Cloud models let you run cloud-hosted models (for example `kimi-k2.5:cloud`, `minimax-m2.7:cloud`, `glm-5.1:cloud`) alongside your local models. - -To use cloud models, select **Cloud + Local** mode during setup. The wizard checks whether you are signed in and opens a browser sign-in flow when needed. If authentication cannot be verified, the wizard falls back to local model defaults. - -You can also sign in directly at [ollama.com/signin](https://ollama.com/signin). - ## Ollama Web Search -OpenClaw also supports **Ollama Web Search** as a bundled `web_search` -provider. +OpenClaw supports **Ollama Web Search** as a bundled `web_search` provider. -- It uses your configured Ollama host (`models.providers.ollama.baseUrl` when - set, otherwise `http://127.0.0.1:11434`). -- It is key-free. -- It requires Ollama to be running and signed in with `ollama signin`. 
-## Advanced
+## Advanced configuration

-### Reasoning models
+
+
+    
+      **Tool calling is not reliable in OpenAI-compatible mode.** Use this mode only if you need OpenAI format for a proxy and do not depend on native tool calling behavior.
+    

-OpenClaw treats models with names such as `deepseek-r1`, `reasoning`, or `think` as reasoning-capable by default:
-
-```bash
-ollama pull deepseek-r1:32b
-```
-
-### Model Costs
-
-Ollama is free and runs locally, so all model costs are set to $0.
-
-### Streaming Configuration
-
-OpenClaw's Ollama integration uses the **native Ollama API** (`/api/chat`) by default, which fully supports streaming and tool calling simultaneously. No special configuration is needed.
-
-#### Legacy OpenAI-Compatible Mode
-
-
-**Tool calling is not reliable in OpenAI-compatible mode.** Use this mode only if you need OpenAI format for a proxy and do not depend on native tool calling behavior.
-
-
-If you need to use the OpenAI-compatible endpoint instead (e.g., behind a proxy that only supports OpenAI format), set `api: "openai-completions"` explicitly:
+    If you need to use the OpenAI-compatible endpoint instead (for example, behind a proxy that only supports OpenAI format), set `api: "openai-completions"` explicitly:

-```json5
-{
-  models: {
-    providers: {
-      ollama: {
-        baseUrl: "http://ollama-host:11434/v1",
-        api: "openai-completions",
-        injectNumCtxForOpenAICompat: true, // default: true
-        apiKey: "ollama-local",
-        models: [...]
-      }
-    }
-  }
-}
-```
+    ```json5
+    {
+      models: {
+        providers: {
+          ollama: {
+            baseUrl: "http://ollama-host:11434/v1",
+            api: "openai-completions",
+            injectNumCtxForOpenAICompat: true, // default: true
+            apiKey: "ollama-local",
+            models: [...]
+          }
+        }
+      }
+    }
+    ```

-This mode may not support streaming + tool calling simultaneously. You may need to disable streaming with `params: { streaming: false }` in model config.
+    This mode may not support streaming and tool calling simultaneously. You may need to disable streaming with `params: { streaming: false }` in model config.

-When `api: "openai-completions"` is used with Ollama, OpenClaw injects `options.num_ctx` by default so Ollama does not silently fall back to a 4096 context window. If your proxy/upstream rejects unknown `options` fields, disable this behavior:
+    When `api: "openai-completions"` is used with Ollama, OpenClaw injects `options.num_ctx` by default so Ollama does not silently fall back to a 4096 context window. If your proxy/upstream rejects unknown `options` fields, disable this behavior:

-```json5
-{
-  models: {
-    providers: {
-      ollama: {
-        baseUrl: "http://ollama-host:11434/v1",
-        api: "openai-completions",
-        injectNumCtxForOpenAICompat: false,
-        apiKey: "ollama-local",
-        models: [...]
-      }
-    }
-  }
-}
-```
+    ```json5
+    {
+      models: {
+        providers: {
+          ollama: {
+            baseUrl: "http://ollama-host:11434/v1",
+            api: "openai-completions",
+            injectNumCtxForOpenAICompat: false,
+            apiKey: "ollama-local",
+            models: [...]
+          }
+        }
+      }
+    }
+    ```
+  

+  
+    For auto-discovered models, OpenClaw uses the context window reported by Ollama when available; otherwise it falls back to the default Ollama context window used by OpenClaw.
+
+    You can override `contextWindow` and `maxTokens` in explicit provider config:
+
+    ```json5
+    {
+      models: {
+        providers: {
+          ollama: {
+            models: [
+              {
+                id: "llama3.3",
+                contextWindow: 131072,
+                maxTokens: 65536,
+              }
+            ]
+          }
+        }
+      }
+    }
+    ```
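+    Before overriding, you can check what the model itself reports. A quick look via the Ollama CLI (the exact output format varies by Ollama version):
+
+    ```bash
+    # Prints model details, including the reported context length
+    ollama show llama3.3
+    ```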
+  

+  
+    OpenClaw treats models with names such as `deepseek-r1`, `reasoning`, or `think` as reasoning-capable by default.
+
+    ```bash
+    ollama pull deepseek-r1:32b
+    ```
+
+    No additional configuration is needed; OpenClaw marks them automatically.
+  

+  
+    Ollama is free and runs locally, so all model costs are set to $0. This applies to both auto-discovered and manually defined models.
+  

+  
+    OpenClaw's Ollama integration uses the **native Ollama API** (`/api/chat`) by default, which fully supports streaming and tool calling simultaneously. No special configuration is needed.
+
+    If you need to use the OpenAI-compatible endpoint, see the "Legacy OpenAI-compatible mode" section above. Streaming and tool calling may not work simultaneously in that mode.
+  
+

 ## Troubleshooting

-### Ollama not detected
+
+
+    Make sure Ollama is running, that you set `OLLAMA_API_KEY` (or an auth profile), and that you did **not** define an explicit `models.providers.ollama` entry:

-Make sure Ollama is running and that you set `OLLAMA_API_KEY` (or an auth profile), and that you did **not** define an explicit `models.providers.ollama` entry:
-
-```bash
-ollama serve
-```
+    ```bash
+    ollama serve
+    ```

-And that the API is accessible:
+    Verify that the API is accessible:

-```bash
-curl http://localhost:11434/api/tags
-```
+    ```bash
+    curl http://localhost:11434/api/tags
+    ```
+  

-### No models available
+
+    If your model is not listed, either pull the model locally or define it explicitly in `models.providers.ollama`.

-If your model is not listed, either:
-
-- Pull the model locally, or
-- Define the model explicitly in `models.providers.ollama`.
-
-To add models:
-
-```bash
-ollama list # See what's installed
-ollama pull gemma4
-ollama pull gpt-oss:20b
-ollama pull llama3.3 # Or another model
-```
+    ```bash
+    ollama list          # See what's installed
+    ollama pull gemma4
+    ollama pull gpt-oss:20b
+    ollama pull llama3.3 # Or another model
+    ```
+  

-### Connection refused
+
+    Check that Ollama is running on the correct port:

-Check that Ollama is running on the correct port:
-
-```bash
-# Check if Ollama is running
-ps aux | grep ollama
-
-# Or restart Ollama
-ollama serve
-```
+    ```bash
+    # Check if Ollama is running
+    ps aux | grep ollama
+
+    # Or restart Ollama
+    ollama serve
+    ```
+  
+

+
+More help: [Troubleshooting](/help/troubleshooting) and [FAQ](/help/faq).
+

-## See Also
+## Related

-- [Model Providers](/concepts/model-providers) - Overview of all providers
-- [Model Selection](/concepts/models) - How to choose models
-- [Configuration](/gateway/configuration) - Full config reference
+
+  
+    Overview of all providers, model refs, and failover behavior.
+  
+  
+    How to choose and configure models.
+  
+  
+    Full setup and behavior details for Ollama-powered web search.
+  
+  
+    Full config reference.
+  
+

diff --git a/docs/providers/venice.md b/docs/providers/venice.md
index 6f3c4b9313d..fbb47fe2f3b 100644
--- a/docs/providers/venice.md
+++ b/docs/providers/venice.md
@@ -6,11 +6,9 @@ read_when:
 title: "Venice AI"
 ---

-# Venice AI (Venice highlight)
+# Venice AI

-**Venice** is our highlight Venice setup for privacy-first inference with optional anonymized access to proprietary models.
-
-Venice AI provides privacy-focused AI inference with support for uncensored models and access to major proprietary models through their anonymized proxy. All inference is private by default—no training on your data, no logging.
+Venice AI provides **privacy-focused AI inference** with support for uncensored models and access to major proprietary models through their anonymized proxy. All inference is private by default — no training on your data, no logging.

 ## Why Venice in OpenClaw

@@ -19,7 +17,7 @@ Venice AI provides privacy-focused AI inference with support for uncensored mode
 - **Anonymized access** to proprietary models (Opus/GPT/Gemini) when quality matters.
 - OpenAI-compatible `/v1` endpoints.

-## Privacy Modes
+## Privacy modes

 Venice offers two privacy levels — understanding this is key to choosing your model:

@@ -28,61 +26,67 @@ Venice offers two privacy levels — understanding this is key to choosing your
 | **Private** | Fully private. Prompts/responses are **never stored or logged**. Ephemeral. | Llama, Qwen, DeepSeek, Kimi, MiniMax, Venice Uncensored, etc. |
 | **Anonymized** | Proxied through Venice with metadata stripped. The underlying provider (OpenAI, Anthropic, Google, xAI) sees anonymized requests. | Claude, GPT, Gemini, Grok |

+
+Anonymized models are **not** fully private. Venice strips metadata before forwarding, but the underlying provider (OpenAI, Anthropic, Google, xAI) still processes the request. Choose **Private** models when full privacy is required.
+

 ## Features

 - **Privacy-focused**: Choose between "private" (fully private) and "anonymized" (proxied) modes
 - **Uncensored models**: Access to models without content restrictions
 - **Major model access**: Use Claude, GPT, Gemini, and Grok via Venice's anonymized proxy
 - **OpenAI-compatible API**: Standard `/v1` endpoints for easy integration
-- **Streaming**: ✅ Supported on all models
-- **Function calling**: ✅ Supported on select models (check model capabilities)
-- **Vision**: ✅ Supported on models with vision capability
+- **Streaming**: Supported on all models
+- **Function calling**: Supported on select models (check model capabilities)
+- **Vision**: Supported on models with vision capability
 - **No hard rate limits**: Fair-use throttling may apply for extreme usage

-## Setup
+## Getting started

-### 1. Get API Key
+
+  
+    1. Sign up at [venice.ai](https://venice.ai)
+    2. Go to **Settings > API Keys > Create new key**
+    3. Copy your API key (format: `vapi_xxxxxxxxxxxx`)
+  
+  
+    Choose your preferred setup method:

-1. Sign up at [venice.ai](https://venice.ai)
-2. Go to **Settings → API Keys → Create new key**
-3. Copy your API key (format: `vapi_xxxxxxxxxxxx`)
-
-### 2. Configure OpenClaw
-
-**Option A: Environment Variable**
-
-```bash
-export VENICE_API_KEY="vapi_xxxxxxxxxxxx"
-```
-
-**Option B: Interactive Setup (Recommended)**
-
-```bash
-openclaw onboard --auth-choice venice-api-key
-```
-
-This will:
-
-1. Prompt for your API key (or use existing `VENICE_API_KEY`)
-2. Show all available Venice models
-3. Let you pick your default model
-4. Configure the provider automatically
-
-**Option C: Non-interactive**
-
-```bash
-openclaw onboard --non-interactive \
-  --auth-choice venice-api-key \
-  --venice-api-key "vapi_xxxxxxxxxxxx"
-```
-
-### 3. Verify Setup
-
-```bash
-openclaw agent --model venice/kimi-k2-5 --message "Hello, are you working?"
-```
+
+      
+        ```bash
+        openclaw onboard --auth-choice venice-api-key
+        ```
+
+        This will:
+        1. Prompt for your API key (or use existing `VENICE_API_KEY`)
+        2. Show all available Venice models
+        3. Let you pick your default model
+        4. Configure the provider automatically
+      
+      
+        ```bash
+        export VENICE_API_KEY="vapi_xxxxxxxxxxxx"
+        ```
+      
+      
+        ```bash
+        openclaw onboard --non-interactive \
+          --auth-choice venice-api-key \
+          --venice-api-key "vapi_xxxxxxxxxxxx"
+        ```
+      
+    
+  
+  
+    ```bash
+    openclaw agent --model venice/kimi-k2-5 --message "Hello, are you working?"
+    ```
+  
+

-## Model Selection
+## Model selection

 After setup, OpenClaw shows all available Venice models. Pick based on your needs:

@@ -104,13 +108,10 @@ List all available models:
 openclaw models list | grep venice
 ```

-## Configure via `openclaw configure`
+You can also run `openclaw configure`, select **Model/auth**, and choose **Venice AI**.

-1. Run `openclaw configure`
-2. Select **Model/auth**
-3. Choose **Venice AI**
-
-## Which Model Should I Use?
+
+Use the table below to pick the right model for your use case.
 | Use Case                   | Recommended Model                | Why                                          |
 | -------------------------- | -------------------------------- | -------------------------------------------- |
 | **Complex private tasks**  | `deepseek-v3.2`                  | Strong reasoning, but no Venice tool support |
 | **Uncensored**             | `venice-uncensored`              | No content restrictions                      |
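+If full privacy is a hard requirement, pin a Private model as the default. One way is via the config CLI; the dotted path mirrors the `agents.defaults.model.primary` structure shown in the config example under Advanced configuration below:
+
+```bash
+# Make a fully private model the default for all agents
+openclaw config set agents.defaults.model.primary "venice/kimi-k2-5"
+```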
-## Available Models (41 Total)
+
+## Available models (41 total)

-### Private Models (26) - Fully Private, No Logging
-
-| Model ID | Name | Context | Features |
-| -------------------------------------- | ----------------------------------- | ------- | -------------------------- |
-| `kimi-k2-5` | Kimi K2.5 | 256k | Default, reasoning, vision |
-| `kimi-k2-thinking` | Kimi K2 Thinking | 256k | Reasoning |
-| `llama-3.3-70b` | Llama 3.3 70B | 128k | General |
-| `llama-3.2-3b` | Llama 3.2 3B | 128k | General |
-| `hermes-3-llama-3.1-405b` | Hermes 3 Llama 3.1 405B | 128k | General, tools disabled |
-| `qwen3-235b-a22b-thinking-2507` | Qwen3 235B Thinking | 128k | Reasoning |
-| `qwen3-235b-a22b-instruct-2507` | Qwen3 235B Instruct | 128k | General |
-| `qwen3-coder-480b-a35b-instruct` | Qwen3 Coder 480B | 256k | Coding |
-| `qwen3-coder-480b-a35b-instruct-turbo` | Qwen3 Coder 480B Turbo | 256k | Coding |
-| `qwen3-5-35b-a3b` | Qwen3.5 35B A3B | 256k | Reasoning, vision |
-| `qwen3-next-80b` | Qwen3 Next 80B | 256k | General |
-| `qwen3-vl-235b-a22b` | Qwen3 VL 235B (Vision) | 256k | Vision |
-| `qwen3-4b` | Venice Small (Qwen3 4B) | 32k | Fast, reasoning |
-| `deepseek-v3.2` | DeepSeek V3.2 | 160k | Reasoning, tools disabled |
-| `venice-uncensored` | Venice Uncensored (Dolphin-Mistral) | 32k | Uncensored, tools disabled |
-| `mistral-31-24b` | Venice Medium (Mistral) | 128k | Vision |
-| `google-gemma-3-27b-it` | Google Gemma 3 27B Instruct | 198k | Vision |
-| `openai-gpt-oss-120b` | OpenAI GPT OSS 120B | 128k | General |
-| `nvidia-nemotron-3-nano-30b-a3b` | NVIDIA Nemotron 3 Nano 30B | 128k | General |
-| `olafangensan-glm-4.7-flash-heretic` | GLM 4.7 Flash Heretic | 128k | Reasoning |
-| `zai-org-glm-4.6` | GLM 4.6 | 198k | General |
-| `zai-org-glm-4.7` | GLM 4.7 | 198k | Reasoning |
-| `zai-org-glm-4.7-flash` | GLM 4.7 Flash | 128k | Reasoning |
-| `zai-org-glm-5` | GLM 5 | 198k | Reasoning |
-| `minimax-m21` | MiniMax M2.1 | 198k | Reasoning |
-| `minimax-m25` | MiniMax M2.5 | 198k | Reasoning |
+
+
+    | Model ID                               | Name                                | Context | Features                   |
+    | -------------------------------------- | ----------------------------------- | ------- | -------------------------- |
+    | `kimi-k2-5`                            | Kimi K2.5                           | 256k    | Default, reasoning, vision |
+    | `kimi-k2-thinking`                     | Kimi K2 Thinking                    | 256k    | Reasoning                  |
+    | `llama-3.3-70b`                        | Llama 3.3 70B                       | 128k    | General                    |
+    | `llama-3.2-3b`                         | Llama 3.2 3B                        | 128k    | General                    |
+    | `hermes-3-llama-3.1-405b`              | Hermes 3 Llama 3.1 405B             | 128k    | General, tools disabled    |
+    | `qwen3-235b-a22b-thinking-2507`        | Qwen3 235B Thinking                 | 128k    | Reasoning                  |
+    | `qwen3-235b-a22b-instruct-2507`        | Qwen3 235B Instruct                 | 128k    | General                    |
+    | `qwen3-coder-480b-a35b-instruct`       | Qwen3 Coder 480B                    | 256k    | Coding                     |
+    | `qwen3-coder-480b-a35b-instruct-turbo` | Qwen3 Coder 480B Turbo              | 256k    | Coding                     |
+    | `qwen3-5-35b-a3b`                      | Qwen3.5 35B A3B                     | 256k    | Reasoning, vision          |
+    | `qwen3-next-80b`                       | Qwen3 Next 80B                      | 256k    | General                    |
+    | `qwen3-vl-235b-a22b`                   | Qwen3 VL 235B (Vision)              | 256k    | Vision                     |
+    | `qwen3-4b`                             | Venice Small (Qwen3 4B)             | 32k     | Fast, reasoning            |
+    | `deepseek-v3.2`                        | DeepSeek V3.2                       | 160k    | Reasoning, tools disabled  |
+    | `venice-uncensored`                    | Venice Uncensored (Dolphin-Mistral) | 32k     | Uncensored, tools disabled |
+    | `mistral-31-24b`                       | Venice Medium (Mistral)             | 128k    | Vision                     |
+    | `google-gemma-3-27b-it`                | Google Gemma 3 27B Instruct         | 198k    | Vision                     |
+    | `openai-gpt-oss-120b`                  | OpenAI GPT OSS 120B                 | 128k    | General                    |
+    | `nvidia-nemotron-3-nano-30b-a3b`       | NVIDIA Nemotron 3 Nano 30B          | 128k    | General                    |
+    | `olafangensan-glm-4.7-flash-heretic`   | GLM 4.7 Flash Heretic               | 128k    | Reasoning                  |
+    | `zai-org-glm-4.6`                      | GLM 4.6                             | 198k    | General                    |
+    | `zai-org-glm-4.7`                      | GLM 4.7                             | 198k    | Reasoning                  |
+    | `zai-org-glm-4.7-flash`                | GLM 4.7 Flash                       | 128k    | Reasoning                  |
+    | `zai-org-glm-5`                        | GLM 5                               | 198k    | Reasoning                  |
+    | `minimax-m21`                          | MiniMax M2.1                        | 198k    | Reasoning                  |
+    | `minimax-m25`                          | MiniMax M2.5                        | 198k    | Reasoning                  |
+  

-### Anonymized Models (15) - Via Venice Proxy
-
-| Model ID | Name | Context | Features |
-| ------------------------------- | ------------------------------ | ------- | ------------------------- |
-| `claude-opus-4-6` | Claude Opus 4.6 (via Venice) | 1M | Reasoning, vision |
-| `claude-opus-4-5` | Claude Opus 4.5 (via Venice) | 198k | Reasoning, vision |
-| `claude-sonnet-4-6` | Claude Sonnet 4.6 (via Venice) | 1M | Reasoning, vision |
-| `claude-sonnet-4-5` | Claude Sonnet 4.5 (via Venice) | 198k | Reasoning, vision |
-| `openai-gpt-54` | GPT-5.4 (via Venice) | 1M | Reasoning, vision |
-| `openai-gpt-53-codex` | GPT-5.3 Codex (via Venice) | 400k | Reasoning, vision, coding |
-| `openai-gpt-52` | GPT-5.2 (via Venice) | 256k | Reasoning |
-| `openai-gpt-52-codex` | GPT-5.2 Codex (via Venice) | 256k | Reasoning, vision, coding |
-| `openai-gpt-4o-2024-11-20` | GPT-4o (via Venice) | 128k | Vision |
-| `openai-gpt-4o-mini-2024-07-18` | GPT-4o Mini (via Venice) | 128k | Vision |
-| `gemini-3-1-pro-preview` | Gemini 3.1 Pro (via Venice) | 1M | Reasoning, vision |
-| `gemini-3-pro-preview` | Gemini 3 Pro (via Venice) | 198k | Reasoning, vision |
-| `gemini-3-flash-preview` | Gemini 3 Flash (via Venice) | 256k | Reasoning, vision |
-| `grok-41-fast` | Grok 4.1 Fast (via Venice) | 1M | Reasoning, vision |
-| `grok-code-fast-1` | Grok Code Fast 1 (via Venice) | 256k | Reasoning, coding |
+
+    | Model ID                        | Name                           | Context | Features                  |
+    | ------------------------------- | ------------------------------ | ------- | ------------------------- |
+    | `claude-opus-4-6`               | Claude Opus 4.6 (via Venice)   | 1M      | Reasoning, vision         |
+    | `claude-opus-4-5`               | Claude Opus 4.5 (via Venice)   | 198k    | Reasoning, vision         |
+    | `claude-sonnet-4-6`             | Claude Sonnet 4.6 (via Venice) | 1M      | Reasoning, vision         |
+    | `claude-sonnet-4-5`             | Claude Sonnet 4.5 (via Venice) | 198k    | Reasoning, vision         |
+    | `openai-gpt-54`                 | GPT-5.4 (via Venice)           | 1M      | Reasoning, vision         |
+    | `openai-gpt-53-codex`           | GPT-5.3 Codex (via Venice)     | 400k    | Reasoning, vision, coding |
+    | `openai-gpt-52`                 | GPT-5.2 (via Venice)           | 256k    | Reasoning                 |
+    | `openai-gpt-52-codex`           | GPT-5.2 Codex (via Venice)     | 256k    | Reasoning, vision, coding |
+    | `openai-gpt-4o-2024-11-20`      | GPT-4o (via Venice)            | 128k    | Vision                    |
+    | `openai-gpt-4o-mini-2024-07-18` | GPT-4o Mini (via Venice)       | 128k    | Vision                    |
+    | `gemini-3-1-pro-preview`        | Gemini 3.1 Pro (via Venice)    | 1M      | Reasoning, vision         |
+    | `gemini-3-pro-preview`          | Gemini 3 Pro (via Venice)      | 198k    | Reasoning, vision         |
+    | `gemini-3-flash-preview`        | Gemini 3 Flash (via Venice)    | 256k    | Reasoning, vision         |
+    | `grok-41-fast`                  | Grok 4.1 Fast (via Venice)     | 1M      | Reasoning, vision         |
+    | `grok-code-fast-1`              | Grok Code Fast 1 (via Venice)  | 256k    | Reasoning, coding         |
+  
+

-## Model Discovery
+## Model discovery

 OpenClaw automatically discovers models from the Venice API when `VENICE_API_KEY` is set. If the API is unreachable, it falls back to a static catalog.

 The `/models` endpoint is public (no auth needed for listing), but inference requires a valid API key.
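+You can check the live catalog yourself; the listing endpoint needs no key:
+
+```bash
+curl https://api.venice.ai/api/v1/models
+```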
-## Streaming & Tool Support
+## Streaming and tool support

-| Feature | Support |
-| -------------------- | ------------------------------------------------------- |
-| **Streaming** | ✅ All models |
-| **Function calling** | ✅ Most models (check `supportsFunctionCalling` in API) |
-| **Vision/Images** | ✅ Models marked with "Vision" feature |
-| **JSON mode** | ✅ Supported via `response_format` |
+| Feature              | Support                                               |
+| -------------------- | ----------------------------------------------------- |
+| **Streaming**        | All models                                            |
+| **Function calling** | Most models (check `supportsFunctionCalling` in API)  |
+| **Vision/Images**    | Models marked with "Vision" feature                   |
+| **JSON mode**        | Supported via `response_format`                       |
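+As a sketch of JSON mode against the OpenAI-compatible endpoint (the `response_format` value below follows the usual OpenAI shape; adjust it if Venice's payload differs):
+
+```bash
+curl https://api.venice.ai/api/v1/chat/completions \
+  -H "Authorization: Bearer $VENICE_API_KEY" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "kimi-k2-5",
+    "messages": [{"role": "user", "content": "Reply with a JSON object {\"ok\": true}"}],
+    "response_format": {"type": "json_object"}
+  }'
+```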
 ## Pricing

 Venice uses a credit-based system. Check [venice.ai/pricing](https://venice.ai/pricing) for current rates.

@@ -197,7 +202,7 @@ Venice uses a credit-based system. Check [venice.ai/pricing](https://venice.ai/p
 - **Private models**: Generally lower cost
 - **Anonymized models**: Similar to direct API pricing + small Venice fee

-## Comparison: Venice vs Direct API
+### Venice (anonymized) vs direct API

 | Aspect | Venice (Anonymized) | Direct API |
@@ -206,7 +211,7 @@ Venice uses a credit-based system. Check [venice.ai/pricing](https://venice.ai/p
 | **Features** | Most features supported | Full features |
 | **Billing** | Venice credits | Provider billing |

-## Usage Examples
+## Usage examples

 ```bash
 # Use the default private model
@@ -227,56 +232,77 @@ openclaw agent --model venice/qwen3-coder-480b-a35b-instruct --message "Refactor

 ## Troubleshooting

-### API key not recognized
+
+
+    ```bash
+    echo $VENICE_API_KEY
+    openclaw models list | grep venice
+    ```

-```bash
-echo $VENICE_API_KEY
-openclaw models list | grep venice
-```
+    Ensure the key starts with `vapi_`.

-Ensure the key starts with `vapi_`.
+  

-### Model not available
+
+    The Venice model catalog updates dynamically. Run `openclaw models list` to see currently available models. Some models may be temporarily offline.
+  

-The Venice model catalog updates dynamically. Run `openclaw models list` to see currently available models. Some models may be temporarily offline.
+
+    Venice API is at `https://api.venice.ai/api/v1`. Ensure your network allows HTTPS connections.
+  
+

-### Connection issues
+
+More help: [Troubleshooting](/help/troubleshooting) and [FAQ](/help/faq).
+

-Venice API is at `https://api.venice.ai/api/v1`. Ensure your network allows HTTPS connections.
+## Advanced configuration

-## Config file example
-
-```json5
-{
-  env: { VENICE_API_KEY: "vapi_..." },
-  agents: { defaults: { model: { primary: "venice/kimi-k2-5" } } },
-  models: {
-    mode: "merge",
-    providers: {
-      venice: {
-        baseUrl: "https://api.venice.ai/api/v1",
-        apiKey: "${VENICE_API_KEY}",
-        api: "openai-completions",
-        models: [
-          {
-            id: "kimi-k2-5",
-            name: "Kimi K2.5",
-            reasoning: true,
-            input: ["text", "image"],
-            cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
-            contextWindow: 256000,
-            maxTokens: 65536,
-          },
-        ],
-      },
-    },
-  },
-}
-```
+
+  
+    ```json5
+    {
+      env: { VENICE_API_KEY: "vapi_..." },
+      agents: { defaults: { model: { primary: "venice/kimi-k2-5" } } },
+      models: {
+        mode: "merge",
+        providers: {
+          venice: {
+            baseUrl: "https://api.venice.ai/api/v1",
+            apiKey: "${VENICE_API_KEY}",
+            api: "openai-completions",
+            models: [
+              {
+                id: "kimi-k2-5",
+                name: "Kimi K2.5",
+                reasoning: true,
+                input: ["text", "image"],
+                cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
+                contextWindow: 256000,
+                maxTokens: 65536,
+              },
+            ],
+          },
+        },
+      },
+    }
+    ```
+  
+

-## Links
+## Related

-- [Venice AI](https://venice.ai)
-- [API Documentation](https://docs.venice.ai)
-- [Pricing](https://venice.ai/pricing)
-- [Status](https://status.venice.ai)
+
+  
+    Choosing providers, model refs, and failover behavior.
+  
+  
+    Venice AI homepage and account signup.
+  
+  
+    Venice API reference and developer docs.
+  
+  
+    Current Venice credit rates and plans.
+  
+