---
title: Ollama
summary: Run OpenClaw with Ollama (cloud and local models)
---
OpenClaw integrates with Ollama's native API (/api/chat) for hosted cloud models and local/self-hosted Ollama servers. You can use Ollama in three modes: Cloud + Local through a reachable Ollama host, Cloud only against https://ollama.com, or Local only against a reachable Ollama host.
Ollama provider config uses baseUrl as the canonical key. OpenClaw also accepts baseURL for compatibility with OpenAI SDK-style examples, but new config should prefer baseUrl.
## Auth rules

Local and LAN Ollama hosts do not need a real bearer token. OpenClaw uses the local `ollama-local` marker only for loopback, private-network, `.local`, and bare-hostname Ollama base URLs. Remote public hosts and Ollama Cloud (`https://ollama.com`) require a real credential through `OLLAMA_API_KEY`, an auth profile, or the provider's `apiKey`. Custom provider ids that set `api: "ollama"` follow the same rules. For example, an `ollama-remote` provider that points at a private LAN Ollama host can use `apiKey: "ollama-local"`, and sub-agents will resolve that marker through the Ollama provider hook instead of treating it as a missing credential.

When Ollama is used for memory embeddings, bearer auth is scoped to the host where it was declared:

- A provider-level key is sent only to that provider's Ollama host.
- `agents.*.memorySearch.remote.apiKey` is sent only to its remote embedding host.
- A pure `OLLAMA_API_KEY` env value is treated as the Ollama Cloud convention, not sent to local or self-hosted hosts by default.
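As a sketch of that scoping, the two declaration sites might look like this (host names and key placeholders are illustrative, and other `remote` embedding settings are omitted):

```json5
{
  models: {
    providers: {
      ollama: {
        baseUrl: "http://ollama-host:11434", // hypothetical LAN host
        apiKey: "PROVIDER_SCOPED_KEY",       // sent only to this provider's Ollama host
      },
    },
  },
  agents: {
    defaults: {
      memorySearch: {
        remote: {
          apiKey: "EMBEDDING_SCOPED_KEY", // sent only to the remote embedding host
        },
      },
    },
  },
}
```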
## Getting started

Choose your preferred setup method and mode.

**Best for:** fastest path to a working Ollama cloud or local setup.

<Steps>
<Step title="Run onboarding">
```bash
openclaw onboard
```
Select **Ollama** from the provider list.
</Step>
<Step title="Choose your mode">
- **Cloud + Local** — local Ollama host plus cloud models routed through that host
- **Cloud only** — hosted Ollama models via `https://ollama.com`
- **Local only** — local models only
</Step>
<Step title="Select a model">
`Cloud only` prompts for `OLLAMA_API_KEY` and suggests hosted cloud defaults. `Cloud + Local` and `Local only` ask for an Ollama base URL, discover available models, and auto-pull the selected local model if it is not available yet. `Cloud + Local` also checks whether that Ollama host is signed in for cloud access.
</Step>
<Step title="Verify the model is available">
```bash
openclaw models list --provider ollama
```
</Step>
</Steps>
### Non-interactive mode
```bash
openclaw onboard --non-interactive \
--auth-choice ollama \
--accept-risk
```
Optionally specify a custom base URL or model:
```bash
openclaw onboard --non-interactive \
--auth-choice ollama \
--custom-base-url "http://ollama-host:11434" \
--custom-model-id "qwen3.5:27b" \
--accept-risk
```
**Best for:** full control over cloud or local setup.
<Steps>
<Step title="Choose cloud or local">
- **Cloud + Local**: install Ollama, sign in with `ollama signin`, and route cloud requests through that host
- **Cloud only**: use `https://ollama.com` with an `OLLAMA_API_KEY`
- **Local only**: install Ollama from [ollama.com/download](https://ollama.com/download)
</Step>
<Step title="Pull a local model (local only)">
```bash
ollama pull gemma4
# or
ollama pull gpt-oss:20b
# or
ollama pull llama3.3
```
</Step>
<Step title="Enable Ollama for OpenClaw">
For `Cloud only`, use your real `OLLAMA_API_KEY`. For host-backed setups, any placeholder value works:
```bash
# Cloud
export OLLAMA_API_KEY="your-ollama-api-key"
# Local-only
export OLLAMA_API_KEY="ollama-local"
# Or configure in your config file
openclaw config set models.providers.ollama.apiKey "OLLAMA_API_KEY"
```
</Step>
<Step title="Inspect and set your model">
```bash
openclaw models list
openclaw models set ollama/gemma4
```
Or set the default in config:
```json5
{
agents: {
defaults: {
model: { primary: "ollama/gemma4" },
},
},
}
```
</Step>
</Steps>
## Cloud models

`Cloud + Local` uses a reachable Ollama host as the control point for both local and cloud models. This is Ollama's preferred hybrid flow.

Use **Cloud + Local** during setup. OpenClaw prompts for the Ollama base URL, discovers local models from that host, and checks whether the host is signed in for cloud access with `ollama signin`. When the host is signed in, OpenClaw also suggests hosted cloud defaults such as `kimi-k2.5:cloud`, `minimax-m2.7:cloud`, and `glm-5.1:cloud`.
If the host is not signed in yet, OpenClaw keeps the setup local-only until you run `ollama signin`.
`Cloud only` runs against Ollama's hosted API at `https://ollama.com`.
Use **Cloud only** during setup. OpenClaw prompts for `OLLAMA_API_KEY`, sets `baseUrl: "https://ollama.com"`, and seeds the hosted cloud model list. This path does **not** require a local Ollama server or `ollama signin`.
The cloud model list shown during `openclaw onboard` is populated live from `https://ollama.com/api/tags`, capped at 500 entries, so the picker reflects the current hosted catalog rather than a static seed. If `ollama.com` is unreachable or returns no models at setup time, OpenClaw falls back to the previous hardcoded suggestions so onboarding still completes.
In local-only mode, OpenClaw discovers models from the configured Ollama instance. This path is for local or self-hosted Ollama servers.
OpenClaw currently suggests `gemma4` as the local default.
## Model discovery (implicit provider)

When you set `OLLAMA_API_KEY` (or an auth profile) and do not define `models.providers.ollama`, OpenClaw discovers models from the local Ollama instance at `http://127.0.0.1:11434`.
| Behavior | Detail |
|---|---|
| Catalog query | Queries `/api/tags` |
| Capability detection | Uses best-effort `/api/show` lookups to read `contextWindow`, expanded `num_ctx` Modelfile parameters, and capabilities including vision/tools |
| Vision models | Models with a vision capability reported by `/api/show` are marked as image-capable (`input: ["text", "image"]`), so OpenClaw auto-injects images into the prompt |
| Reasoning detection | Marks reasoning with a model-name heuristic (`r1`, `reasoning`, `think`) |
| Token limits | Sets `maxTokens` to the default Ollama max-token cap used by OpenClaw |
| Costs | Sets all costs to 0 |
This avoids manual model entries while keeping the catalog aligned with the local Ollama instance.
```bash
# See what models are available
ollama list
openclaw models list
```

To add a new model, simply pull it with Ollama:

```bash
ollama pull mistral
```

The new model is automatically discovered and available to use.
If you set `models.providers.ollama` explicitly, auto-discovery is skipped and you must define models manually. See the explicit config section below.

## Vision and image description
The bundled Ollama plugin registers Ollama as an image-capable media-understanding provider. This lets OpenClaw route explicit image-description requests and configured image-model defaults through local or hosted Ollama vision models.
For local vision, pull a model that supports images:

```bash
ollama pull qwen2.5vl:7b
export OLLAMA_API_KEY="ollama-local"
```

Then verify with the infer CLI:

```bash
openclaw infer image describe \
  --file ./photo.jpg \
  --model ollama/qwen2.5vl:7b \
  --json
```

`--model` must be a full `<provider/model>` ref. When it is set, `openclaw infer image describe` runs that model directly instead of skipping description because the model supports native vision.
To make Ollama the default image-understanding model for inbound media, configure `agents.defaults.imageModel`:

```json5
{
  agents: {
    defaults: {
      imageModel: {
        primary: "ollama/qwen2.5vl:7b",
      },
    },
  },
}
```
If you define models.providers.ollama.models manually, mark vision models with image input support:
{
id: "qwen2.5vl:7b",
name: "qwen2.5vl:7b",
input: ["text", "image"],
contextWindow: 128000,
maxTokens: 8192,
}
OpenClaw rejects image-description requests for models that are not marked image-capable. With implicit discovery, OpenClaw reads this from Ollama when `/api/show` reports a vision capability.
## Configuration

The simplest local-only enablement path is via environment variable:

```bash
export OLLAMA_API_KEY="ollama-local"
```
<Tip>
If `OLLAMA_API_KEY` is set, you can omit `apiKey` in the provider entry and OpenClaw will fill it for availability checks.
</Tip>
Use explicit config when you want hosted cloud setup, Ollama runs on another host/port, you want to force specific context windows or model lists, or you want fully manual model definitions.
```json5
{
models: {
providers: {
ollama: {
baseUrl: "https://ollama.com",
apiKey: "OLLAMA_API_KEY",
api: "ollama",
models: [
{
id: "kimi-k2.5:cloud",
name: "kimi-k2.5:cloud",
reasoning: false,
input: ["text", "image"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 128000,
maxTokens: 8192
}
]
}
}
}
}
```
If Ollama is running on a different host or port (explicit config disables auto-discovery, so define models manually):
```json5
{
models: {
providers: {
ollama: {
apiKey: "ollama-local",
baseUrl: "http://ollama-host:11434", // No /v1 - use native Ollama API URL
api: "ollama", // Set explicitly to guarantee native tool-calling behavior
timeoutSeconds: 300, // Optional: give cold local models longer to connect and stream
models: [
{
id: "qwen3:32b",
name: "qwen3:32b",
params: {
keep_alive: "15m", // Optional: keep the model loaded between turns
},
},
],
},
},
},
}
```
<Warning>
Do not add `/v1` to the URL. The `/v1` path uses OpenAI-compatible mode, where tool calling is not reliable. Use the base Ollama URL without a path suffix.
</Warning>
## Model selection
Once configured, all your Ollama models are available:

```json5
{
  agents: {
    defaults: {
      model: {
        primary: "ollama/gpt-oss:20b",
        fallbacks: ["ollama/llama3.3", "ollama/qwen2.5-coder:32b"],
      },
    },
  },
}
```
Custom Ollama provider ids are also supported. When a model ref uses the active provider prefix, such as `ollama-spark/qwen3:32b`, OpenClaw strips only that prefix before calling Ollama, so the server receives `qwen3:32b`.
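A custom provider id that reuses the Ollama API might look like this sketch (the `ollama-spark` id comes from the example above; the host name is illustrative, and explicit config means models must be defined manually):

```json5
{
  models: {
    providers: {
      "ollama-spark": {
        api: "ollama",                      // custom id follows the same Ollama auth rules
        baseUrl: "http://spark-host:11434", // hypothetical LAN Ollama host
        apiKey: "ollama-local",             // LAN marker resolved via the Ollama provider hook
        models: [
          { id: "qwen3:32b", name: "qwen3:32b" },
        ],
      },
    },
  },
}
```

With this entry, a model ref of `ollama-spark/qwen3:32b` routes to that host, and the server receives `qwen3:32b`.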
For slow local models, prefer provider-scoped request tuning before raising the whole agent runtime timeout:

```json5
{
  models: {
    providers: {
      ollama: {
        timeoutSeconds: 300,
        models: [
          {
            id: "gemma4:26b",
            name: "gemma4:26b",
            params: { keep_alive: "15m" },
          },
        ],
      },
    },
  },
}
```
`timeoutSeconds` applies to the model HTTP request, including connection setup, headers, body streaming, and the total guarded-fetch abort. `params.keep_alive` is forwarded to Ollama as top-level `keep_alive` on native `/api/chat` requests; set it per model when first-turn load time is the bottleneck.
## Ollama Web Search

OpenClaw supports Ollama Web Search as a bundled `web_search` provider.
| Property | Detail |
|---|---|
| Host | Uses your configured Ollama host (`models.providers.ollama.baseUrl` when set, otherwise `http://127.0.0.1:11434`); `https://ollama.com` uses the hosted API directly |
| Auth | Key-free for signed-in local Ollama hosts; `OLLAMA_API_KEY` or configured provider auth for direct `https://ollama.com` search or auth-protected hosts |
| Requirement | Local/self-hosted hosts must be running and signed in with `ollama signin`; direct hosted search requires `baseUrl: "https://ollama.com"` plus a real Ollama API key |
Choose Ollama Web Search during `openclaw onboard` or `openclaw configure --section web`, or set:

```json5
{
  tools: {
    web: {
      search: {
        provider: "ollama",
      },
    },
  },
}
```
## Advanced configuration

If you need to use the OpenAI-compatible endpoint instead (for example, behind a proxy that only supports OpenAI format), set `api: "openai-completions"` explicitly. **Tool calling is not reliable in OpenAI-compatible mode.** Use this mode only if you need OpenAI format for a proxy and do not depend on native tool-calling behavior.
```json5
{
models: {
providers: {
ollama: {
baseUrl: "http://ollama-host:11434/v1",
api: "openai-completions",
injectNumCtxForOpenAICompat: true, // default: true
apiKey: "ollama-local",
models: [...]
}
}
}
}
```
This mode may not support streaming and tool calling simultaneously. You may need to disable streaming with `params: { streaming: false }` in model config.
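A per-model sketch of that streaming toggle (the model id is illustrative):

```json5
{
  models: {
    providers: {
      ollama: {
        baseUrl: "http://ollama-host:11434/v1",
        api: "openai-completions",
        apiKey: "ollama-local",
        models: [
          {
            id: "llama3.3",
            name: "llama3.3",
            // OpenClaw runtime param; not forwarded to Ollama as a request option
            params: { streaming: false },
          },
        ],
      },
    },
  },
}
```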
When `api: "openai-completions"` is used with Ollama, OpenClaw injects `options.num_ctx` by default so Ollama does not silently fall back to a 4096 context window. If your proxy/upstream rejects unknown `options` fields, disable this behavior:
```json5
{
models: {
providers: {
ollama: {
baseUrl: "http://ollama-host:11434/v1",
api: "openai-completions",
injectNumCtxForOpenAICompat: false,
apiKey: "ollama-local",
models: [...]
}
}
}
}
```
For auto-discovered models, OpenClaw uses the context window reported by Ollama when available, including larger `PARAMETER num_ctx` values from custom Modelfiles. Otherwise it falls back to the default Ollama context window used by OpenClaw.
You can override `contextWindow` and `maxTokens` in explicit provider config. To cap Ollama's per-request runtime context without rebuilding a Modelfile, set `params.num_ctx`; OpenClaw sends it as `options.num_ctx` for both native Ollama and the OpenAI-compatible Ollama adapter. Invalid, zero, negative, and non-finite values are ignored and fall back to `contextWindow`.
Native Ollama model entries also accept the common Ollama runtime options under `params`, including `temperature`, `top_p`, `top_k`, `min_p`, `num_predict`, `stop`, `repeat_penalty`, `num_batch`, `num_thread`, and `use_mmap`. OpenClaw forwards only Ollama request keys, so OpenClaw runtime params such as `streaming` are not leaked to Ollama. Use `params.think` or `params.thinking` to send top-level Ollama `think`; `false` disables API-level thinking for Qwen-style thinking models.
```json5
{
models: {
providers: {
ollama: {
models: [
{
id: "llama3.3",
contextWindow: 131072,
maxTokens: 65536,
params: {
num_ctx: 32768,
temperature: 0.7,
top_p: 0.9,
thinking: false,
},
}
]
}
}
}
}
```
Per-model `agents.defaults.models["ollama/<model>"].params.num_ctx` works too. If both are configured, the explicit provider model entry wins over the agent default.
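The agent-level override from the paragraph above might look like this (a sketch; the model id and `num_ctx` value are illustrative):

```json5
{
  agents: {
    defaults: {
      models: {
        "ollama/llama3.3": {
          // Applies unless an explicit provider model entry sets its own num_ctx
          params: { num_ctx: 16384 },
        },
      },
    },
  },
}
```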
OpenClaw treats models with names such as `deepseek-r1`, `reasoning`, or `think` as reasoning-capable by default.
```bash
ollama pull deepseek-r1:32b
```
No additional configuration is needed. OpenClaw marks them automatically.
Ollama is free and runs locally, so all model costs are set to $0. This applies to both auto-discovered and manually defined models.
The bundled Ollama plugin registers a memory embedding provider for
[memory search](/concepts/memory). It uses the configured Ollama base URL
and API key, calls Ollama's current `/api/embed` endpoint, and batches
multiple memory chunks into one `input` request when possible.
| Property | Value |
| ------------- | ------------------- |
| Default model | `nomic-embed-text` |
| Auto-pull | Yes — the embedding model is pulled automatically if not present locally |
To select Ollama as the memory search embedding provider:
```json5
{
agents: {
defaults: {
memorySearch: { provider: "ollama" },
},
},
}
```
OpenClaw's Ollama integration uses the **native Ollama API** (`/api/chat`) by default, which fully supports streaming and tool calling simultaneously. No special configuration is needed.
For native `/api/chat` requests, OpenClaw also forwards thinking control directly to Ollama: `/think off` and `openclaw agent --thinking off` send top-level `think: false`, while `/think low|medium|high` send the matching top-level `think` effort string. `/think max` maps to Ollama's highest native effort, `think: "high"`.
<Tip>
If you need to use the OpenAI-compatible endpoint, see the OpenAI-compatible mode notes under "Advanced configuration" above. Streaming and tool calling may not work simultaneously in that mode.
</Tip>
## Troubleshooting

Make sure Ollama is running, that you set `OLLAMA_API_KEY` (or an auth profile), and that you did **not** define an explicit `models.providers.ollama` entry:

```bash
ollama serve
```
Verify that the API is accessible:
```bash
curl http://localhost:11434/api/tags
```
If your model is not listed, either pull the model locally or define it explicitly in `models.providers.ollama`.
```bash
ollama list # See what's installed
ollama pull gemma4
ollama pull gpt-oss:20b
ollama pull llama3.3 # Or another model
```
Check that Ollama is running on the correct port:
```bash
# Check if Ollama is running
ps aux | grep ollama
# Or restart Ollama
ollama serve
```
Large local models can need a long first load before streaming begins. Keep the timeout scoped to the Ollama provider, and optionally ask Ollama to keep the model loaded between turns:
```json5
{
models: {
providers: {
ollama: {
timeoutSeconds: 300,
models: [
{
id: "gemma4:26b",
name: "gemma4:26b",
params: { keep_alive: "15m" },
},
],
},
},
},
}
```
If the host itself is slow to accept connections, `timeoutSeconds` also extends the guarded Undici connect timeout for this provider.
More help: [Troubleshooting](/help/troubleshooting) and [FAQ](/help/faq).