mirror of https://github.com/openclaw/openclaw.git (synced 2026-03-13 11:00:50 +00:00)
283 lines
7.3 KiB
Markdown

---
summary: "Run OpenClaw with Ollama (local LLM runtime)"
read_when:
  - You want to run OpenClaw with local models via Ollama
  - You need Ollama setup and configuration guidance
title: "Ollama"
---

# Ollama

Ollama is a local LLM runtime that makes it easy to run open-source models on your machine. OpenClaw integrates with Ollama's native API (`/api/chat`), supporting streaming and tool calling, and can **auto-discover tool-capable models** when you opt in with `OLLAMA_API_KEY` (or an auth profile) and do not define an explicit `models.providers.ollama` entry.

<Warning>
**Remote Ollama users**: Do not use the `/v1` OpenAI-compatible URL (`http://host:11434/v1`) with OpenClaw. This breaks tool calling, and models may output raw tool-call JSON as plain text. Use the native Ollama API URL instead: `baseUrl: "http://host:11434"` (no `/v1`).
</Warning>
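
If you are unsure whether a configured URL points at the native endpoint, a check like the following can help. This is an illustrative sketch; `normalize_ollama_base_url` is a hypothetical helper, not part of OpenClaw:

```python
from urllib.parse import urlparse

def normalize_ollama_base_url(base_url: str) -> str:
    """Strip a trailing /v1 (the OpenAI-compat path) so requests hit the
    native Ollama API, where tool calling works reliably."""
    trimmed = base_url.rstrip("/")
    if trimmed.endswith("/v1"):
        trimmed = trimmed[: -len("/v1")]
    # Keep scheme://host:port only; the native client appends /api/chat itself.
    parsed = urlparse(trimmed)
    return f"{parsed.scheme}://{parsed.netloc}"

print(normalize_ollama_base_url("http://host:11434/v1"))  # http://host:11434
```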

## Quick start

1. Install Ollama: [https://ollama.ai](https://ollama.ai)

2. Pull a model:

```bash
ollama pull gpt-oss:20b
# or
ollama pull llama3.3
# or
ollama pull qwen2.5-coder:32b
# or
ollama pull deepseek-r1:32b
```

3. Enable Ollama for OpenClaw (any value works; Ollama doesn't require a real key):

```bash
# Set environment variable
export OLLAMA_API_KEY="ollama-local"

# Or configure in your config file
openclaw config set models.providers.ollama.apiKey "ollama-local"
```

4. Use Ollama models:

```json5
{
  agents: {
    defaults: {
      model: { primary: "ollama/gpt-oss:20b" },
    },
  },
}
```

## Model discovery (implicit provider)

When you set `OLLAMA_API_KEY` (or an auth profile) and **do not** define `models.providers.ollama`, OpenClaw discovers models from the local Ollama instance at `http://127.0.0.1:11434`:

- Queries `/api/tags` and `/api/show`
- Keeps only models that report the `tools` capability
- Marks `reasoning` when the model reports `thinking`
- Reads `contextWindow` from `model_info["<arch>.context_length"]` when available
- Sets `maxTokens` to 10× the context window
- Sets all costs to `0`

This avoids manual model entries while keeping the catalog aligned with Ollama's capabilities.

To see which models are available:

```bash
ollama list
openclaw models list
```

To add a new model, simply pull it with Ollama:

```bash
ollama pull mistral
```

The new model is automatically discovered and ready to use.

If you set `models.providers.ollama` explicitly, auto-discovery is skipped and you must define models manually (see below).

## Configuration

### Basic setup (implicit discovery)

The simplest way to enable Ollama is via an environment variable:

```bash
export OLLAMA_API_KEY="ollama-local"
```

### Explicit setup (manual models)

Use explicit config when:

- Ollama runs on another host/port.
- You want to force specific context windows or model lists.
- You want to include models that do not report tool support.

```json5
{
  models: {
    providers: {
      ollama: {
        baseUrl: "http://ollama-host:11434",
        apiKey: "ollama-local",
        api: "ollama",
        models: [
          {
            id: "gpt-oss:20b",
            name: "GPT-OSS 20B",
            reasoning: false,
            input: ["text"],
            cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
            contextWindow: 8192,
            maxTokens: 81920, // 10× contextWindow (JSON5 does not allow expressions like 8192 * 10)
          },
        ],
      },
    },
  },
}
```

If `OLLAMA_API_KEY` is set, you can omit `apiKey` in the provider entry and OpenClaw will fill it in for availability checks.

### Custom base URL (explicit config)

If Ollama is running on a different host or port (explicit config disables auto-discovery, so define models manually):

```json5
{
  models: {
    providers: {
      ollama: {
        apiKey: "ollama-local",
        baseUrl: "http://ollama-host:11434", // No /v1; use the native Ollama API URL
        api: "ollama", // Set explicitly to guarantee native tool-calling behavior
      },
    },
  },
}
```

<Warning>
Do not add `/v1` to the URL. The `/v1` path uses OpenAI-compatible mode, where tool calling is not reliable. Use the base Ollama URL without a path suffix.
</Warning>

### Model selection

Once configured, all your Ollama models are available:

```json5
{
  agents: {
    defaults: {
      model: {
        primary: "ollama/gpt-oss:20b",
        fallbacks: ["ollama/llama3.3", "ollama/qwen2.5-coder:32b"],
      },
    },
  },
}
```

## Advanced

### Reasoning models

OpenClaw marks models as reasoning-capable when Ollama reports `thinking` in `/api/show`:

```bash
ollama pull deepseek-r1:32b
```

### Model costs

Ollama is free and runs locally, so all model costs are set to $0.

### Streaming configuration

OpenClaw's Ollama integration uses the **native Ollama API** (`/api/chat`) by default, which fully supports streaming and tool calling simultaneously. No special configuration is needed.

#### Legacy OpenAI-compatible mode

<Warning>
**Tool calling is not reliable in OpenAI-compatible mode.** Use this mode only if you need the OpenAI format for a proxy and do not depend on native tool-calling behavior.
</Warning>

If you need to use the OpenAI-compatible endpoint instead (e.g., behind a proxy that only supports the OpenAI format), set `api: "openai-completions"` explicitly:

```json5
{
  models: {
    providers: {
      ollama: {
        baseUrl: "http://ollama-host:11434/v1",
        api: "openai-completions",
        injectNumCtxForOpenAICompat: true, // default: true
        apiKey: "ollama-local",
        models: [...],
      },
    },
  },
}
```

This mode may not support streaming and tool calling simultaneously; you may need to disable streaming with `params: { streaming: false }` in the model config.

When `api: "openai-completions"` is used with Ollama, OpenClaw injects `options.num_ctx` by default so Ollama does not silently fall back to a 4096-token context window. If your proxy or upstream rejects unknown `options` fields, disable this behavior:

```json5
{
  models: {
    providers: {
      ollama: {
        baseUrl: "http://ollama-host:11434/v1",
        api: "openai-completions",
        injectNumCtxForOpenAICompat: false,
        apiKey: "ollama-local",
        models: [...],
      },
    },
  },
}
```
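
Conceptually, the injection just adds an Ollama-specific `options` object to each OpenAI-format request body. The sketch below illustrates the idea under the assumption that the upstream accepts the extra field; `inject_num_ctx` is a hypothetical name, not OpenClaw's API:

```python
def inject_num_ctx(body: dict, context_window: int, enabled: bool = True) -> dict:
    """Return a copy of an OpenAI-format request body with options.num_ctx set,
    so Ollama does not fall back to its default 4096-token context.
    Returns the body unchanged when injection is disabled."""
    if not enabled:
        return body
    options = dict(body.get("options", {}))
    options.setdefault("num_ctx", context_window)  # don't clobber an explicit value
    return {**body, "options": options}
```

A request like `{"model": "gpt-oss:20b", "messages": [...]}` thus gains `"options": {"num_ctx": 8192}` before being sent.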

### Context windows

For auto-discovered models, OpenClaw uses the context window reported by Ollama when available; otherwise it defaults to `8192`. You can override `contextWindow` and `maxTokens` in explicit provider config.

## Troubleshooting

### Ollama not detected

Make sure Ollama is running, that you set `OLLAMA_API_KEY` (or an auth profile), and that you did **not** define an explicit `models.providers.ollama` entry:

```bash
ollama serve
```

Check that the API is accessible:

```bash
curl http://localhost:11434/api/tags
```

### No models available

OpenClaw only auto-discovers models that report tool support. If your model isn't listed, either:

- Pull a tool-capable model, or
- Define the model explicitly in `models.providers.ollama`.

To add models:

```bash
ollama list               # See what's installed
ollama pull gpt-oss:20b   # Pull a tool-capable model
ollama pull llama3.3      # Or another model
```

### Connection refused

Check that Ollama is running on the correct port:

```bash
# Check if Ollama is running
ps aux | grep ollama

# Or restart Ollama
ollama serve
```

## See also

- [Model Providers](/concepts/model-providers) - Overview of all providers
- [Model Selection](/concepts/models) - How to choose models
- [Configuration](/gateway/configuration) - Full config reference