mirror of
https://github.com/openclaw/openclaw.git
synced 2026-05-06 15:00:41 +00:00
docs(ollama): expand setup recipes
@@ -335,6 +335,222 @@ OpenClaw rejects image-description requests for models that are not marked image
</Tab>
</Tabs>

## Common recipes

Use these as starting points and replace model IDs with the exact names from `ollama list` or `openclaw models list --provider ollama`.

<AccordionGroup>
<Accordion title="Local model with auto-discovery">
Use this when Ollama runs on the same machine as the Gateway and you want OpenClaw to discover the installed models automatically.

```bash
ollama serve
ollama pull gemma4
export OLLAMA_API_KEY="ollama-local"
openclaw models list --provider ollama
openclaw models set ollama/gemma4
```

This path keeps config minimal. Do not add a `models.providers.ollama` block unless you want to define models manually.

</Accordion>
<Accordion title="LAN Ollama host with manual models">
Use native Ollama URLs for LAN hosts. Do not add `/v1`.

```json5
{
  models: {
    providers: {
      ollama: {
        baseUrl: "http://gpu-box.local:11434",
        apiKey: "ollama-local",
        api: "ollama",
        timeoutSeconds: 300,
        contextWindow: 32768,
        maxTokens: 8192,
        models: [
          {
            id: "qwen3.5:9b",
            name: "qwen3.5:9b",
            reasoning: true,
            input: ["text"],
            params: {
              num_ctx: 32768,
              keep_alive: "15m",
            },
          },
        ],
      },
    },
  },
  agents: {
    defaults: {
      model: { primary: "ollama/qwen3.5:9b" },
    },
  },
}
```

`contextWindow` is the OpenClaw-side context budget. `params.num_ctx` is sent to Ollama for the request. Keep them aligned when your hardware cannot run the model's full advertised context.

</Accordion>
<Accordion title="Ollama Cloud only">
Use this when you do not run a local daemon and want hosted Ollama models directly.

```bash
export OLLAMA_API_KEY="your-ollama-api-key"
```

```json5
{
  models: {
    providers: {
      ollama: {
        baseUrl: "https://ollama.com",
        apiKey: "OLLAMA_API_KEY",
        api: "ollama",
        models: [
          {
            id: "kimi-k2.5:cloud",
            name: "kimi-k2.5:cloud",
            reasoning: false,
            input: ["text", "image"],
            contextWindow: 128000,
            maxTokens: 8192,
          },
        ],
      },
    },
  },
  agents: {
    defaults: {
      model: { primary: "ollama/kimi-k2.5:cloud" },
    },
  },
}
```

</Accordion>
<Accordion title="Cloud plus local through a signed-in daemon">
Use this when a local or LAN Ollama daemon is signed in with `ollama signin` and should serve both local models and `:cloud` models.

```bash
ollama signin
ollama pull gemma4
```

```json5
{
  models: {
    providers: {
      ollama: {
        baseUrl: "http://127.0.0.1:11434",
        apiKey: "ollama-local",
        api: "ollama",
        timeoutSeconds: 300,
        models: [
          { id: "gemma4", name: "gemma4", input: ["text"] },
          { id: "kimi-k2.5:cloud", name: "kimi-k2.5:cloud", input: ["text", "image"] },
        ],
      },
    },
  },
  agents: {
    defaults: {
      model: {
        primary: "ollama/gemma4",
        fallbacks: ["ollama/kimi-k2.5:cloud"],
      },
    },
  },
}
```

</Accordion>
<Accordion title="Multiple Ollama hosts">
Use custom provider IDs when you have more than one Ollama server. Each provider gets its own host, models, auth, timeout, and model refs.

```json5
{
  models: {
    providers: {
      "ollama-fast": {
        baseUrl: "http://mini.local:11434",
        apiKey: "ollama-local",
        api: "ollama",
        contextWindow: 32768,
        models: [{ id: "gemma4", name: "gemma4", input: ["text"] }],
      },
      "ollama-large": {
        baseUrl: "http://gpu-box.local:11434",
        apiKey: "ollama-local",
        api: "ollama",
        timeoutSeconds: 420,
        contextWindow: 131072,
        maxTokens: 16384,
        models: [{ id: "qwen3.5:27b", name: "qwen3.5:27b", input: ["text"] }],
      },
    },
  },
  agents: {
    defaults: {
      model: {
        primary: "ollama-fast/gemma4",
        fallbacks: ["ollama-large/qwen3.5:27b"],
      },
    },
  },
}
```

When OpenClaw sends the request, the active provider prefix is stripped so `ollama-large/qwen3.5:27b` reaches Ollama as `qwen3.5:27b`.
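As an illustration of what reaches the daemon after the prefix is stripped, the native chat request body carries only the bare model id (a sketch of Ollama's `/api/chat` payload shape; the message content is a placeholder):

```json
{
  "model": "qwen3.5:27b",
  "messages": [{ "role": "user", "content": "hello" }]
}
```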

</Accordion>
<Accordion title="Lean local model profile">
Some local models can answer simple prompts but struggle with the full agent tool surface. Start by limiting tools and context before changing global runtime settings.

```json5
{
  agents: {
    defaults: {
      experimental: {
        localModelLean: true,
      },
      model: { primary: "ollama/gemma4" },
    },
  },
  models: {
    providers: {
      ollama: {
        baseUrl: "http://127.0.0.1:11434",
        apiKey: "ollama-local",
        api: "ollama",
        contextWindow: 32768,
        models: [
          {
            id: "gemma4",
            name: "gemma4",
            input: ["text"],
            params: { num_ctx: 32768 },
            compat: { supportsTools: false },
          },
        ],
      },
    },
  },
}
```

Use `compat.supportsTools: false` only when the model or server reliably fails on tool schemas. It trades agent capability for stability.

</Accordion>
</AccordionGroup>

### Model selection

Once configured, all your Ollama models are available:
@@ -383,6 +599,24 @@ headers, body streaming, and the total guarded-fetch abort. `params.keep_alive`
is forwarded to Ollama as top-level `keep_alive` on native `/api/chat` requests; set it per model when first-turn load time is the bottleneck.
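For example, `keep_alive` can sit in a model's `params` alongside other per-model settings (a sketch reusing the provider shape from the recipes above; `30m` is an arbitrary Ollama duration string):

```json5
{
  models: {
    providers: {
      ollama: {
        models: [
          {
            id: "qwen3.5:9b",
            name: "qwen3.5:9b",
            // Forwarded as top-level keep_alive on native /api/chat requests.
            params: { keep_alive: "30m" },
          },
        ],
      },
    },
  },
}
```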

### Quick verification

```bash
# Ollama daemon visible to this machine
curl http://127.0.0.1:11434/api/tags

# OpenClaw catalog and selected model
openclaw models list --provider ollama
openclaw models status

# Direct model smoke test
openclaw infer model run \
  --model ollama/gemma4 \
  --prompt "Reply with exactly: ok"
```

For remote hosts, replace `127.0.0.1` with the host used in `baseUrl`. If `curl` works but OpenClaw does not, check whether the Gateway runs on a different machine, container, or service account.

## Ollama Web Search

OpenClaw supports **Ollama Web Search** as a bundled `web_search` provider.
@@ -407,6 +641,30 @@ Choose **Ollama Web Search** during `openclaw onboard` or `openclaw configure --
}
```

For direct hosted search through Ollama Cloud:

```json5
{
  models: {
    providers: {
      ollama: {
        baseUrl: "https://ollama.com",
        apiKey: "OLLAMA_API_KEY",
        api: "ollama",
        models: [{ id: "kimi-k2.5:cloud", name: "kimi-k2.5:cloud", input: ["text"] }],
      },
    },
  },
  tools: {
    web: {
      search: { provider: "ollama" },
    },
  },
}
```

For a signed-in local daemon, OpenClaw uses the daemon's `/api/experimental/web_search` proxy. For `https://ollama.com`, it calls the hosted `/api/web_search` endpoint directly.

<Note>
For the full setup and behavior details, see [Ollama Web Search](/tools/ollama-search).
</Note>
@@ -495,6 +753,34 @@ For the full setup and behavior details, see [Ollama Web Search](/tools/ollama-s

</Accordion>

<Accordion title="Thinking control">
For native Ollama models, OpenClaw forwards thinking control as Ollama expects it: top-level `think`, not `options.think`.

```bash
openclaw agent --model ollama/gemma4 --thinking off
openclaw agent --model ollama/gemma4 --thinking low
```

You can also set a model default:

```json5
{
  agents: {
    defaults: {
      models: {
        "ollama/gemma4": {
          thinking: "low",
        },
      },
    },
  },
}
```

Per-model `params.think` or `params.thinking` can disable or force Ollama API thinking for a specific configured model. Runtime commands such as `/think off` still apply to the active run.
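A sketch of such a per-model override, following the same model-entry shape as the recipes above (assuming `params.think: false` maps to Ollama's top-level `think` field):

```json5
{
  models: {
    providers: {
      ollama: {
        models: [
          {
            id: "gemma4",
            name: "gemma4",
            // Disable Ollama API thinking for this configured model only.
            params: { think: false },
          },
        ],
      },
    },
  },
}
```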

</Accordion>

<Accordion title="Reasoning models">
OpenClaw treats models with names such as `deepseek-r1`, `reasoning`, or `think` as reasoning-capable by default.
@@ -533,6 +819,25 @@ For the full setup and behavior details, see [Ollama Web Search](/tools/ollama-s
}
```

For a remote embedding host, keep auth scoped to that host:

```json5
{
  agents: {
    defaults: {
      memorySearch: {
        provider: "ollama",
        remote: {
          baseUrl: "http://gpu-box.local:11434",
          model: "nomic-embed-text",
          apiKey: "ollama-local",
        },
      },
    },
  },
}
```

</Accordion>

<Accordion title="Streaming configuration">
@@ -590,6 +895,45 @@ For the full setup and behavior details, see [Ollama Web Search](/tools/ollama-s

</Accordion>

<Accordion title="Remote host works with curl but not OpenClaw">
Verify from the same machine and runtime that runs the Gateway:

```bash
openclaw gateway status --deep
curl http://ollama-host:11434/api/tags
```

Common causes:

- `baseUrl` points at `localhost`, but the Gateway runs in Docker or on another host.
- The URL uses `/v1`, which selects OpenAI-compatible behavior instead of native Ollama.
- The remote host needs firewall or LAN binding changes on the Ollama side.
- The model is present on your laptop's daemon but not on the remote daemon.
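For the Docker case specifically, one common fix is pointing `baseUrl` at the host gateway instead of the container's own loopback (a sketch; `host.docker.internal` resolves on Docker Desktop, while Linux typically needs an equivalent `--add-host` mapping):

```json5
{
  models: {
    providers: {
      ollama: {
        // Inside a container, 127.0.0.1 is the container itself, not the host.
        baseUrl: "http://host.docker.internal:11434",
        apiKey: "ollama-local",
        api: "ollama",
      },
    },
  },
}
```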

</Accordion>

<Accordion title="Model outputs tool JSON as text">
This usually means the provider is using OpenAI-compatible mode or the model cannot handle tool schemas.

Prefer native Ollama mode:

```json5
{
  models: {
    providers: {
      ollama: {
        baseUrl: "http://ollama-host:11434",
        api: "ollama",
      },
    },
  },
}
```

If a small local model still fails on tool schemas, set `compat.supportsTools: false` on that model entry and retest.

</Accordion>

<Accordion title="Cold local model times out">
Large local models can need a long first load before streaming begins. Keep the timeout scoped to the Ollama provider, and optionally ask Ollama to keep the model loaded between turns:
@@ -615,6 +959,33 @@ For the full setup and behavior details, see [Ollama Web Search](/tools/ollama-s
If the host itself is slow to accept connections, `timeoutSeconds` also extends the guarded Undici connect timeout for this provider.

</Accordion>

<Accordion title="Large-context model is too slow or runs out of memory">
Many Ollama models advertise contexts that are larger than your hardware can run comfortably. Cap both OpenClaw's budget and Ollama's request context:

```json5
{
  models: {
    providers: {
      ollama: {
        contextWindow: 32768,
        maxTokens: 8192,
        models: [
          {
            id: "qwen3.5:9b",
            name: "qwen3.5:9b",
            params: { num_ctx: 32768 },
          },
        ],
      },
    },
  },
}
```

Lower `contextWindow` first if the prompt ingestion phase is slow. Lower `maxTokens` if generation runs too long.

</Accordion>
</AccordionGroup>

<Note>