docs(ollama): align onboarding guidance with code

This commit is contained in:
Peter Steinberger
2026-03-11 20:08:45 +00:00
parent e65011dc29
commit 5e324cf785
4 changed files with 67 additions and 26 deletions

View File

@@ -357,7 +357,7 @@ Ollama is a local LLM runtime that provides an OpenAI-compatible API:
- Provider: `ollama`
- Auth: None required (local server)
- Example model: `ollama/llama3.3`
- Installation: [https://ollama.ai](https://ollama.ai)
- Installation: [https://ollama.com/download](https://ollama.com/download)
```bash
# Install Ollama, then pull a model:
@@ -372,7 +372,7 @@ ollama pull llama3.3
}
```
Ollama is automatically detected when running locally at `http://127.0.0.1:11434/v1`. See [/providers/ollama](/providers/ollama) for model recommendations and custom configuration.
Ollama is detected locally at `http://127.0.0.1:11434` when you opt in with `OLLAMA_API_KEY`, and `openclaw onboard` can configure it directly as a first-class provider. See [/providers/ollama](/providers/ollama) for onboarding, cloud/local mode, and custom configuration.
### vLLM

View File

@@ -11,6 +11,8 @@ title: "Local Models"
Local is doable, but OpenClaw expects large context + strong defenses against prompt injection. Small cards truncate context and leak safety. Aim high: **≥2 maxed-out Mac Studios or equivalent GPU rig (~$30k+)**. A single **24 GB** GPU works only for lighter prompts with higher latency. Use the **largest / full-size model variant you can run**; aggressively quantized or “small” checkpoints raise prompt-injection risk (see [Security](/gateway/security)).
If you want the lowest-friction local setup, start with [Ollama](/providers/ollama) and `openclaw onboard`. This page is the opinionated guide for higher-end local stacks and custom OpenAI-compatible local servers.
## Recommended: LM Studio + MiniMax M2.5 (Responses API, full-size)
Best current local stack. Load MiniMax M2.5 in LM Studio, enable the local server (default `http://127.0.0.1:1234`), and use Responses API to keep reasoning separate from final text.

View File

@@ -2084,8 +2084,21 @@ More context: [Models](/concepts/models).
### Can I use selfhosted models llamacpp vLLM Ollama
Yes. If your local server exposes an OpenAI-compatible API, you can point a
custom provider at it. Ollama is supported directly and is the easiest path.
Yes. Ollama is the easiest path for local models.
Quickest setup:
1. Install Ollama from `https://ollama.com/download`
2. Pull a local model such as `ollama pull glm-4.7-flash`
3. If you want Ollama Cloud too, run `ollama signin`
4. Run `openclaw onboard` and choose `Ollama`
5. Pick `Local` or `Cloud + Local`
Notes:
- `Cloud + Local` gives you Ollama Cloud models plus your local Ollama models
- cloud models such as `kimi-k2.5:cloud` do not need a local pull
- for manual switching, use `openclaw models list` and `openclaw models set ollama/<model>`
Security note: smaller or heavily quantized models are more vulnerable to prompt
injection. We strongly recommend **large models** for any bot that can use tools.

View File

@@ -8,7 +8,7 @@ title: "Ollama"
# Ollama
Ollama is a local LLM runtime that makes it easy to run open-source models on your machine. OpenClaw integrates with Ollama's native API (`/api/chat`), supporting streaming and tool calling, and can **auto-discover tool-capable models** when you opt in with `OLLAMA_API_KEY` (or an auth profile) and do not define an explicit `models.providers.ollama` entry.
Ollama is a local LLM runtime that makes it easy to run open-source models on your machine. OpenClaw integrates with Ollama's native API (`/api/chat`), supports streaming and tool calling, and can auto-discover local Ollama models when you opt in with `OLLAMA_API_KEY` (or an auth profile) and do not define an explicit `models.providers.ollama` entry.
<Warning>
**Remote Ollama users**: Do not use the `/v1` OpenAI-compatible URL (`http://host:11434/v1`) with OpenClaw. This breaks tool calling and models may output raw tool JSON as plain text. Use the native Ollama API URL instead: `baseUrl: "http://host:11434"` (no `/v1`).
@@ -16,21 +16,40 @@ Ollama is a local LLM runtime that makes it easy to run open-source models on yo
## Quick start
1. Install Ollama: [https://ollama.ai](https://ollama.ai)
1. Install Ollama: [https://ollama.com/download](https://ollama.com/download)
2. Pull a model:
2. Pull a local model if you want local inference:
```bash
ollama pull glm-4.7-flash
# or
ollama pull gpt-oss:20b
# or
ollama pull llama3.3
# or
ollama pull qwen2.5-coder:32b
# or
ollama pull deepseek-r1:32b
```
3. Enable Ollama for OpenClaw (any value works; Ollama doesn't require a real key):
3. If you want Ollama Cloud models too, sign in:
```bash
ollama signin
```
4. Run onboarding and choose `Ollama`:
```bash
openclaw onboard
```
- `Local`: local models only
- `Cloud + Local`: local models plus Ollama Cloud models
- Cloud models such as `kimi-k2.5:cloud`, `minimax-m2.5:cloud`, and `glm-5:cloud` do **not** require a local `ollama pull`
OpenClaw currently suggests:
- local default: `glm-4.7-flash`
- cloud defaults: `kimi-k2.5:cloud`, `minimax-m2.5:cloud`, `glm-5:cloud`
5. If you prefer manual setup, enable Ollama for OpenClaw directly (any value works; Ollama doesn't require a real key):
```bash
# Set environment variable
@@ -40,13 +59,20 @@ export OLLAMA_API_KEY="ollama-local"
openclaw config set models.providers.ollama.apiKey "ollama-local"
```
4. Use Ollama models:
6. Inspect or switch models:
```bash
openclaw models list
openclaw models set ollama/glm-4.7-flash
```
7. Or set the default in config:
```json5
{
agents: {
defaults: {
model: { primary: "ollama/gpt-oss:20b" },
model: { primary: "ollama/glm-4.7-flash" },
},
},
}
@@ -56,14 +82,13 @@ openclaw config set models.providers.ollama.apiKey "ollama-local"
When you set `OLLAMA_API_KEY` (or an auth profile) and **do not** define `models.providers.ollama`, OpenClaw discovers models from the local Ollama instance at `http://127.0.0.1:11434`:
- Queries `/api/tags` and `/api/show`
- Keeps only models that report `tools` capability
- Marks `reasoning` when the model reports `thinking`
- Reads `contextWindow` from `model_info["<arch>.context_length"]` when available
- Sets `maxTokens` to 10× the context window
- Queries `/api/tags`
- Uses best-effort `/api/show` lookups to read `contextWindow` when available
- Marks `reasoning` with a model-name heuristic (`r1`, `reasoning`, `think`)
- Sets `maxTokens` to the default Ollama max-token cap used by OpenClaw
- Sets all costs to `0`
This avoids manual model entries while keeping the catalog aligned with Ollama's capabilities.
This avoids manual model entries while keeping the catalog aligned with the local Ollama instance.
To see what models are available:
@@ -98,7 +123,7 @@ Use explicit config when:
- Ollama runs on another host/port.
- You want to force specific context windows or model lists.
- You want to include models that do not report tool support.
- You want fully manual model definitions.
```json5
{
@@ -170,7 +195,7 @@ Once configured, all your Ollama models are available:
### Reasoning models
OpenClaw marks models as reasoning-capable when Ollama reports `thinking` in `/api/show`:
OpenClaw treats models with names such as `deepseek-r1`, `reasoning`, or `think` as reasoning-capable by default:
```bash
ollama pull deepseek-r1:32b
@@ -230,7 +255,7 @@ When `api: "openai-completions"` is used with Ollama, OpenClaw injects `options.
### Context windows
For auto-discovered models, OpenClaw uses the context window reported by Ollama when available, otherwise it defaults to `8192`. You can override `contextWindow` and `maxTokens` in explicit provider config.
For auto-discovered models, OpenClaw uses the context window reported by Ollama when available, otherwise it falls back to the default Ollama context window used by OpenClaw. You can override `contextWindow` and `maxTokens` in explicit provider config.
## Troubleshooting
@@ -250,16 +275,17 @@ curl http://localhost:11434/api/tags
### No models available
OpenClaw only auto-discovers models that report tool support. If your model isn't listed, either:
If your model is not listed, either:
- Pull a tool-capable model, or
- Pull the model locally, or
- Define the model explicitly in `models.providers.ollama`.
To add models:
```bash
ollama list # See what's installed
ollama pull gpt-oss:20b # Pull a tool-capable model
ollama pull glm-4.7-flash
ollama pull gpt-oss:20b
ollama pull llama3.3 # Or another model
```