mirror of
https://github.com/openclaw/openclaw.git
synced 2026-03-12 07:20:45 +00:00
docs(ollama): align onboarding guidance with code
@@ -357,7 +357,7 @@ Ollama is a local LLM runtime that provides an OpenAI-compatible API:
- Provider: `ollama`
- Auth: None required (local server)
- Example model: `ollama/llama3.3`
- Installation: [https://ollama.com/download](https://ollama.com/download)

```bash
# Install Ollama, then pull a model:
@@ -372,7 +372,7 @@ ollama pull llama3.3

}
```

Ollama is detected locally at `http://127.0.0.1:11434` when you opt in with `OLLAMA_API_KEY`, and `openclaw onboard` can configure it directly as a first-class provider. See [/providers/ollama](/providers/ollama) for onboarding, cloud/local mode, and custom configuration.
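As a quick sanity check, the two detection preconditions above can be verified from a shell. A minimal sketch, assuming the default endpoint:

```shell
# Minimal sketch: check the two preconditions for local Ollama detection.
# Assumes the default endpoint; adjust base_url if Ollama runs elsewhere.
base_url="http://127.0.0.1:11434"

if [ -n "${OLLAMA_API_KEY:-}" ]; then
  echo "opt-in set: OLLAMA_API_KEY is present"
else
  echo "opt-in missing: export OLLAMA_API_KEY to enable detection"
fi

if curl -fsS "$base_url/api/tags" >/dev/null 2>&1; then
  echo "Ollama reachable at $base_url"
else
  echo "Ollama not reachable at $base_url"
fi
```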

### vLLM
@@ -11,6 +11,8 @@ title: "Local Models"

Local is doable, but OpenClaw expects large context + strong defenses against prompt injection. Small cards truncate context and leak safety. Aim high: **≥2 maxed-out Mac Studios or equivalent GPU rig (~$30k+)**. A single **24 GB** GPU works only for lighter prompts with higher latency. Use the **largest / full-size model variant you can run**; aggressively quantized or “small” checkpoints raise prompt-injection risk (see [Security](/gateway/security)).

If you want the lowest-friction local setup, start with [Ollama](/providers/ollama) and `openclaw onboard`. This page is the opinionated guide for higher-end local stacks and custom OpenAI-compatible local servers.

## Recommended: LM Studio + MiniMax M2.5 (Responses API, full-size)

Best current local stack. Load MiniMax M2.5 in LM Studio, enable the local server (default `http://127.0.0.1:1234`), and use the Responses API to keep reasoning separate from final text.
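Before pointing OpenClaw at LM Studio, it helps to confirm the local server is actually listening. A sketch, assuming LM Studio's default port and the standard OpenAI-compatible `/v1/models` listing path:

```shell
# Sketch: confirm the LM Studio local server is up.
# Assumes the default port 1234; /v1/models is the usual
# OpenAI-compatible model-listing endpoint.
server="http://127.0.0.1:1234"

if curl -fsS "$server/v1/models" >/dev/null 2>&1; then
  echo "LM Studio server responding at $server"
else
  echo "no server at $server -- enable the local server in LM Studio"
fi
```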
@@ -2084,8 +2084,21 @@ More context: [Models](/concepts/models).

### Can I use self-hosted models (llama.cpp, vLLM, Ollama)?

Yes. Ollama is the easiest path for local models.

Quickest setup:

1. Install Ollama from `https://ollama.com/download`
2. Pull a local model such as `ollama pull glm-4.7-flash`
3. If you want Ollama Cloud too, run `ollama signin`
4. Run `openclaw onboard` and choose `Ollama`
5. Pick `Local` or `Cloud + Local`

Notes:

- `Cloud + Local` gives you Ollama Cloud models plus your local Ollama models
- Cloud models such as `kimi-k2.5:cloud` do not need a local pull
- For manual switching, use `openclaw models list` and `openclaw models set ollama/<model>`
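The steps above can be sketched as one shell session. This is a sketch only: it guards on the CLIs being installed, and the model name follows the current suggested default, which may change between releases.

```shell
# Sketch of the quickest setup; runs each step only if the CLI exists.
local_model="glm-4.7-flash"

if command -v ollama >/dev/null 2>&1; then
  ollama pull "$local_model"   # step 2: local model
  ollama signin                # step 3: optional, for Ollama Cloud
else
  echo "install ollama first: https://ollama.com/download"
fi

if command -v openclaw >/dev/null 2>&1; then
  openclaw onboard             # step 4: choose Ollama, then Local or Cloud + Local
else
  echo "install openclaw first"
fi
```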

Security note: smaller or heavily quantized models are more vulnerable to prompt injection. We strongly recommend **large models** for any bot that can use tools.
|||||||
@@ -8,7 +8,7 @@ title: "Ollama"

# Ollama

Ollama is a local LLM runtime that makes it easy to run open-source models on your machine. OpenClaw integrates with Ollama's native API (`/api/chat`), supports streaming and tool calling, and can auto-discover local Ollama models when you opt in with `OLLAMA_API_KEY` (or an auth profile) and do not define an explicit `models.providers.ollama` entry.
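The native endpoint can be exercised by hand to see what OpenClaw talks to. A sketch assuming a local Ollama with `llama3.3` already pulled; it just prints a note if no server is running:

```shell
# Hand-test Ollama's native chat endpoint (the API OpenClaw integrates with).
endpoint="http://127.0.0.1:11434/api/chat"
payload='{"model":"llama3.3","messages":[{"role":"user","content":"Say hi in one word."}],"stream":false}'

curl -fsS "$endpoint" -d "$payload" \
  || echo "no local Ollama server; would POST to $endpoint"
```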

<Warning>
**Remote Ollama users**: Do not use the `/v1` OpenAI-compatible URL (`http://host:11434/v1`) with OpenClaw. This breaks tool calling and models may output raw tool JSON as plain text. Use the native Ollama API URL instead: `baseUrl: "http://host:11434"` (no `/v1`).
@@ -16,21 +16,40 @@ Ollama is a local LLM runtime that makes it easy to run open-source models on yo

## Quick start

1. Install Ollama: [https://ollama.com/download](https://ollama.com/download)

2. Pull a local model if you want local inference:

```bash
ollama pull glm-4.7-flash
# or
ollama pull gpt-oss:20b
# or
ollama pull llama3.3
```

3. If you want Ollama Cloud models too, sign in:

```bash
ollama signin
```

4. Run onboarding and choose `Ollama`:

```bash
openclaw onboard
```

- `Local`: local models only
- `Cloud + Local`: local models plus Ollama Cloud models
- Cloud models such as `kimi-k2.5:cloud`, `minimax-m2.5:cloud`, and `glm-5:cloud` do **not** require a local `ollama pull`

OpenClaw currently suggests:

- local default: `glm-4.7-flash`
- cloud defaults: `kimi-k2.5:cloud`, `minimax-m2.5:cloud`, `glm-5:cloud`

5. If you prefer manual setup, enable Ollama for OpenClaw directly (any value works; Ollama doesn't require a real key):

```bash
# Set environment variable
@@ -40,13 +59,20 @@ export OLLAMA_API_KEY="ollama-local"
openclaw config set models.providers.ollama.apiKey "ollama-local"
```

6. Inspect or switch models:

```bash
openclaw models list
openclaw models set ollama/glm-4.7-flash
```

7. Or set the default in config:

```json5
{
  agents: {
    defaults: {
      model: { primary: "ollama/glm-4.7-flash" },
    },
  },
}
@@ -56,14 +82,13 @@ openclaw config set models.providers.ollama.apiKey "ollama-local"

When you set `OLLAMA_API_KEY` (or an auth profile) and **do not** define `models.providers.ollama`, OpenClaw discovers models from the local Ollama instance at `http://127.0.0.1:11434`:

- Queries `/api/tags`
- Uses best-effort `/api/show` lookups to read `contextWindow` when available
- Marks `reasoning` with a model-name heuristic (`r1`, `reasoning`, `think`)
- Sets `maxTokens` to the default Ollama max-token cap used by OpenClaw
- Sets all costs to `0`

This avoids manual model entries while keeping the catalog aligned with the local Ollama instance.
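The discovery calls above can be replayed by hand against a local instance. A sketch assuming the default port:

```shell
# Replay OpenClaw's discovery calls manually.
base="http://127.0.0.1:11434"

# Discovery source: the list of installed models.
curl -fsS "$base/api/tags" || echo "no local Ollama at $base"

# Best-effort per-model detail (context length, capabilities) for one model:
curl -fsS "$base/api/show" -d '{"model": "llama3.3"}' \
  || echo "show lookup skipped"
```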

To see what models are available:
@@ -98,7 +123,7 @@ Use explicit config when:

- Ollama runs on another host/port.
- You want to force specific context windows or model lists.
- You want fully manual model definitions.

```json5
{
@@ -170,7 +195,7 @@ Once configured, all your Ollama models are available:

### Reasoning models

OpenClaw treats models with names such as `deepseek-r1`, `reasoning`, or `think` as reasoning-capable by default:

```bash
ollama pull deepseek-r1:32b
@@ -230,7 +255,7 @@ When `api: "openai-completions"` is used with Ollama, OpenClaw injects `options.

### Context windows

For auto-discovered models, OpenClaw uses the context window reported by Ollama when available; otherwise it falls back to the default Ollama context window used by OpenClaw. You can override `contextWindow` and `maxTokens` in explicit provider config.
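As a sketch of such an override: only `baseUrl`, `contextWindow`, and `maxTokens` are named on this page, so the surrounding field names below are assumptions rather than the documented schema.

```json5
// Hypothetical explicit-config sketch; fields other than baseUrl,
// contextWindow, and maxTokens are assumptions, not the documented schema.
{
  models: {
    providers: {
      ollama: {
        baseUrl: "http://127.0.0.1:11434",
        models: [
          {
            id: "llama3.3",
            contextWindow: 32768,
            maxTokens: 8192,
          },
        ],
      },
    },
  },
}
```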

## Troubleshooting
@@ -250,16 +275,17 @@ curl http://localhost:11434/api/tags

### No models available

If your model is not listed, either:

- Pull the model locally, or
- Define the model explicitly in `models.providers.ollama`.

To add models:

```bash
ollama list                 # See what's installed
ollama pull glm-4.7-flash
ollama pull gpt-oss:20b
ollama pull llama3.3        # Or another model
```