From 5e324cf7854ad5046f189e327e46843f2266ed46 Mon Sep 17 00:00:00 2001
From: Peter Steinberger
Date: Wed, 11 Mar 2026 20:08:45 +0000
Subject: [PATCH] docs(ollama): align onboarding guidance with code

---
 docs/concepts/model-providers.md |  4 +-
 docs/gateway/local-models.md     |  2 +
 docs/help/faq.md                 | 17 +++++++-
 docs/providers/ollama.md         | 70 ++++++++++++++++++++++----------
 4 files changed, 67 insertions(+), 26 deletions(-)

diff --git a/docs/concepts/model-providers.md b/docs/concepts/model-providers.md
index 4f3d80b2420..549875c77b4 100644
--- a/docs/concepts/model-providers.md
+++ b/docs/concepts/model-providers.md
@@ -357,7 +357,7 @@ Ollama is a local LLM runtime that provides an OpenAI-compatible API:
 - Provider: `ollama`
 - Auth: None required (local server)
 - Example model: `ollama/llama3.3`
-- Installation: [https://ollama.ai](https://ollama.ai)
+- Installation: [https://ollama.com/download](https://ollama.com/download)

 ```bash
 # Install Ollama, then pull a model:
@@ -372,7 +372,7 @@ ollama pull llama3.3
 }
 ```

-Ollama is automatically detected when running locally at `http://127.0.0.1:11434/v1`. See [/providers/ollama](/providers/ollama) for model recommendations and custom configuration.
+Ollama is detected locally at `http://127.0.0.1:11434` when you opt in with `OLLAMA_API_KEY`, and `openclaw onboard` can configure it directly as a first-class provider. See [/providers/ollama](/providers/ollama) for onboarding, cloud/local mode, and custom configuration.

 ### vLLM

diff --git a/docs/gateway/local-models.md b/docs/gateway/local-models.md
index 8a07a827467..4059f988776 100644
--- a/docs/gateway/local-models.md
+++ b/docs/gateway/local-models.md
@@ -11,6 +11,8 @@ title: "Local Models"

 Local is doable, but OpenClaw expects large context + strong defenses against prompt injection. Small cards truncate context and leak safety. Aim high: **≥2 maxed-out Mac Studios or equivalent GPU rig (~$30k+)**. A single **24 GB** GPU works only for lighter prompts with higher latency. Use the **largest / full-size model variant you can run**; aggressively quantized or “small” checkpoints raise prompt-injection risk (see [Security](/gateway/security)).

+If you want the lowest-friction local setup, start with [Ollama](/providers/ollama) and `openclaw onboard`. This page is the opinionated guide for higher-end local stacks and custom OpenAI-compatible local servers.
+
 ## Recommended: LM Studio + MiniMax M2.5 (Responses API, full-size)

 Best current local stack. Load MiniMax M2.5 in LM Studio, enable the local server (default `http://127.0.0.1:1234`), and use Responses API to keep reasoning separate from final text.
diff --git a/docs/help/faq.md b/docs/help/faq.md
index 8b738b60fc2..453688c1c5f 100644
--- a/docs/help/faq.md
+++ b/docs/help/faq.md
@@ -2084,8 +2084,21 @@ More context: [Models](/concepts/models).

 ### Can I use selfhosted models llamacpp vLLM Ollama

-Yes. If your local server exposes an OpenAI-compatible API, you can point a
-custom provider at it. Ollama is supported directly and is the easiest path.
+Yes. Ollama is the easiest path for local models.
+
+Quickest setup:
+
+1. Install Ollama from `https://ollama.com/download`
+2. Pull a local model such as `ollama pull glm-4.7-flash`
+3. If you want Ollama Cloud too, run `ollama signin`
+4. Run `openclaw onboard` and choose `Ollama`
+5. Pick `Local` or `Cloud + Local`
+
+Notes:
+
+- `Cloud + Local` gives you Ollama Cloud models plus your local Ollama models
+- cloud models such as `kimi-k2.5:cloud` do not need a local pull
+- for manual switching, use `openclaw models list` and `openclaw models set ollama/`

 Security note: smaller or heavily quantized models are more vulnerable to prompt
 injection. We strongly recommend **large models** for any bot that can use tools.
diff --git a/docs/providers/ollama.md b/docs/providers/ollama.md
index b82f6411b68..abc41361ed0 100644
--- a/docs/providers/ollama.md
+++ b/docs/providers/ollama.md
@@ -8,7 +8,7 @@ title: "Ollama"

 # Ollama

-Ollama is a local LLM runtime that makes it easy to run open-source models on your machine. OpenClaw integrates with Ollama's native API (`/api/chat`), supporting streaming and tool calling, and can **auto-discover tool-capable models** when you opt in with `OLLAMA_API_KEY` (or an auth profile) and do not define an explicit `models.providers.ollama` entry.
+Ollama is a local LLM runtime that makes it easy to run open-source models on your machine. OpenClaw integrates with Ollama's native API (`/api/chat`), supports streaming and tool calling, and can auto-discover local Ollama models when you opt in with `OLLAMA_API_KEY` (or an auth profile) and do not define an explicit `models.providers.ollama` entry.

 **Remote Ollama users**: Do not use the `/v1` OpenAI-compatible URL (`http://host:11434/v1`) with OpenClaw. This breaks tool calling and models may output raw tool JSON as plain text. Use the native Ollama API URL instead: `baseUrl: "http://host:11434"` (no `/v1`).
@@ -16,21 +16,40 @@ Ollama is a local LLM runtime that makes it easy to run open-source models on yo

 ## Quick start

-1. Install Ollama: [https://ollama.ai](https://ollama.ai)
+1. Install Ollama: [https://ollama.com/download](https://ollama.com/download)

-2. Pull a model:
+2. Pull a local model if you want local inference:

 ```bash
+ollama pull glm-4.7-flash
+# or
 ollama pull gpt-oss:20b
 # or
 ollama pull llama3.3
-# or
-ollama pull qwen2.5-coder:32b
-# or
-ollama pull deepseek-r1:32b
 ```

-3. Enable Ollama for OpenClaw (any value works; Ollama doesn't require a real key):
+3. If you want Ollama Cloud models too, sign in:
+
+```bash
+ollama signin
+```
+
+4. Run onboarding and choose `Ollama`:
+
+```bash
+openclaw onboard
+```
+
+- `Local`: local models only
+- `Cloud + Local`: local models plus Ollama Cloud models
+- Cloud models such as `kimi-k2.5:cloud`, `minimax-m2.5:cloud`, and `glm-5:cloud` do **not** require a local `ollama pull`
+
+OpenClaw currently suggests:
+
+- local default: `glm-4.7-flash`
+- cloud defaults: `kimi-k2.5:cloud`, `minimax-m2.5:cloud`, `glm-5:cloud`
+
+5. If you prefer manual setup, enable Ollama for OpenClaw directly (any value works; Ollama doesn't require a real key):

 ```bash
 # Set environment variable
@@ -40,13 +59,20 @@ export OLLAMA_API_KEY="ollama-local"
 openclaw config set models.providers.ollama.apiKey "ollama-local"
 ```

-4. Use Ollama models:
+6. Inspect or switch models:
+
+```bash
+openclaw models list
+openclaw models set ollama/glm-4.7-flash
+```
+
+7. Or set the default in config:

 ```json5
 {
   agents: {
     defaults: {
-      model: { primary: "ollama/gpt-oss:20b" },
+      model: { primary: "ollama/glm-4.7-flash" },
     },
   },
 }
@@ -56,14 +82,13 @@ openclaw config set models.providers.ollama.apiKey "ollama-local"

 When you set `OLLAMA_API_KEY` (or an auth profile) and **do not** define `models.providers.ollama`, OpenClaw discovers models from the local Ollama instance at `http://127.0.0.1:11434`:

-- Queries `/api/tags` and `/api/show`
-- Keeps only models that report `tools` capability
-- Marks `reasoning` when the model reports `thinking`
-- Reads `contextWindow` from `model_info[".context_length"]` when available
-- Sets `maxTokens` to 10× the context window
+- Queries `/api/tags`
+- Uses best-effort `/api/show` lookups to read `contextWindow` when available
+- Marks `reasoning` with a model-name heuristic (`r1`, `reasoning`, `think`)
+- Sets `maxTokens` to the default Ollama max-token cap used by OpenClaw
 - Sets all costs to `0`

-This avoids manual model entries while keeping the catalog aligned with Ollama's capabilities.
+This avoids manual model entries while keeping the catalog aligned with the local Ollama instance.

 To see what models are available:

@@ -98,7 +123,7 @@ Use explicit config when:

 - Ollama runs on another host/port.
 - You want to force specific context windows or model lists.
-- You want to include models that do not report tool support.
+- You want fully manual model definitions.

 ```json5
 {
@@ -170,7 +195,7 @@ Once configured, all your Ollama models are available:

 ### Reasoning models

-OpenClaw marks models as reasoning-capable when Ollama reports `thinking` in `/api/show`:
+OpenClaw treats models with names such as `deepseek-r1`, `reasoning`, or `think` as reasoning-capable by default:

 ```bash
 ollama pull deepseek-r1:32b
@@ -230,7 +255,7 @@ When `api: "openai-completions"` is used with Ollama, OpenClaw injects `options.

 ### Context windows

-For auto-discovered models, OpenClaw uses the context window reported by Ollama when available, otherwise it defaults to `8192`. You can override `contextWindow` and `maxTokens` in explicit provider config.
+For auto-discovered models, OpenClaw uses the context window reported by Ollama when available, otherwise it falls back to the default Ollama context window used by OpenClaw. You can override `contextWindow` and `maxTokens` in explicit provider config.

 ## Troubleshooting

@@ -250,16 +275,17 @@ curl http://localhost:11434/api/tags

 ### No models available

-OpenClaw only auto-discovers models that report tool support. If your model isn't listed, either:
+If your model is not listed, either:

-- Pull a tool-capable model, or
+- Pull the model locally, or
 - Define the model explicitly in `models.providers.ollama`.

 To add models:

 ```bash
 ollama list # See what's installed
-ollama pull gpt-oss:20b # Pull a tool-capable model
+ollama pull glm-4.7-flash
+ollama pull gpt-oss:20b
 ollama pull llama3.3 # Or another model
 ```
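
The updated discovery notes say `reasoning` is now marked by a model-name heuristic (`r1`, `reasoning`, `think`) rather than by what `/api/show` reports. As a rough illustration of what such a check might look like, here is a minimal shell sketch. The function name `is_reasoning_model` is hypothetical, and both the plain substring match and the case-insensitive comparison are assumptions; the actual matching rules in OpenClaw may differ.

```shell
#!/bin/sh
# Hypothetical helper, not part of OpenClaw or Ollama.
# Prints "yes" when the model name contains one of the substrings listed in
# the docs (r1, reasoning, think), and "no" otherwise.
# The body runs in a subshell so the temporary variable does not leak.
is_reasoning_model() (
  # Lowercase the name first; ignoring case is an assumption of this sketch.
  name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$name" in
    *r1*|*reasoning*|*think*) echo "yes" ;;
    *) echo "no" ;;
  esac
)

is_reasoning_model "deepseek-r1:32b"   # prints "yes"
is_reasoning_model "llama3.3"          # prints "no"
```

A substring match this loose would also flag any name that merely happens to contain `r1`, which is one reason the docs keep explicit `models.providers.ollama` entries available for fully manual model definitions.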