mirror of https://github.com/openclaw/openclaw.git (synced 2026-05-06 12:30:44 +00:00)
feat: LM Studio Integration (#53248)
* Feat: LM Studio Integration
* Format
* Support usage when streaming is true; fix token count
* Add custom window check
* Drop max tokens fallback
* Tweak docs; update generated
* Avoid error if stale header does not resolve
* Fix test
* Fix test
* Fix rebase issues; trim code
* Fix tests; drop keyless; fixes
* Fix linter issues in tests
* Update generated artifacts
* Do not make header resolution fatal for discovery
* Do the same for the API key as well
* fix: honor lmstudio preload runtime auth
* fix: clear stale lmstudio header auth
* fix: lazy-load lmstudio runtime facade
* fix: preserve lmstudio shared synthetic auth
* fix: clear stale lmstudio header auth in discovery
* fix: prefer lmstudio header auth for discovery
* fix: honor lmstudio header auth in warmup paths
* fix: clear stale lmstudio profile auth
* fix: ignore lmstudio env auth on header migration
* fix: use local lmstudio setup seam
* fix: resolve lmstudio rebase fallout

---------

Co-authored-by: Frank Yang <frank.ekn@gmail.com>
@@ -1,4 +1,4 @@
-5f7ad1520f965f8b4b59b8f3e9733757d4b996ea5dfa40aca279dceeafb8aed7 config-baseline.json
-9bf857e53f27d22eb4d8b22e6407e31c260c797047fdca07b5d95498a712662c config-baseline.core.json
-3bb312dc9c39a374ca92613abf21606c25dc571287a3941dac71ff57b2b5c519 config-baseline.channel.json
-aa4b1d3d04ed9f9feea73c8fca36c48a54749853e07fadfca54773171b2ef4ff config-baseline.plugin.json
+8ae6f2aaa659fa6008b05deb09240c1d261830b151b15664dea9834f3b99c4ed config-baseline.json
+d5f53e95eec6332d59889858d6898dddd8a73a5e4cabe22fc49d893a8e15d6a3 config-baseline.core.json
+e1f94346a8507ce3dec763b598e79f3bb89ff2e33189ce977cc87d3b05e71c1d config-baseline.channel.json
+2aaeb7a54022481b17ee2b460bce08f4933f1f5301f17cdb8a513cef8a15f667 config-baseline.plugin.json
@@ -1,2 +1,2 @@
-ec0d47ca6df1d840719e6692a43cd2187603dc690fb0e8887fde760a4273b1c8 plugin-sdk-api-baseline.json
-c0fc79136e9e90978feb613dc100ef17144cfa1c8451612f8e9a0583f7b7d902 plugin-sdk-api-baseline.jsonl
+4fcfbafe5aadb6d1f170de50f0897ac35c13a5a5bf425a893d5ff94fae3a6c5f plugin-sdk-api-baseline.json
+994b6e32f8f48c7f16b581e9533e1f2a5b03ce8fa0cce75a2ea0d4543a275f7a plugin-sdk-api-baseline.jsonl
@@ -43,6 +43,17 @@ openclaw onboard --non-interactive \

`--custom-api-key` is optional in non-interactive mode. If omitted, onboarding checks `CUSTOM_API_KEY`.
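For scripted setups you can rely on the environment instead of the flag. A minimal sketch (the value is a placeholder; only the env var name comes from the docs above):

```bash
# Omit --custom-api-key and let non-interactive onboarding
# read the key from CUSTOM_API_KEY instead.
export CUSTOM_API_KEY="your-provider-api-key"
```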

LM Studio also supports a provider-specific key flag in non-interactive mode:

```bash
openclaw onboard --non-interactive \
  --auth-choice lmstudio \
  --custom-base-url "http://localhost:1234/v1" \
  --custom-model-id "qwen/qwen3.5-9b" \
  --lmstudio-api-key "$LM_API_TOKEN" \
  --accept-risk
```

Non-interactive Ollama:
@@ -672,6 +672,28 @@ Plugin-owned capability split:

- Image understanding is plugin-owned `MiniMax-VL-01` on both MiniMax auth paths
- Web search stays on provider id `minimax`

### LM Studio

LM Studio ships as a bundled provider plugin that uses LM Studio's native API:

- Provider: `lmstudio`
- Auth: `LM_API_TOKEN`
- Default inference base URL: `http://localhost:1234/v1`

Then set a model (replace with one of the IDs returned by `http://localhost:1234/api/v1/models`):

```json5
{
  agents: {
    defaults: { model: { primary: "lmstudio/openai/gpt-oss-20b" } },
  },
}
```

OpenClaw uses LM Studio's native `/api/v1/models` and `/api/v1/models/load` for discovery and auto-load, and `/v1/chat/completions` for inference by default. See [/providers/lmstudio](/providers/lmstudio) for setup and troubleshooting.
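To see what discovery will find, you can hit the models endpoint directly (a quick sketch, assuming the default port):

```bash
# Each entry's `key` is the model id you prefix with `lmstudio/`
# to form an OpenClaw model ref.
curl -s http://localhost:1234/api/v1/models
```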

### Ollama

Ollama ships as a bundled provider plugin and uses Ollama's native API:
@@ -770,7 +792,7 @@ Example (OpenAI‑compatible):
   providers: {
     lmstudio: {
       baseUrl: "http://localhost:1234/v1",
-      apiKey: "LMSTUDIO_KEY",
+      apiKey: "${LM_API_TOKEN}",
       api: "openai-completions",
       models: [
         {
@@ -1264,6 +1264,7 @@
 "providers/inferrs",
 "providers/kilocode",
 "providers/litellm",
+"providers/lmstudio",
 "providers/minimax",
 "providers/mistral",
 "providers/moonshot",
@@ -11,7 +11,7 @@ title: "Local Models"

 Local is doable, but OpenClaw expects large context + strong defenses against prompt injection. Small cards truncate context and leak safety. Aim high: **≥2 maxed-out Mac Studios or equivalent GPU rig (~$30k+)**. A single **24 GB** GPU works only for lighter prompts with higher latency. Use the **largest / full-size model variant you can run**; aggressively quantized or “small” checkpoints raise prompt-injection risk (see [Security](/gateway/security)).

-If you want the lowest-friction local setup, start with [Ollama](/providers/ollama) and `openclaw onboard`. This page is the opinionated guide for higher-end local stacks and custom OpenAI-compatible local servers.
+If you want the lowest-friction local setup, start with [LM Studio](/providers/lmstudio) or [Ollama](/providers/ollama) and `openclaw onboard`. This page is the opinionated guide for higher-end local stacks and custom OpenAI-compatible local servers.

 ## Recommended: LM Studio + large local model (Responses API)
@@ -45,6 +45,7 @@ Looking for chat channel docs (WhatsApp/Telegram/Discord/Slack/Mattermost (plugi
 - [inferrs (local models)](/providers/inferrs)
 - [Kilocode](/providers/kilocode)
 - [LiteLLM (unified gateway)](/providers/litellm)
+- [LM Studio (local models)](/providers/lmstudio)
 - [MiniMax](/providers/minimax)
 - [Mistral](/providers/mistral)
 - [Moonshot AI (Kimi + Kimi Coding)](/providers/moonshot)
docs/providers/lmstudio.md (new file, 159 lines)
@@ -0,0 +1,159 @@
---
summary: "Run OpenClaw with LM Studio"
read_when:
  - You want to run OpenClaw with open source models via LM Studio
  - You want to set up and configure LM Studio
title: "LM Studio"
---

# LM Studio

LM Studio is a friendly yet powerful app for running open-weight models on your own hardware. It runs llama.cpp (GGUF) and MLX (Apple Silicon) models, and comes as a GUI app or a headless daemon (`llmster`). For product and setup docs, see [lmstudio.ai](https://lmstudio.ai/).

## Quick start

1. Install LM Studio (desktop) or `llmster` (headless):

   ```bash
   curl -fsSL https://lmstudio.ai/install.sh | bash
   ```

2. Start the server. Either launch the desktop app, or bring up the headless daemon:

   ```bash
   lms daemon up
   ```

   Then start the local server:

   ```bash
   lms server start --port 1234
   ```

   If you are using the app, make sure JIT loading is enabled for a smooth experience. Learn more in the [LM Studio JIT and TTL guide](https://lmstudio.ai/docs/developer/core/ttl-and-auto-evict).
3. OpenClaw requires an LM Studio token value. Set `LM_API_TOKEN`:

   ```bash
   export LM_API_TOKEN="your-lm-studio-api-token"
   ```

   If LM Studio authentication is disabled, use any non-empty token value:

   ```bash
   export LM_API_TOKEN="placeholder-key"
   ```

   For LM Studio auth setup details, see [LM Studio Authentication](https://lmstudio.ai/docs/developer/core/authentication).

4. Run onboarding and choose `LM Studio`:

   ```bash
   openclaw onboard
   ```
5. In onboarding, use the `Default model` prompt to pick your LM Studio model.

   You can also set or change it later:

   ```bash
   openclaw models set lmstudio/qwen/qwen3.5-9b
   ```

LM Studio model keys follow an `author/model-name` format (e.g. `qwen/qwen3.5-9b`). OpenClaw model refs prepend the provider name: `lmstudio/qwen/qwen3.5-9b`. You can find the exact key for a model by running `curl http://localhost:1234/api/v1/models` and looking at the `key` field.
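For example, to print each key next to the OpenClaw ref it maps to (a sketch; the `jq` filter assumes the response lists models under a `data` array, which may vary by LM Studio version):

```bash
# Print "key -> OpenClaw ref" for every model LM Studio reports.
curl -s http://localhost:1234/api/v1/models \
  | jq -r '.data[].key | "\(.) -> lmstudio/\(.)"'
```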
## Non-interactive onboarding

Use non-interactive onboarding when you want to script setup (CI, provisioning, remote bootstrap):

```bash
openclaw onboard \
  --non-interactive \
  --accept-risk \
  --auth-choice lmstudio
```
Or specify a base URL and model along with the API key:

```bash
openclaw onboard \
  --non-interactive \
  --accept-risk \
  --auth-choice lmstudio \
  --custom-base-url http://localhost:1234/v1 \
  --lmstudio-api-key "$LM_API_TOKEN" \
  --custom-model-id qwen/qwen3.5-9b
```

`--custom-model-id` takes the model key as returned by LM Studio (e.g. `qwen/qwen3.5-9b`), without the `lmstudio/` provider prefix.

Non-interactive onboarding requires `--lmstudio-api-key` (or `LM_API_TOKEN` in the environment). For unauthenticated LM Studio servers, any non-empty token value works.

`--custom-api-key` remains supported for compatibility, but `--lmstudio-api-key` is preferred for LM Studio.
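An environment-driven variant, for completeness (a sketch using only the flags documented above; the token value is a placeholder suitable for an unauthenticated server):

```bash
# LM_API_TOKEN is picked up from the environment, so no key flag is needed.
export LM_API_TOKEN="placeholder-key"
openclaw onboard \
  --non-interactive \
  --accept-risk \
  --auth-choice lmstudio
```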

This writes `models.providers.lmstudio`, sets the default model to `lmstudio/<custom-model-id>`, and writes the `lmstudio:default` auth profile.

Interactive setup can also prompt for an optional preferred load context length, which it applies across the discovered LM Studio models it saves into config.

## Configuration

### Explicit configuration
```json5
{
  models: {
    providers: {
      lmstudio: {
        baseUrl: "http://localhost:1234/v1",
        apiKey: "${LM_API_TOKEN}",
        api: "openai-completions",
        models: [
          {
            id: "qwen/qwen3-coder-next",
            name: "Qwen 3 Coder Next",
            reasoning: false,
            input: ["text"],
            cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
            contextWindow: 128000,
            maxTokens: 8192,
          },
        ],
      },
    },
  },
}
```
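Once saved, you can make this entry the default with the `openclaw models set` command shown earlier:

```bash
openclaw models set lmstudio/qwen/qwen3-coder-next
```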

## Troubleshooting

### LM Studio not detected

Make sure LM Studio is running and that you set `LM_API_TOKEN` (for unauthenticated servers, any non-empty token value works):

```bash
# Start via desktop app, or headless:
lms server start --port 1234
```

Verify the API is accessible:

```bash
curl http://localhost:1234/api/v1/models
```
### Authentication errors (HTTP 401)

If setup reports HTTP 401, verify your API key:

- Check that `LM_API_TOKEN` matches the key configured in LM Studio (a quick check is sketched below).
- For LM Studio auth setup details, see [LM Studio Authentication](https://lmstudio.ai/docs/developer/core/authentication).
- If your server does not require authentication, use any non-empty token value for `LM_API_TOKEN`.
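A minimal sketch of that check, assuming the token travels as a standard `Authorization: Bearer` header (see the authentication guide above for LM Studio's exact scheme):

```bash
# Prints 200 when the token is accepted, 401 when it is rejected.
curl -s -o /dev/null -w "%{http_code}\n" \
  -H "Authorization: Bearer $LM_API_TOKEN" \
  http://localhost:1234/api/v1/models
```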

### Just-in-time model loading

LM Studio supports just-in-time (JIT) model loading, where models are loaded on first request. Make sure it is enabled to avoid "Model not loaded" errors.
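To confirm JIT loading works end to end, you can send a single request to the OpenAI-compatible inference endpoint mentioned in the provider docs above (a sketch; the model key and Bearer auth header are illustrative):

```bash
# With JIT enabled, LM Studio loads the model on demand
# instead of failing with "Model not loaded".
curl -s http://localhost:1234/v1/chat/completions \
  -H "Authorization: Bearer $LM_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"model": "qwen/qwen3.5-9b", "messages": [{"role": "user", "content": "hi"}]}'
```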
@@ -113,6 +113,7 @@ Semantic memory search uses **embedding APIs** when configured for remote providers:
 - `memorySearch.provider = "gemini"` → Gemini embeddings
 - `memorySearch.provider = "voyage"` → Voyage embeddings
 - `memorySearch.provider = "mistral"` → Mistral embeddings
+- `memorySearch.provider = "lmstudio"` → LM Studio embeddings (local/self-hosted)
 - `memorySearch.provider = "ollama"` → Ollama embeddings (local/self-hosted; typically no hosted API billing)
 - Optional fallback to a remote provider if local embeddings fail
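The added line corresponds to a config entry like this minimal sketch (assuming the `memorySearch` block sits at the top level of the same JSON5 config; embedding-model and fallback fields are omitted):

```json5
{
  // Route semantic memory search embeddings through LM Studio.
  memorySearch: {
    provider: "lmstudio",
  },
}
```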