mirror of https://github.com/openclaw/openclaw.git (synced 2026-05-06 12:30:44 +00:00)
feat: LM Studio Integration (#53248)
* Feat: LM Studio Integration
* Format
* Support usage when streaming is true; fix token count
* Add custom window check
* Drop max tokens fallback
* Tweak docs; update generated
* Avoid error if stale header does not resolve
* Fix test
* Fix test
* Fix rebase issues; trim code
* Fix tests; drop keyless; fixes
* Fix linter issues in tests
* Update generated artifacts
* Do not make header resolution fatal for discovery
* Do the same for the API key as well
* fix: honor lmstudio preload runtime auth
* fix: clear stale lmstudio header auth
* fix: lazy-load lmstudio runtime facade
* fix: preserve lmstudio shared synthetic auth
* fix: clear stale lmstudio header auth in discovery
* fix: prefer lmstudio header auth for discovery
* fix: honor lmstudio header auth in warmup paths
* fix: clear stale lmstudio profile auth
* fix: ignore lmstudio env auth on header migration
* fix: use local lmstudio setup seam
* fix: resolve lmstudio rebase fallout

---------

Co-authored-by: Frank Yang <frank.ekn@gmail.com>
@@ -1,4 +1,4 @@
-5f7ad1520f965f8b4b59b8f3e9733757d4b996ea5dfa40aca279dceeafb8aed7 config-baseline.json
-9bf857e53f27d22eb4d8b22e6407e31c260c797047fdca07b5d95498a712662c config-baseline.core.json
-3bb312dc9c39a374ca92613abf21606c25dc571287a3941dac71ff57b2b5c519 config-baseline.channel.json
-aa4b1d3d04ed9f9feea73c8fca36c48a54749853e07fadfca54773171b2ef4ff config-baseline.plugin.json
+8ae6f2aaa659fa6008b05deb09240c1d261830b151b15664dea9834f3b99c4ed config-baseline.json
+d5f53e95eec6332d59889858d6898dddd8a73a5e4cabe22fc49d893a8e15d6a3 config-baseline.core.json
+e1f94346a8507ce3dec763b598e79f3bb89ff2e33189ce977cc87d3b05e71c1d config-baseline.channel.json
+2aaeb7a54022481b17ee2b460bce08f4933f1f5301f17cdb8a513cef8a15f667 config-baseline.plugin.json
@@ -1,2 +1,2 @@
-ec0d47ca6df1d840719e6692a43cd2187603dc690fb0e8887fde760a4273b1c8 plugin-sdk-api-baseline.json
-c0fc79136e9e90978feb613dc100ef17144cfa1c8451612f8e9a0583f7b7d902 plugin-sdk-api-baseline.jsonl
+4fcfbafe5aadb6d1f170de50f0897ac35c13a5a5bf425a893d5ff94fae3a6c5f plugin-sdk-api-baseline.json
+994b6e32f8f48c7f16b581e9533e1f2a5b03ce8fa0cce75a2ea0d4543a275f7a plugin-sdk-api-baseline.jsonl
@@ -43,6 +43,17 @@ openclaw onboard --non-interactive \

`--custom-api-key` is optional in non-interactive mode. If omitted, onboarding checks `CUSTOM_API_KEY`.
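For scripted setups you can rely on the environment instead of the flag. A minimal sketch (the value is a placeholder; only the env var name comes from the docs above):

```bash
# Omit --custom-api-key and let non-interactive onboarding
# read the key from CUSTOM_API_KEY instead.
export CUSTOM_API_KEY="your-provider-api-key"
```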

LM Studio also supports a provider-specific key flag in non-interactive mode:

```bash
openclaw onboard --non-interactive \
  --auth-choice lmstudio \
  --custom-base-url "http://localhost:1234/v1" \
  --custom-model-id "qwen/qwen3.5-9b" \
  --lmstudio-api-key "$LM_API_TOKEN" \
  --accept-risk
```

Non-interactive Ollama:
@@ -672,6 +672,28 @@ Plugin-owned capability split:

- Image understanding is plugin-owned `MiniMax-VL-01` on both MiniMax auth paths
- Web search stays on provider id `minimax`

### LM Studio

LM Studio ships as a bundled provider plugin that uses LM Studio's native API:

- Provider: `lmstudio`
- Auth: `LM_API_TOKEN`
- Default inference base URL: `http://localhost:1234/v1`

Then set a model (replace with one of the IDs returned by `http://localhost:1234/api/v1/models`):

```json5
{
  agents: {
    defaults: { model: { primary: "lmstudio/openai/gpt-oss-20b" } },
  },
}
```

OpenClaw uses LM Studio's native `/api/v1/models` and `/api/v1/models/load` for discovery and auto-load, and `/v1/chat/completions` for inference by default. See [/providers/lmstudio](/providers/lmstudio) for setup and troubleshooting.
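To see what discovery will find, you can hit the models endpoint directly (a quick sketch, assuming the default port):

```bash
# Each entry's `key` is the model id you prefix with `lmstudio/`
# to form an OpenClaw model ref.
curl -s http://localhost:1234/api/v1/models
```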

### Ollama

Ollama ships as a bundled provider plugin and uses Ollama's native API:
@@ -770,7 +792,7 @@ Example (OpenAI‑compatible):
   providers: {
     lmstudio: {
       baseUrl: "http://localhost:1234/v1",
-      apiKey: "LMSTUDIO_KEY",
+      apiKey: "${LM_API_TOKEN}",
       api: "openai-completions",
       models: [
         {
@@ -1264,6 +1264,7 @@
 "providers/inferrs",
 "providers/kilocode",
 "providers/litellm",
+"providers/lmstudio",
 "providers/minimax",
 "providers/mistral",
 "providers/moonshot",
@@ -11,7 +11,7 @@ title: "Local Models"

 Local is doable, but OpenClaw expects large context + strong defenses against prompt injection. Small cards truncate context and leak safety. Aim high: **≥2 maxed-out Mac Studios or equivalent GPU rig (~$30k+)**. A single **24 GB** GPU works only for lighter prompts with higher latency. Use the **largest / full-size model variant you can run**; aggressively quantized or “small” checkpoints raise prompt-injection risk (see [Security](/gateway/security)).

-If you want the lowest-friction local setup, start with [Ollama](/providers/ollama) and `openclaw onboard`. This page is the opinionated guide for higher-end local stacks and custom OpenAI-compatible local servers.
+If you want the lowest-friction local setup, start with [LM Studio](/providers/lmstudio) or [Ollama](/providers/ollama) and `openclaw onboard`. This page is the opinionated guide for higher-end local stacks and custom OpenAI-compatible local servers.

 ## Recommended: LM Studio + large local model (Responses API)
@@ -45,6 +45,7 @@ Looking for chat channel docs (WhatsApp/Telegram/Discord/Slack/Mattermost (plugi
 - [inferrs (local models)](/providers/inferrs)
 - [Kilocode](/providers/kilocode)
 - [LiteLLM (unified gateway)](/providers/litellm)
+- [LM Studio (local models)](/providers/lmstudio)
 - [MiniMax](/providers/minimax)
 - [Mistral](/providers/mistral)
 - [Moonshot AI (Kimi + Kimi Coding)](/providers/moonshot)
docs/providers/lmstudio.md (new file, 159 lines)
@@ -0,0 +1,159 @@
---
summary: "Run OpenClaw with LM Studio"
read_when:
  - You want to run OpenClaw with open source models via LM Studio
  - You want to set up and configure LM Studio
title: "LM Studio"
---

# LM Studio

LM Studio is a friendly yet powerful app for running open-weight models on your own hardware. It runs llama.cpp (GGUF) and MLX (Apple Silicon) models, and comes as a GUI app or a headless daemon (`llmster`). For product and setup docs, see [lmstudio.ai](https://lmstudio.ai/).

## Quick start

1. Install LM Studio (desktop) or `llmster` (headless):

   ```bash
   curl -fsSL https://lmstudio.ai/install.sh | bash
   ```

2. Start the server. Either launch the desktop app, or bring up the headless daemon:

   ```bash
   lms daemon up
   ```

   Then start the local server:

   ```bash
   lms server start --port 1234
   ```

   If you are using the app, make sure JIT loading is enabled for a smooth experience. Learn more in the [LM Studio JIT and TTL guide](https://lmstudio.ai/docs/developer/core/ttl-and-auto-evict).
3. OpenClaw requires an LM Studio token value. Set `LM_API_TOKEN`:

   ```bash
   export LM_API_TOKEN="your-lm-studio-api-token"
   ```

   If LM Studio authentication is disabled, use any non-empty token value:

   ```bash
   export LM_API_TOKEN="placeholder-key"
   ```

   For LM Studio auth setup details, see [LM Studio Authentication](https://lmstudio.ai/docs/developer/core/authentication).

4. Run onboarding and choose `LM Studio`:

   ```bash
   openclaw onboard
   ```
5. In onboarding, use the `Default model` prompt to pick your LM Studio model.

   You can also set or change it later:

   ```bash
   openclaw models set lmstudio/qwen/qwen3.5-9b
   ```

LM Studio model keys follow an `author/model-name` format (e.g. `qwen/qwen3.5-9b`). OpenClaw model refs prepend the provider name: `lmstudio/qwen/qwen3.5-9b`. You can find the exact key for a model by running `curl http://localhost:1234/api/v1/models` and looking at the `key` field.
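For example, to print each key next to the OpenClaw ref it maps to (a sketch; the `jq` filter assumes the response lists models under a `data` array, which may vary by LM Studio version):

```bash
# Print "key -> OpenClaw ref" for every model LM Studio reports.
curl -s http://localhost:1234/api/v1/models \
  | jq -r '.data[].key | "\(.) -> lmstudio/\(.)"'
```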
## Non-interactive onboarding

Use non-interactive onboarding when you want to script setup (CI, provisioning, remote bootstrap):

```bash
openclaw onboard \
  --non-interactive \
  --accept-risk \
  --auth-choice lmstudio
```
Or specify a base URL and model along with the API key:

```bash
openclaw onboard \
  --non-interactive \
  --accept-risk \
  --auth-choice lmstudio \
  --custom-base-url http://localhost:1234/v1 \
  --lmstudio-api-key "$LM_API_TOKEN" \
  --custom-model-id qwen/qwen3.5-9b
```

`--custom-model-id` takes the model key as returned by LM Studio (e.g. `qwen/qwen3.5-9b`), without the `lmstudio/` provider prefix.

Non-interactive onboarding requires `--lmstudio-api-key` (or `LM_API_TOKEN` in the environment). For unauthenticated LM Studio servers, any non-empty token value works.

`--custom-api-key` remains supported for compatibility, but `--lmstudio-api-key` is preferred for LM Studio.
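An environment-driven variant, for completeness (a sketch using only the flags documented above; the token value is a placeholder suitable for an unauthenticated server):

```bash
# LM_API_TOKEN is picked up from the environment, so no key flag is needed.
export LM_API_TOKEN="placeholder-key"
openclaw onboard \
  --non-interactive \
  --accept-risk \
  --auth-choice lmstudio
```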

This writes `models.providers.lmstudio`, sets the default model to `lmstudio/<custom-model-id>`, and writes the `lmstudio:default` auth profile.

Interactive setup can also prompt for an optional preferred load context length, which it applies across the discovered LM Studio models it saves into config.

## Configuration

### Explicit configuration
```json5
{
  models: {
    providers: {
      lmstudio: {
        baseUrl: "http://localhost:1234/v1",
        apiKey: "${LM_API_TOKEN}",
        api: "openai-completions",
        models: [
          {
            id: "qwen/qwen3-coder-next",
            name: "Qwen 3 Coder Next",
            reasoning: false,
            input: ["text"],
            cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
            contextWindow: 128000,
            maxTokens: 8192,
          },
        ],
      },
    },
  },
}
```
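Once saved, you can make this entry the default with the `openclaw models set` command shown earlier:

```bash
openclaw models set lmstudio/qwen/qwen3-coder-next
```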

## Troubleshooting

### LM Studio not detected

Make sure LM Studio is running and that you set `LM_API_TOKEN` (for unauthenticated servers, any non-empty token value works):

```bash
# Start via desktop app, or headless:
lms server start --port 1234
```

Verify the API is accessible:

```bash
curl http://localhost:1234/api/v1/models
```
### Authentication errors (HTTP 401)

If setup reports HTTP 401, verify your API key:

- Check that `LM_API_TOKEN` matches the key configured in LM Studio (a quick check is sketched below).
- For LM Studio auth setup details, see [LM Studio Authentication](https://lmstudio.ai/docs/developer/core/authentication).
- If your server does not require authentication, use any non-empty token value for `LM_API_TOKEN`.
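A minimal sketch of that check, assuming the token travels as a standard `Authorization: Bearer` header (see the authentication guide above for LM Studio's exact scheme):

```bash
# Prints 200 when the token is accepted, 401 when it is rejected.
curl -s -o /dev/null -w "%{http_code}\n" \
  -H "Authorization: Bearer $LM_API_TOKEN" \
  http://localhost:1234/api/v1/models
```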

### Just-in-time model loading

LM Studio supports just-in-time (JIT) model loading, where models are loaded on first request. Make sure it is enabled to avoid "Model not loaded" errors.
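To confirm JIT loading works end to end, you can send a single request to the OpenAI-compatible inference endpoint mentioned in the provider docs above (a sketch; the model key and Bearer auth header are illustrative):

```bash
# With JIT enabled, LM Studio loads the model on demand
# instead of failing with "Model not loaded".
curl -s http://localhost:1234/v1/chat/completions \
  -H "Authorization: Bearer $LM_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"model": "qwen/qwen3.5-9b", "messages": [{"role": "user", "content": "hi"}]}'
```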
@@ -113,6 +113,7 @@ Semantic memory search uses **embedding APIs** when configured for remote providers:
 - `memorySearch.provider = "gemini"` → Gemini embeddings
 - `memorySearch.provider = "voyage"` → Voyage embeddings
 - `memorySearch.provider = "mistral"` → Mistral embeddings
+- `memorySearch.provider = "lmstudio"` → LM Studio embeddings (local/self-hosted)
 - `memorySearch.provider = "ollama"` → Ollama embeddings (local/self-hosted; typically no hosted API billing)
 - Optional fallback to a remote provider if local embeddings fail
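The added line corresponds to a config entry like this minimal sketch (assuming the `memorySearch` block sits at the top level of the same JSON5 config; embedding-model and fallback fields are omitted):

```json5
{
  // Route semantic memory search embeddings through LM Studio.
  memorySearch: {
    provider: "lmstudio",
  },
}
```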