fix: retain local memory runtime deps

Peter Steinberger
2026-04-30 15:22:26 +01:00
parent 9d037d2f5a
commit ac599c9e53
8 changed files with 155 additions and 16 deletions


@@ -33,8 +33,9 @@ For multi-endpoint setups, `provider` can also be a custom
 `models.providers.<id>` entry, such as `ollama-5080`, when that provider sets
 `api: "ollama"` or another embedding adapter owner.
-For local embeddings with no API key, install the optional `node-llama-cpp`
-runtime package next to OpenClaw and use `provider: "local"`.
+For local embeddings with no API key, set `provider: "local"`. Packaged
+installs retain the native `node-llama-cpp` runtime in OpenClaw's managed plugin
+runtime-deps tree; run `openclaw doctor --fix` if that tree needs repair.
 Some OpenAI-compatible embedding endpoints require asymmetric labels such as
 `input_type: "query"` for searches and `input_type: "document"` or `"passage"`
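The local-embedding setup described in this hunk can be sketched as a config fragment. This is a hypothetical sketch: the file location and the surrounding `memory.embedding` nesting are assumptions, while the `provider`, `local.contextSize`, and `local.modelCacheDir` keys come from the settings this commit documents.

```json5
// Hypothetical OpenClaw config fragment — outer nesting is assumed;
// only the embedding keys themselves are documented in this commit.
{
  memory: {
    embedding: {
      provider: "local",            // local embeddings, no API key required
      local: {
        contextSize: 4096,          // default; lower on constrained hosts
        modelCacheDir: "~/models",  // optional override of the node-llama-cpp default
      },
    },
  },
}
```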


@@ -284,7 +284,7 @@ For custom OpenAI-compatible endpoints or overriding provider defaults:
 | `local.modelCacheDir` | `string` | node-llama-cpp default | Cache dir for downloaded models |
 | `local.contextSize` | `number \| "auto"` | `4096` | Context window size for the embedding context. 4096 covers typical chunks (128–512 tokens) while bounding non-weight VRAM. Lower to 1024–2048 on constrained hosts. `"auto"` uses the model's trained maximum — not recommended for 8B+ models (Qwen3-Embedding-8B: 40,960 tokens → ~32 GB VRAM vs ~8.8 GB at 4096). |
-Default model: `embeddinggemma-300m-qat-Q8_0.gguf` (~0.6 GB, auto-downloaded). Requires native build: `pnpm approve-builds` then `pnpm rebuild node-llama-cpp`.
+Default model: `embeddinggemma-300m-qat-Q8_0.gguf` (~0.6 GB, auto-downloaded). Packaged installs repair the native `node-llama-cpp` runtime through managed plugin runtime deps when `provider: "local"` is configured. Source checkouts still require native build approval: `pnpm approve-builds` then `pnpm rebuild node-llama-cpp`.
 Use the standalone CLI to verify the same provider path the Gateway uses: