fix(memory): skip qmd vectors in lexical mode

2026-05-06 18:30:44 +00:00 · 2026-04-27 14:09:32 +01:00
parent 6a0dc3a9bc
commit b181930c23
6 changed files with 94 additions and 17 deletions
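
Note: the doc changes below describe one rule: `searchMode: "search"` is lexical/BM25-only, so it skips `qmd embed` and vector readiness probes, even under `memory status --deep`. The following is a minimal sketch of that rule, not the actual implementation. The type and function names (`QmdSearchMode`, `isLexicalOnly`, `maintenanceCommands`, `shouldProbeVectors`) are hypothetical and chosen only for illustration; only the mode names and the `qmd update` / `qmd embed` commands come from the docs.

```typescript
// Hypothetical sketch: these names are illustrative, not OpenClaw's real API.
type QmdSearchMode = "search" | "vsearch" | "query";

// "search" is BM25-only; "vsearch" and "query" rely on vectors and embeddings.
function isLexicalOnly(mode: QmdSearchMode): boolean {
  return mode === "search";
}

// Commands to run on boot and on each periodic refresh.
// Lexical mode only updates the index; semantic modes also embed.
function maintenanceCommands(mode: QmdSearchMode): string[] {
  return isLexicalOnly(mode) ? ["qmd update"] : ["qmd update", "qmd embed"];
}

// Whether a status check should probe vector/embedding readiness.
// Plain status never probes; --deep probes only for semantic modes.
function shouldProbeVectors(mode: QmdSearchMode, deep: boolean): boolean {
  return deep && !isLexicalOnly(mode);
}
```

Under these assumptions, `shouldProbeVectors("search", true)` is `false`, which matches the documented behavior that `--deep` does not trigger semantic probes in lexical mode, while `vsearch` and `query` still probe.
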
--- a/docs/cli/memory.md
+++ b/docs/cli/memory.md
@@ -51,7 +51,7 @@ openclaw memory index --agent main --verbose

 `memory status`:

- `--deep`: probe vector + embedding availability. Plain `memory status` stays fast and does not run a live embedding ping.
+- `--deep`: probe vector + embedding availability. Plain `memory status` stays fast and does not run a live embedding ping. QMD lexical `searchMode: "search"` skips semantic vector probes and embedding maintenance even with `--deep`.
 - `--index`: run a reindex if the store is dirty (implies `--deep`).
 - `--fix`: repair stale recall locks and normalize promotion metadata.
 - `--json`: print JSON output.
--- a/docs/concepts/memory-qmd.md
+++ b/docs/concepts/memory-qmd.md
@@ -51,13 +51,15 @@ present.
 ## How the sidecar works

 - OpenClaw creates collections from your workspace memory files and any
-  configured `memory.qmd.paths`, then runs `qmd update` + `qmd embed` on boot
-  and periodically (default every 5 minutes).
+  configured `memory.qmd.paths`, then runs `qmd update` on boot and
+  periodically (default every 5 minutes). Semantic modes also run `qmd embed`.
 - The default workspace collection tracks `MEMORY.md` plus the `memory/`
  tree. Lowercase `memory.md` is not indexed as a root memory file.
 - Boot refresh runs in the background so chat startup is not blocked.
 - Searches use the configured `searchMode` (default: `search`; also supports
-  `vsearch` and `query`). If a mode fails, OpenClaw retries with `qmd query`.
+  `vsearch` and `query`). `search` is BM25-only, so OpenClaw skips semantic
+  vector readiness probes and embedding maintenance in that mode. If a mode
+  fails, OpenClaw retries with `qmd query`.
 - If QMD fails entirely, OpenClaw falls back to the builtin SQLite engine.

 <Info>
@@ -164,6 +166,11 @@ runs as a service, create a symlink:
 **First search very slow?** QMD downloads GGUF models on first use. Pre-warm
 with `qmd query "test"` using the same XDG dirs OpenClaw uses.

+**BM25-only QMD still trying to build llama.cpp?** Set
+`memory.qmd.searchMode = "search"`. OpenClaw treats that mode as lexical-only,
+does not run QMD vector status probes or embedding maintenance, and leaves
+semantic readiness checks to `vsearch` or `query` setups.
+
 **Search times out?** Increase `memory.qmd.limits.timeoutMs` (default: 4000ms).
 Set to `120000` for slower hardware.

--- a/docs/reference/memory-config.md
+++ b/docs/reference/memory-config.md
@@ -449,6 +449,8 @@ Set `memory.backend = "qmd"` to enable. All QMD settings live under `memory.qmd`
 | `sessions.retentionDays` | `number`  | --       | Transcript retention                         |
 | `sessions.exportDir`     | `string`  | --       | Export directory                             |

+`searchMode: "search"` is lexical/BM25-only. OpenClaw does not run semantic vector readiness probes or QMD embedding maintenance for that mode, including during `memory status --deep`; `vsearch` and `query` continue to require QMD vector readiness and embeddings.
+
 OpenClaw prefers the current QMD collection and MCP query shapes, but keeps older QMD releases working by falling back to legacy `--mask` collection flags and older MCP tool names when needed.

 <Note>