docs: deep audit of memory section -- fix icons, beef up engine pages, restructure config reference

This commit is contained in:
Vincent Koc
2026-03-30 08:39:06 +09:00
parent 1600c1726e
commit c5baf63fa5
5 changed files with 414 additions and 661 deletions

View File

@@ -14,12 +14,10 @@ a per-agent SQLite database and needs no extra dependencies to get started.
## What it provides
- **Keyword search** via FTS5 full-text indexing (BM25 scoring).
- **Vector search** via embeddings from any supported provider.
- **Hybrid search** that combines both for best results.
- **CJK support** via trigram tokenization for Chinese, Japanese, and Korean.
- **sqlite-vec acceleration** for in-database vector queries (optional).
## Getting started
@@ -42,24 +40,66 @@ To set a provider explicitly:
Without an embedding provider, only keyword search is available.
## Supported embedding providers
| Provider | ID | Auto-detected | Notes |
| -------- | --------- | ------------- | ----------------------------------- |
| OpenAI | `openai` | Yes | Default: `text-embedding-3-small` |
| Gemini | `gemini` | Yes | Supports multimodal (image + audio) |
| Voyage | `voyage` | Yes | |
| Mistral | `mistral` | Yes | |
| Ollama | `ollama` | No | Local, set explicitly |
| Local | `local` | Yes (first) | GGUF model, ~0.6 GB download |
Auto-detection picks the first provider whose API key can be resolved, in the
order shown. Set `memorySearch.provider` to override.
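As a sketch (using the `agents.defaults.memorySearch` keys documented in the configuration reference), overriding auto-detection might look like:

```json5
{
  agents: {
    defaults: {
      memorySearch: {
        provider: "openai",
        model: "text-embedding-3-small",
      },
    },
  },
}
```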
## How indexing works
OpenClaw indexes `MEMORY.md` and `memory/*.md` into chunks (~400 tokens with
80-token overlap) and stores them in a per-agent SQLite database.
- **Index location:** `~/.openclaw/memory/<agentId>.sqlite`
- **File watching:** changes to memory files trigger a debounced reindex (1.5s).
- **Auto-reindex:** when the embedding provider, model, or chunking config
changes, the entire index is rebuilt automatically.
- **Reindex on demand:** `openclaw memory index --force`
<Info>
You can also index Markdown files outside the workspace with
`memorySearch.extraPaths`. See the
[configuration reference](/reference/memory-config#additional-memory-paths).
</Info>
## When to use
The builtin engine is the right choice for most users:
- Works out of the box with no extra dependencies.
- Handles keyword and vector search well.
- Supports all embedding providers.
- Hybrid search combines the best of both retrieval approaches.
Consider switching to [QMD](/concepts/memory-qmd) if you need reranking, query
expansion, or want to index directories outside the workspace.
Consider [Honcho](/concepts/memory-honcho) if you want cross-session memory with
automatic user modeling.
## Troubleshooting
**Memory search disabled?** Check `openclaw memory status`. If no provider is
detected, set one explicitly or add an API key.
**Stale results?** Run `openclaw memory index --force` to rebuild. The watcher
may miss changes in rare edge cases.
**sqlite-vec not loading?** OpenClaw falls back to in-process cosine similarity
automatically. Check logs for the specific load error.
## Configuration
For embedding provider setup, hybrid search tuning (weights, MMR, temporal
decay), batch indexing, multimodal memory, sqlite-vec, extra paths, and all
other config knobs, see the
[Memory configuration reference](/reference/memory-config).

View File

@@ -86,6 +86,17 @@ Settings live under `plugins.entries["openclaw-honcho"].config`:
For self-hosted instances, point `baseUrl` to your local server (for example
`http://localhost:8000`) and omit the API key.
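A self-hosted setup might look like this sketch (only `baseUrl` is shown; other keys are omitted):

```json5
{
  plugins: {
    entries: {
      "openclaw-honcho": {
        config: {
          baseUrl: "http://localhost:8000",
        },
      },
    },
  },
}
```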
## Migrating existing memory
If you have existing workspace memory files (`USER.md`, `MEMORY.md`,
`IDENTITY.md`, `memory/`, `canvas/`), `openclaw honcho setup` detects and
offers to migrate them.
<Info>
Migration is non-destructive -- files are uploaded to Honcho. Originals are
never deleted or moved.
</Info>
## How it works
After every AI turn, the conversation is persisted to Honcho. Both user and

View File

@@ -19,7 +19,7 @@ binary, and can index content beyond your workspace memory files.
- **Index session transcripts** -- recall earlier conversations.
- **Fully local** -- runs via Bun + node-llama-cpp, auto-downloads GGUF models.
- **Automatic fallback** -- if QMD is unavailable, OpenClaw falls back to the
builtin engine seamlessly.
## Getting started
@@ -28,6 +28,7 @@ binary, and can index content beyond your workspace memory files.
- Install QMD: `bun install -g https://github.com/tobi/qmd`
- SQLite build that allows extensions (`brew install sqlite` on macOS).
- QMD must be on the gateway's `PATH`.
- macOS and Linux work out of the box. Windows is best supported via WSL2.
### Enable
@@ -41,7 +42,22 @@ binary, and can index content beyond your workspace memory files.
OpenClaw creates a self-contained QMD home under
`~/.openclaw/agents/<agentId>/qmd/` and manages the sidecar lifecycle
automatically -- collections, updates, and embedding runs are handled for you.
## How the sidecar works
- OpenClaw creates collections from your workspace memory files and any
configured `memory.qmd.paths`, then runs `qmd update` + `qmd embed` on boot
and periodically (default every 5 minutes).
- Boot refresh runs in the background so chat startup is not blocked.
- Searches use the configured `searchMode` (default: `search`; also supports
`vsearch` and `query`). If a mode fails, OpenClaw retries with `qmd query`.
- If QMD fails entirely, OpenClaw falls back to the builtin SQLite engine.
<Info>
The first search may be slow -- QMD auto-downloads GGUF models (~2 GB) for
reranking and query expansion on the first `qmd query` run.
</Info>
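The refresh cadence can be tuned under `memory.qmd.update` (a sketch showing the 5-minute default):

```json5
{
  memory: {
    backend: "qmd",
    qmd: {
      update: { interval: "5m" },
    },
  },
}
```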
## Indexing extra paths
@@ -58,6 +74,10 @@ Point QMD at additional directories to make them searchable:
}
```
Snippets from extra paths appear as `qmd/<collection>/<relative-path>` in
search results. `memory_get` understands this prefix and reads from the correct
collection root.
## Indexing session transcripts
Enable session indexing to recall earlier conversations:
@@ -73,7 +93,35 @@ Enable session indexing to recall earlier conversations:
}
```
Transcripts are exported as sanitized User/Assistant turns into a dedicated QMD
collection under `~/.openclaw/agents/<id>/qmd/sessions/`.
## Search scope
By default, QMD search results are only surfaced in DM sessions (not groups or
channels). Configure `memory.qmd.scope` to change this:
```json5
{
  memory: {
    qmd: {
      scope: {
        default: "deny",
        rules: [{ action: "allow", match: { chatType: "direct" } }],
      },
    },
  },
}
```
When scope denies a search, OpenClaw logs a warning with the derived channel and
chat type so empty results are easier to debug.
## Citations
When `memory.citations` is `auto` or `on`, search snippets include a
`Source: <path#line>` footer. Set `memory.citations = "off"` to omit the footer
while still passing the path to the agent internally.
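For example, to suppress the footer (a minimal sketch):

```json5
{
  memory: {
    citations: "off",
  },
}
```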
## When to use
@@ -87,6 +135,21 @@ Choose QMD when you need:
For simpler setups, the [builtin engine](/concepts/memory-builtin) works well
with no extra dependencies.
## Troubleshooting
**QMD not found?** Ensure the binary is on the gateway's `PATH`. If OpenClaw
runs as a service, create a symlink:
`sudo ln -s ~/.bun/bin/qmd /usr/local/bin/qmd`.
**First search very slow?** QMD downloads GGUF models on first use. Pre-warm
with `qmd query "test"` using the same XDG dirs OpenClaw uses.
**Search times out?** Increase `memory.qmd.limits.timeoutMs` (default: 4000ms).
Set to `120000` for slower hardware.
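A sketch of that override:

```json5
{
  memory: {
    qmd: {
      limits: { timeoutMs: 120000 },
    },
  },
}
```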
**Empty results in group chats?** Check `memory.qmd.scope` -- the default only
allows DM sessions.
## Configuration
For the full config surface (`memory.qmd.*`), search modes, update intervals,

View File

@@ -1,12 +1,12 @@
---
title: "Memory Overview"
summary: "How OpenClaw remembers things across sessions"
read_when:
- You want to understand how memory works
- You want to know what memory files to write
---
# Memory Overview
OpenClaw remembers things by writing **plain Markdown files** in your agent's
workspace. The model only "remembers" what gets saved to disk -- there is no
@@ -61,7 +61,7 @@ For details on how search works, tuning options, and provider setup, see
SQLite-based. Works out of the box with keyword search, vector similarity, and
hybrid search. No extra dependencies.
</Card>
<Card title="QMD" icon="search" href="/concepts/memory-qmd">
Local-first sidecar with reranking, query expansion, and the ability to index
directories outside the workspace.
</Card>

View File

@@ -1,6 +1,6 @@
---
title: "Memory configuration reference"
summary: "All configuration knobs for memory search, embedding providers, QMD, hybrid search, and multimodal indexing"
read_when:
- You want to configure memory search providers or embedding models
- You want to set up the QMD backend
@@ -10,711 +10,350 @@ read_when:
# Memory configuration reference
This page lists every configuration knob for OpenClaw memory search. For
conceptual overviews, see:
- [Memory Overview](/concepts/memory) -- how memory works
- [Builtin Engine](/concepts/memory-builtin) -- default SQLite backend
- [QMD Engine](/concepts/memory-qmd) -- local-first sidecar
- [Memory Search](/concepts/memory-search) -- search pipeline and tuning
All memory search settings live under `agents.defaults.memorySearch` in
`openclaw.json` unless noted otherwise.
---
## Provider selection
| Key | Type | Default | Description |
| ---------- | --------- | ---------------- | -------------------------------------------------------------------------------- |
| `provider` | `string` | auto-detected | Embedding adapter ID: `openai`, `gemini`, `voyage`, `mistral`, `ollama`, `local` |
| `model` | `string` | provider default | Embedding model name |
| `fallback` | `string` | `"none"` | Fallback adapter ID when the primary fails |
| `enabled` | `boolean` | `true` | Enable or disable memory search |
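A minimal sketch combining a primary adapter with a fallback:

```json5
{
  agents: {
    defaults: {
      memorySearch: {
        provider: "gemini",
        fallback: "openai",
      },
    },
  },
}
```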
### Auto-detection order
When `provider` is not set, OpenClaw selects the first available:
1. `local` -- if `memorySearch.local.modelPath` is configured and the file exists.
2. `openai` -- if an OpenAI key can be resolved.
3. `gemini` -- if a Gemini key can be resolved.
4. `voyage` -- if a Voyage key can be resolved.
5. `mistral` -- if a Mistral key can be resolved.
`ollama` is supported but not auto-detected (set it explicitly).
### API key resolution
Remote embeddings require an API key. OpenClaw resolves from:
auth profiles, `models.providers.*.apiKey`, or environment variables.
| Provider | Env var | Config key |
| -------- | ------------------------------ | --------------------------------- |
| OpenAI | `OPENAI_API_KEY` | `models.providers.openai.apiKey` |
| Gemini | `GEMINI_API_KEY` | `models.providers.google.apiKey` |
| Voyage | `VOYAGE_API_KEY` | `models.providers.voyage.apiKey` |
| Mistral | `MISTRAL_API_KEY` | `models.providers.mistral.apiKey` |
| Ollama | `OLLAMA_API_KEY` (placeholder) | -- |
Codex OAuth covers chat/completions only and does not satisfy embedding
requests.
---
## Remote endpoint config
For custom OpenAI-compatible endpoints or overriding provider defaults:
| Key | Type | Description |
| ---------------- | -------- | -------------------------------------------------- |
| `remote.baseUrl` | `string` | Custom API base URL |
| `remote.apiKey` | `string` | Override API key |
| `remote.headers` | `object` | Extra HTTP headers (merged with provider defaults) |
```json5
{
  agents: {
    defaults: {
      memorySearch: {
        provider: "openai",
        model: "text-embedding-3-small",
        remote: {
          baseUrl: "https://api.example.com/v1/",
          apiKey: "YOUR_KEY",
        },
      },
    },
  },
}
```
---
## Gemini-specific config
| Key | Type | Default | Description |
| ---------------------- | -------- | ---------------------- | ------------------------------------------ |
| `model` | `string` | `gemini-embedding-001` | Also supports `gemini-embedding-2-preview` |
| `outputDimensionality` | `number` | `3072` | For Embedding 2: 768, 1536, or 3072 |
<Warning>
Changing model or `outputDimensionality` triggers an automatic full reindex.
</Warning>
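For instance, a sketch selecting Embedding 2 with reduced output dimensions:

```json5
{
  agents: {
    defaults: {
      memorySearch: {
        provider: "gemini",
        model: "gemini-embedding-2-preview",
        outputDimensionality: 1536, // 768, 1536, or 3072 (default)
      },
    },
  },
}
```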
---
## Local embedding config
| Key | Type | Default | Description |
| --------------------- | -------- | ---------------------- | ------------------------------- |
| `local.modelPath` | `string` | auto-downloaded | Path to GGUF model file |
| `local.modelCacheDir` | `string` | node-llama-cpp default | Cache dir for downloaded models |
Default model: `embeddinggemma-300m-qat-Q8_0.gguf` (~0.6 GB, auto-downloaded).
Requires native build: `pnpm approve-builds` then `pnpm rebuild node-llama-cpp`.
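A sketch pinning a downloaded GGUF file (the path shown is hypothetical):

```json5
{
  agents: {
    defaults: {
      memorySearch: {
        provider: "local",
        local: {
          modelPath: "~/models/embeddinggemma-300m-qat-Q8_0.gguf",
        },
      },
    },
  },
}
```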
---
## Hybrid search config
All under `memorySearch.query.hybrid`:
| Key | Type | Default | Description |
| --------------------- | --------- | ------- | ---------------------------------- |
| `enabled` | `boolean` | `true` | Enable hybrid BM25 + vector search |
| `vectorWeight` | `number` | `0.7` | Weight for vector scores (0-1) |
| `textWeight` | `number` | `0.3` | Weight for BM25 scores (0-1) |
| `candidateMultiplier` | `number` | `4` | Candidate pool size multiplier |
### MMR (diversity)
| Key | Type | Default | Description |
| ------------- | --------- | ------- | ------------------------------------ |
| `mmr.enabled` | `boolean` | `false` | Enable MMR re-ranking |
| `mmr.lambda` | `number` | `0.7` | 0 = max diversity, 1 = max relevance |
### Temporal decay (recency)
| Key | Type | Default | Description |
| ---------------------------- | --------- | ------- | ------------------------- |
| `temporalDecay.enabled` | `boolean` | `false` | Enable recency boost |
| `temporalDecay.halfLifeDays` | `number` | `30` | Score halves every N days |
Evergreen files (`MEMORY.md`, non-dated files in `memory/`) are never decayed.
### Full example
```json5
{
  agents: {
    defaults: {
      memorySearch: {
        query: {
          hybrid: {
            vectorWeight: 0.7,
            textWeight: 0.3,
            mmr: { enabled: true, lambda: 0.7 },
            temporalDecay: { enabled: true, halfLifeDays: 30 },
          },
        },
      },
    },
  },
}
```
---
## Additional memory paths
If you want to index Markdown files outside the default workspace layout, add
explicit paths:
| Key | Type | Description |
| ------------ | ---------- | ---------------------------------------- |
| `extraPaths` | `string[]` | Additional directories or files to index |
```json5
{
  agents: {
    defaults: {
      memorySearch: {
        extraPaths: ["../team-docs", "/srv/shared-notes"],
      },
    },
  },
}
```
Notes:
Paths can be absolute or workspace-relative. Directories are scanned
recursively for `.md` files. Symlinks are ignored.
---
## Multimodal memory (Gemini)
Index images and audio alongside Markdown using Gemini Embedding 2:
| Key | Type | Default | Description |
| ------------------------- | ---------- | ---------- | -------------------------------------- |
| `multimodal.enabled` | `boolean` | `false` | Enable multimodal indexing |
| `multimodal.modalities` | `string[]` | -- | `["image"]`, `["audio"]`, or `["all"]` |
| `multimodal.maxFileBytes` | `number` | `10000000` | Max file size for indexing |
Notes:
Only applies to files in `extraPaths`. Default memory roots stay Markdown-only.
Requires `gemini-embedding-2-preview`. `fallback` must be `"none"`.
Supported formats: `.jpg`, `.jpeg`, `.png`, `.webp`, `.gif`, `.heic`, `.heif`
(images); `.mp3`, `.wav`, `.ogg`, `.opus`, `.m4a`, `.aac`, `.flac` (audio).
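A multimodal setup might look like this sketch (paths are illustrative):

```json5
{
  agents: {
    defaults: {
      memorySearch: {
        provider: "gemini",
        model: "gemini-embedding-2-preview",
        extraPaths: ["assets/reference", "voice-notes"],
        multimodal: {
          enabled: true,
          modalities: ["image", "audio"], // or ["all"]
          maxFileBytes: 10000000,
        },
      },
    },
  },
}
```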
### Fallbacks
- `memorySearch.fallback` can be any registered memory embedding adapter id, or `none`.
- With the default `memory-core` plugin, valid built-in fallback ids are `openai`, `gemini`, `voyage`, `mistral`, `ollama`, and `local`.
- The fallback provider is only used when the primary embedding provider fails.
### Batch indexing
- Disabled by default. Set `agents.defaults.memorySearch.remote.batch.enabled = true` to enable batch indexing for providers whose adapter exposes batch support.
- Default behavior waits for batch completion; tune `remote.batch.wait`, `remote.batch.pollIntervalMs`, and `remote.batch.timeoutMinutes` if needed.
- Set `remote.batch.concurrency` to control how many batch jobs we submit in parallel (default: 2).
- With the default `memory-core` plugin, batch indexing is available for `openai`, `gemini`, and `voyage`.
- Gemini batch jobs use the async embeddings batch endpoint and require Gemini Batch API availability.
Why OpenAI batch is fast and cheap:
- For large backfills, OpenAI is typically the fastest option we support because we can submit many embedding requests in a single batch job and let OpenAI process them asynchronously.
- OpenAI offers discounted pricing for Batch API workloads, so large indexing runs are usually cheaper than sending the same requests synchronously.
- See the OpenAI Batch API docs and pricing for details:
- [https://platform.openai.com/docs/api-reference/batch](https://platform.openai.com/docs/api-reference/batch)
- [https://platform.openai.com/pricing](https://platform.openai.com/pricing)
Config example:
```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "openai",
      model: "text-embedding-3-small",
      fallback: "openai",
      remote: {
        batch: { enabled: true, concurrency: 2 }
      },
      sync: { watch: true }
    }
  }
}
```
## How the memory tools work
- `memory_search` semantically searches Markdown chunks (~400 token target, 80-token overlap) from `MEMORY.md` + `memory/**/*.md`. It returns snippet text (capped ~700 chars), file path, line range, score, provider/model, and whether we fell back from local to remote embeddings. No full file payload is returned.
- `memory_get` reads a specific memory Markdown file (workspace-relative), optionally from a starting line and for N lines. Paths outside `MEMORY.md` / `memory/` are rejected.
- Both tools are enabled only when `memorySearch.enabled` resolves true for the agent.
## What gets indexed (and when)
- File type: Markdown only (`MEMORY.md`, `memory/**/*.md`).
- Index storage: per-agent SQLite at `~/.openclaw/memory/<agentId>.sqlite` (configurable via `agents.defaults.memorySearch.store.path`, supports `{agentId}` token).
- Freshness: watcher on `MEMORY.md` + `memory/` marks the index dirty (debounce 1.5s). Sync is scheduled on session start, on search, or on an interval and runs asynchronously. Session transcripts use delta thresholds to trigger background sync.
- Reindex triggers: the index stores the embedding **provider/model + endpoint fingerprint + chunking params**. If any of those change, OpenClaw automatically resets and reindexes the entire store.
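A sketch overriding the index location (shown here at its documented default, with the `{agentId}` token):

```json5
{
  agents: {
    defaults: {
      memorySearch: {
        store: { path: "~/.openclaw/memory/{agentId}.sqlite" },
      },
    },
  },
}
```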
## Hybrid search (BM25 + vector)
When enabled, OpenClaw combines:
- **Vector similarity** (semantic match, wording can differ)
- **BM25 keyword relevance** (exact tokens like IDs, env vars, code symbols)
If full-text search is unavailable on your platform, OpenClaw falls back to vector-only search.
### Why hybrid
Vector search is great at "this means the same thing":
- "Mac Studio gateway host" vs "the machine running the gateway"
- "debounce file updates" vs "avoid indexing on every write"
But it can be weak at exact, high-signal tokens:
- IDs (`a828e60`, `b3b9895a...`)
- code symbols (`memorySearch.query.hybrid`)
- error strings ("sqlite-vec unavailable")
BM25 (full-text) is the opposite: strong at exact tokens, weaker at paraphrases.
Hybrid search is the pragmatic middle ground: **use both retrieval signals** so you get
good results for both "natural language" queries and "needle in a haystack" queries.
### How we merge results (the current design)
Implementation sketch:
1. Retrieve a candidate pool from both sides:
- **Vector**: top `maxResults * candidateMultiplier` by cosine similarity.
- **BM25**: top `maxResults * candidateMultiplier` by FTS5 BM25 rank (lower is better).
2. Convert BM25 rank into a 0..1-ish score:
- `textScore = 1 / (1 + max(0, bm25Rank))`
3. Union candidates by chunk id and compute a weighted score:
- `finalScore = vectorWeight * vectorScore + textWeight * textScore`
Notes:
- `vectorWeight` and `textWeight` are normalized to sum to 1.0 during config resolution, so the weights behave as percentages.
- If embeddings are unavailable (or the provider returns a zero-vector), we still run BM25 and return keyword matches.
- If FTS5 can't be created, we keep vector-only search (no hard failure).
- **CJK support**: FTS5 uses configurable trigram tokenization with a short-substring fallback so Chinese, Japanese, and Korean text is searchable without breaking mixed-length queries. CJK-heavy text is also weighted correctly during chunk size estimation, and surrogate-pair characters are preserved during fine splits.
This isn't "IR-theory perfect", but it's simple, fast, and tends to improve recall/precision on real notes.
If we want to get fancier later, common next steps are Reciprocal Rank Fusion (RRF) or score normalization
(min/max or z-score) before mixing.
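Sketched in TypeScript (illustrative only, not OpenClaw's actual source), the merge looks roughly like this, assuming candidates have already been unioned by chunk id:

```typescript
type Candidate = { id: string; vectorScore?: number; bm25Rank?: number };

// Convert an FTS5 BM25 rank (lower is better) into a 0..1-ish score.
function textScore(bm25Rank: number): number {
  return 1 / (1 + Math.max(0, bm25Rank));
}

// Mix both retrieval signals with normalized weights and keep the top K.
function mergeCandidates(
  candidates: Candidate[],
  vectorWeight = 0.7,
  textWeight = 0.3,
  maxResults = 6,
): { id: string; score: number }[] {
  const total = vectorWeight + textWeight; // normalize so weights act as percentages
  const vw = vectorWeight / total;
  const tw = textWeight / total;
  return candidates
    .map((c) => ({
      id: c.id,
      score:
        vw * (c.vectorScore ?? 0) +
        tw * (c.bm25Rank !== undefined ? textScore(c.bm25Rank) : 0),
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, maxResults);
}
```

A chunk found only by one side simply scores 0 on the other signal, which is why exact-token hits can still surface even when embeddings miss them.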
### Post-processing pipeline
After merging vector and keyword scores, two optional post-processing stages
refine the result list before it reaches the agent:
```
Vector + Keyword -> Weighted Merge -> Temporal Decay -> Sort -> MMR -> Top-K Results
```
Both stages are **off by default** and can be enabled independently.
### MMR re-ranking (diversity)
When hybrid search returns results, multiple chunks may contain similar or overlapping content.
For example, searching for "home network setup" might return five nearly identical snippets
from different daily notes that all mention the same router configuration.
**MMR (Maximal Marginal Relevance)** re-ranks the results to balance relevance with diversity,
ensuring the top results cover different aspects of the query instead of repeating the same information.
How it works:
1. Results are scored by their original relevance (vector + BM25 weighted score).
2. MMR iteratively selects results that maximize: `lambda x relevance - (1-lambda) x max_similarity_to_selected`.
3. Similarity between results is measured using Jaccard text similarity on tokenized content.
The `lambda` parameter controls the trade-off:
- `lambda = 1.0` -- pure relevance (no diversity penalty)
- `lambda = 0.0` -- maximum diversity (ignores relevance)
- Default: `0.7` (balanced, slight relevance bias)
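A minimal sketch of the greedy MMR loop (illustrative, not OpenClaw's actual implementation), using word-set Jaccard similarity as described above:

```typescript
// Jaccard similarity over word sets -- a cheap proxy for content overlap.
function jaccard(a: string, b: string): number {
  const ta = new Set(a.toLowerCase().split(/\W+/).filter(Boolean));
  const tb = new Set(b.toLowerCase().split(/\W+/).filter(Boolean));
  const inter = [...ta].filter((t) => tb.has(t)).length;
  const union = new Set([...ta, ...tb]).size;
  return union === 0 ? 0 : inter / union;
}

type Scored = { id: string; text: string; score: number };

// Greedy MMR: repeatedly pick the candidate maximizing
// lambda * relevance - (1 - lambda) * maxSimilarityToSelected.
function mmr(results: Scored[], k: number, lambda = 0.7): Scored[] {
  const selected: Scored[] = [];
  const pool = [...results];
  while (selected.length < k && pool.length > 0) {
    let bestIdx = 0;
    let bestVal = -Infinity;
    for (let i = 0; i < pool.length; i++) {
      const maxSim = selected.reduce(
        (m, s) => Math.max(m, jaccard(pool[i].text, s.text)),
        0,
      );
      const val = lambda * pool[i].score - (1 - lambda) * maxSim;
      if (val > bestVal) {
        bestVal = val;
        bestIdx = i;
      }
    }
    selected.push(pool.splice(bestIdx, 1)[0]);
  }
  return selected;
}
```

The first pick is always the most relevant result (nothing is selected yet, so the similarity penalty is zero); later picks trade relevance against overlap with what was already chosen.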
**Example -- query: "home network setup"**
Given these memory files:
```
memory/2026-02-10.md -> "Configured Omada router, set VLAN 10 for IoT devices"
memory/2026-02-08.md -> "Configured Omada router, moved IoT to VLAN 10"
memory/2026-02-05.md -> "Set up AdGuard DNS on 192.168.10.2"
memory/network.md -> "Router: Omada ER605, AdGuard: 192.168.10.2, VLAN 10: IoT"
```
Without MMR -- top 3 results:
```
1. memory/2026-02-10.md (score: 0.92) <- router + VLAN
2. memory/2026-02-08.md (score: 0.89) <- router + VLAN (near-duplicate!)
3. memory/network.md (score: 0.85) <- reference doc
```
With MMR (lambda=0.7) -- top 3 results:
```
1. memory/2026-02-10.md (score: 0.92) <- router + VLAN
2. memory/network.md (score: 0.85) <- reference doc (diverse!)
3. memory/2026-02-05.md (score: 0.78) <- AdGuard DNS (diverse!)
```
The near-duplicate from Feb 8 drops out, and the agent gets three distinct pieces of information.
**When to enable:** If you notice `memory_search` returning redundant or near-duplicate snippets,
especially with daily notes that often repeat similar information across days.
### Temporal decay (recency boost)
Agents with daily notes accumulate hundreds of dated files over time. Without decay,
a well-worded note from six months ago can outrank yesterday's update on the same topic.
**Temporal decay** applies an exponential multiplier to scores based on the age of each result,
so recent memories naturally rank higher while old ones fade:
```
decayedScore = score x e^(-lambda x ageInDays)
```
where `lambda = ln(2) / halfLifeDays`.
With the default half-life of 30 days:
- Today's notes: **100%** of original score
- 7 days ago: **~85%**
- 30 days ago: **50%**
- 90 days ago: **12.5%**
- 180 days ago: **~1.6%**
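The decay multiplier follows directly from the formula above; a small sketch (the evergreen handling here is a simplification of the rules described below):

```typescript
// Exponential temporal decay: a result's score halves every halfLifeDays.
function decayFactor(ageInDays: number, halfLifeDays = 30): number {
  const lambda = Math.LN2 / halfLifeDays;
  return Math.exp(-lambda * ageInDays);
}

// Evergreen files carry no age; they keep their original score.
function applyDecay(
  score: number,
  ageInDays: number | null,
  halfLifeDays = 30,
): number {
  return ageInDays === null ? score : score * decayFactor(ageInDays, halfLifeDays);
}
```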
**Evergreen files are never decayed:**
- `MEMORY.md` (root memory file)
- Non-dated files in `memory/` (e.g., `memory/projects.md`, `memory/network.md`)
- These contain durable reference information that should always rank normally.
**Dated daily files** (`memory/YYYY-MM-DD.md`) use the date extracted from the filename.
Other sources (e.g., session transcripts) fall back to file modification time (`mtime`).
**Example -- query: "what's Rod's work schedule?"**
Given these memory files (today is Feb 10):
```
memory/2025-09-15.md -> "Rod works Mon-Fri, standup at 10am, pairing at 2pm" (148 days old)
memory/2026-02-10.md -> "Rod has standup at 14:15, 1:1 with Zeb at 14:45" (today)
memory/2026-02-03.md -> "Rod started new team, standup moved to 14:15" (7 days old)
```
Without decay:
```
1. memory/2025-09-15.md (score: 0.91) <- best semantic match, but stale!
2. memory/2026-02-10.md (score: 0.82)
3. memory/2026-02-03.md (score: 0.80)
```
With decay (halfLife=30):
```
1. memory/2026-02-10.md (score: 0.82 x 1.00 = 0.82) <- today, no decay
2. memory/2026-02-03.md (score: 0.80 x 0.85 = 0.68) <- 7 days, mild decay
3. memory/2025-09-15.md (score: 0.91 x 0.03 = 0.03) <- 148 days, nearly gone
```
The stale September note drops to the bottom despite having the best raw semantic match.
**When to enable:** If your agent has months of daily notes and you find that old,
stale information outranks recent context. A half-life of 30 days works well for
daily-note-heavy workflows; increase it (e.g., 90 days) if you reference older notes frequently.
### Hybrid search configuration
Both features are configured under `memorySearch.query.hybrid`:
```json5
agents: {
defaults: {
memorySearch: {
query: {
hybrid: {
enabled: true,
vectorWeight: 0.7,
textWeight: 0.3,
candidateMultiplier: 4,
// Diversity: reduce redundant results
mmr: {
enabled: true, // default: false
lambda: 0.7 // 0 = max diversity, 1 = max relevance
},
// Recency: boost newer memories
temporalDecay: {
enabled: true, // default: false
halfLifeDays: 30 // score halves every 30 days
}
}
}
}
}
}
```
You can enable either feature independently:
- **MMR only** -- useful when you have many similar notes but age doesn't matter.
- **Temporal decay only** -- useful when recency matters but your results are already diverse.
- **Both** -- recommended for agents with large, long-running daily note histories.
---
## Embedding cache
OpenClaw can cache **chunk embeddings** in SQLite so reindexing and frequent updates (especially session transcripts) don't re-embed unchanged text.
| Key | Type | Default | Description |
| ------------------ | --------- | ------- | -------------------------------- |
| `cache.enabled` | `boolean` | `false` | Cache chunk embeddings in SQLite |
| `cache.maxEntries` | `number` | `50000` | Max cached embeddings |
Config:
```json5
agents: {
defaults: {
memorySearch: {
cache: {
enabled: true,
maxEntries: 50000
}
}
}
}
```
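As a hypothetical sketch of the idea (OpenClaw's actual SQLite schema and eviction policy may differ): key cached embeddings by a hash of the provider, model, and chunk text, so only changed chunks are re-embedded.

```typescript
import { createHash } from "node:crypto";

// Hypothetical cache key: unchanged text under the same provider/model
// hashes to the same key, so its embedding can be reused.
function cacheKey(provider: string, model: string, text: string): string {
  return createHash("sha256")
    .update(`${provider}\0${model}\0${text}`)
    .digest("hex");
}

class EmbeddingCache {
  private map = new Map<string, number[]>();
  constructor(private maxEntries = 50000) {}

  get(key: string): number[] | undefined {
    return this.map.get(key);
  }

  set(key: string, embedding: number[]): void {
    if (this.map.size >= this.maxEntries) {
      // Evict the oldest entry (insertion order) to respect maxEntries.
      const oldest = this.map.keys().next().value;
      if (oldest !== undefined) this.map.delete(oldest);
    }
    this.map.set(key, embedding);
  }
}
```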
---
## Batch indexing
| Key | Type | Default | Description |
| ----------------------------- | --------- | ------- | -------------------------- |
| `remote.batch.enabled` | `boolean` | `false` | Enable batch embedding API |
| `remote.batch.concurrency` | `number` | `2` | Parallel batch jobs |
| `remote.batch.wait` | `boolean` | `true` | Wait for batch completion |
| `remote.batch.pollIntervalMs` | `number` | -- | Poll interval |
| `remote.batch.timeoutMinutes` | `number` | -- | Batch timeout |
Available for `openai`, `gemini`, and `voyage`. OpenAI batch is typically
fastest and cheapest for large backfills.
---
## Session memory search (experimental)

You can optionally index **session transcripts** and surface them via `memory_search`.
This is gated behind an experimental flag:
```json5
agents: {
defaults: {
memorySearch: {
experimental: { sessionMemory: true },
sources: ["memory", "sessions"]
}
}
}
```
| Key | Type | Default | Description |
| ----------------------------- | ---------- | ------------ | --------------------------------------- |
| `experimental.sessionMemory` | `boolean` | `false` | Enable session indexing |
| `sources` | `string[]` | `["memory"]` | Add `"sessions"` to include transcripts |
| `sync.sessions.deltaBytes` | `number` | `100000` | Byte threshold for reindex |
| `sync.sessions.deltaMessages` | `number` | `50` | Message threshold for reindex |
Notes:
- Session indexing is **opt-in** (off by default).
- Session updates are debounced and **indexed asynchronously** once they cross delta thresholds (best-effort).
- `memory_search` never blocks on indexing; results can be slightly stale until background sync finishes.
- Results still include snippets only; `memory_get` remains limited to memory files.
- Session indexing is isolated per agent (only that agent's session logs are indexed).
- Session logs live on disk (`~/.openclaw/agents/<agentId>/sessions/*.jsonl`). Any process/user with filesystem access can read them, so treat disk access as the trust boundary. For stricter isolation, run agents under separate OS users or hosts.
Delta thresholds (defaults shown):
```json5
agents: {
defaults: {
memorySearch: {
sync: {
sessions: {
deltaBytes: 100000, // ~100 KB
deltaMessages: 50 // JSONL lines
}
}
}
}
}
```
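The threshold check itself is simple; a hypothetical sketch (function name and signature are illustrative, not OpenClaw's API):

```typescript
// Reindex a session once either delta threshold is crossed since the last sync.
function shouldReindexSession(
  bytesSinceSync: number,
  messagesSinceSync: number,
  deltaBytes = 100000,
  deltaMessages = 50,
): boolean {
  return bytesSinceSync >= deltaBytes || messagesSinceSync >= deltaMessages;
}
```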
---
## SQLite vector acceleration (sqlite-vec)
When the sqlite-vec extension is available, OpenClaw stores embeddings in a
SQLite virtual table (`vec0`) and performs vector distance queries in the
database. This keeps search fast without loading every embedding into JS.
| Key | Type | Default | Description |
| ---------------------------- | --------- | ------- | --------------------------------- |
| `store.vector.enabled` | `boolean` | `true` | Use sqlite-vec for vector queries |
| `store.vector.extensionPath` | `string` | bundled | Override sqlite-vec path |
If the sqlite-vec extension is missing or fails to load, OpenClaw logs the
error and falls back to in-process cosine similarity over the stored
embeddings automatically (no vector table, no hard failure). Set
`store.vector.extensionPath` to use a custom build or a non-standard install
location.
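Configuration (optional; the `extensionPath` value is a placeholder):

```json5
agents: {
  defaults: {
    memorySearch: {
      store: {
        vector: {
          enabled: true,
          extensionPath: "/path/to/sqlite-vec"
        }
      }
    }
  }
}
```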
---
## Index storage
| Key | Type | Default | Description |
| --------------------- | -------- | ------------------------------------- | ------------------------------------------- |
| `store.path` | `string` | `~/.openclaw/memory/{agentId}.sqlite` | Index location (supports `{agentId}` token) |
| `store.fts.tokenizer` | `string` | `unicode61` | FTS5 tokenizer (`unicode61` or `trigram`) |
---
## QMD backend config
Set `memory.backend = "qmd"` to enable. All QMD settings live under
`memory.qmd`:
| Key | Type | Default | Description |
| ------------------------ | --------- | -------- | -------------------------------------------- |
| `command` | `string` | `qmd` | QMD executable path |
| `searchMode` | `string` | `search` | Search command: `search`, `vsearch`, `query` |
| `includeDefaultMemory` | `boolean` | `true` | Auto-index `MEMORY.md` + `memory/**/*.md` |
| `paths[]` | `array` | -- | Extra paths: `{ name, path, pattern? }` |
| `sessions.enabled` | `boolean` | `false` | Index session transcripts |
| `sessions.retentionDays` | `number` | -- | Transcript retention |
| `sessions.exportDir` | `string` | -- | Export directory |
### Update schedule
| Key | Type | Default | Description |
| ------------------------- | --------- | ------- | ------------------------------------- |
| `update.interval` | `string` | `5m` | Refresh interval |
| `update.debounceMs` | `number` | `15000` | Debounce file changes |
| `update.onBoot` | `boolean` | `true` | Refresh on startup |
| `update.waitForBootSync` | `boolean` | `false` | Block startup until refresh completes |
| `update.embedInterval` | `string` | -- | Separate embed cadence |
| `update.commandTimeoutMs` | `number` | -- | Timeout for QMD commands |
### Limits
| Key | Type | Default | Description |
| ------------------------- | -------- | ------- | -------------------------- |
| `limits.maxResults` | `number` | `6` | Max search results |
| `limits.maxSnippetChars` | `number` | -- | Clamp snippet length |
| `limits.maxInjectedChars` | `number` | -- | Clamp total injected chars |
| `limits.timeoutMs` | `number` | `4000` | Search timeout |
### Scope
Controls which sessions can receive QMD search results. Same schema as
[`session.sendPolicy`](/gateway/configuration-reference#session):
```json5
{
  memory: {
    qmd: {
      scope: {
        default: "deny",
        rules: [{ action: "allow", match: { chatType: "direct" } }],
      },
    },
  },
}
```

Notes:

- Default is DM-only.
- `match.keyPrefix` matches the normalized session key; `match.rawKeyPrefix`
  matches the raw key including `agent:<id>:`.
### Citations

`memory.citations` applies to all backends:

| Value | Behavior |
| ---------------- | --------------------------------------------------- |
| `auto` (default) | Include `Source: <path#line>` footer in snippets |
| `on` | Always include footer |
| `off` | Omit footer (path still passed to agent internally) |

### Full QMD example

```json5
{
  memory: {
    backend: "qmd",
    citations: "auto",
    qmd: {
      includeDefaultMemory: true,
      update: { interval: "5m", debounceMs: 15000 },
      limits: { maxResults: 6, timeoutMs: 4000 },
      scope: {
        default: "deny",
        rules: [{ action: "allow", match: { chatType: "direct" } }],
      },
      paths: [{ name: "docs", path: "~/notes", pattern: "**/*.md" }],
    },
  },
}
```

---

## Local embedding auto-download

- Default local embedding model: `hf:ggml-org/embeddinggemma-300m-qat-q8_0-GGUF/embeddinggemma-300m-qat-Q8_0.gguf` (~0.6 GB).
- When `memorySearch.provider = "local"`, `node-llama-cpp` resolves `modelPath`; if the GGUF is missing it **auto-downloads** to the cache (or `local.modelCacheDir` if set), then loads it. Downloads resume on retry.
- Native build requirement: run `pnpm approve-builds`, pick `node-llama-cpp`, then `pnpm rebuild node-llama-cpp`.
- Fallback: if local setup fails and `memorySearch.fallback = "openai"`, we automatically switch to remote embeddings (`openai/text-embedding-3-small` unless overridden) and record the reason.

---

## Custom OpenAI-compatible endpoint example

```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "openai",
      model: "text-embedding-3-small",
      remote: {
        baseUrl: "https://api.example.com/v1/",
        apiKey: "YOUR_REMOTE_API_KEY",
        headers: {
          "X-Organization": "org-id",
          "X-Project": "project-id"
        }
      }
    }
  }
}
```

Notes:

- `remote.*` takes precedence over `models.providers.openai.*`.
- `remote.headers` merge with OpenAI headers; remote wins on key conflicts. Omit `remote.headers` to use the OpenAI defaults.