|
|
|
|
@@ -1,6 +1,6 @@
|
|
|
|
|
---
|
|
|
|
|
title: "Memory configuration reference"
|
|
|
|
|
summary: "Full configuration reference for OpenClaw memory search, embedding providers, QMD backend, hybrid search, and multimodal memory"
|
|
|
|
|
summary: "All configuration knobs for memory search, embedding providers, QMD, hybrid search, and multimodal indexing"
|
|
|
|
|
read_when:
|
|
|
|
|
- You want to configure memory search providers or embedding models
|
|
|
|
|
- You want to set up the QMD backend
|
|
|
|
|
@@ -10,711 +10,350 @@ read_when:
|
|
|
|
|
|
|
|
|
|
# Memory configuration reference
|
|
|
|
|
|
|
|
|
|
This page covers the full configuration surface for OpenClaw memory search. For
|
|
|
|
|
the conceptual overview, see [Memory](/concepts/memory). For how the search
|
|
|
|
|
pipeline works (hybrid search, MMR, temporal decay), see
|
|
|
|
|
[Memory Search](/concepts/memory-search).
|
|
|
|
|
This page lists every configuration knob for OpenClaw memory search. For
|
|
|
|
|
conceptual overviews, see:
|
|
|
|
|
|
|
|
|
|
## Memory search defaults
|
|
|
|
|
- [Memory Overview](/concepts/memory) -- how memory works
|
|
|
|
|
- [Builtin Engine](/concepts/memory-builtin) -- default SQLite backend
|
|
|
|
|
- [QMD Engine](/concepts/memory-qmd) -- local-first sidecar
|
|
|
|
|
- [Memory Search](/concepts/memory-search) -- search pipeline and tuning
|
|
|
|
|
|
|
|
|
|
- Enabled by default.
|
|
|
|
|
- Watches memory files for changes (debounced).
|
|
|
|
|
- Configure memory search under `agents.defaults.memorySearch` (not top-level
|
|
|
|
|
`memorySearch`).
|
|
|
|
|
- `memorySearch.provider` and `memorySearch.fallback` accept **adapter ids**
|
|
|
|
|
registered by the active memory plugin.
|
|
|
|
|
- The default `memory-core` plugin registers these built-in adapter ids:
|
|
|
|
|
`local`, `openai`, `gemini`, `voyage`, `mistral`, and `ollama`.
|
|
|
|
|
- With the default `memory-core` plugin, if `memorySearch.provider` is not set,
|
|
|
|
|
OpenClaw auto-selects:
|
|
|
|
|
1. `local` if a `memorySearch.local.modelPath` is configured and the file exists.
|
|
|
|
|
2. `openai` if an OpenAI key can be resolved.
|
|
|
|
|
3. `gemini` if a Gemini key can be resolved.
|
|
|
|
|
4. `voyage` if a Voyage key can be resolved.
|
|
|
|
|
5. `mistral` if a Mistral key can be resolved.
|
|
|
|
|
6. Otherwise memory search stays disabled until configured.
|
|
|
|
|
- Local mode uses node-llama-cpp and may require `pnpm approve-builds`.
|
|
|
|
|
- Uses sqlite-vec (when available) to accelerate vector search inside SQLite.
|
|
|
|
|
- With the default `memory-core` plugin, `memorySearch.provider = "ollama"` is
|
|
|
|
|
also supported for local/self-hosted Ollama embeddings (`/api/embeddings`),
|
|
|
|
|
but it is not auto-selected.
|
|
|
|
|
All memory search settings live under `agents.defaults.memorySearch` in
|
|
|
|
|
`openclaw.json` unless noted otherwise.
|
|
|
|
|
|
|
|
|
|
Remote embeddings **require** an API key for the embedding provider. OpenClaw
|
|
|
|
|
resolves keys from auth profiles, `models.providers.*.apiKey`, or environment
|
|
|
|
|
variables. Codex OAuth only covers chat/completions and does **not** satisfy
|
|
|
|
|
embeddings for memory search. For Gemini, use `GEMINI_API_KEY` or
|
|
|
|
|
`models.providers.google.apiKey`. For Voyage, use `VOYAGE_API_KEY` or
|
|
|
|
|
`models.providers.voyage.apiKey`. For Mistral, use `MISTRAL_API_KEY` or
|
|
|
|
|
`models.providers.mistral.apiKey`. Ollama typically does not require a real API
|
|
|
|
|
key (a placeholder like `OLLAMA_API_KEY=ollama-local` is enough when needed by
|
|
|
|
|
local policy).
|
|
|
|
|
When using a custom OpenAI-compatible endpoint,
|
|
|
|
|
set `memorySearch.remote.apiKey` (and optional `memorySearch.remote.headers`).
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
|
## QMD backend
|
|
|
|
|
## Provider selection
|
|
|
|
|
|
|
|
|
|
Set `memory.backend = "qmd"` to swap the built-in SQLite indexer for
|
|
|
|
|
[QMD](https://github.com/tobi/qmd): a local-first search sidecar that combines
|
|
|
|
|
BM25 + vectors + reranking. Markdown stays the source of truth; OpenClaw shells
|
|
|
|
|
out to QMD for retrieval. Key points:
|
|
|
|
|
| Key | Type | Default | Description |
|
|
|
|
|
| ---------- | --------- | ---------------- | -------------------------------------------------------------------------------- |
|
|
|
|
|
| `provider` | `string` | auto-detected | Embedding adapter ID: `openai`, `gemini`, `voyage`, `mistral`, `ollama`, `local` |
|
|
|
|
|
| `model` | `string` | provider default | Embedding model name |
|
|
|
|
|
| `fallback` | `string` | `"none"` | Fallback adapter ID when the primary fails |
|
|
|
|
|
| `enabled` | `boolean` | `true` | Enable or disable memory search |
|
|
|
|
|
|
|
|
|
|
### Prerequisites
|
|
|
|
|
### Auto-detection order
|
|
|
|
|
|
|
|
|
|
- Disabled by default. Opt in per-config (`memory.backend = "qmd"`).
|
|
|
|
|
- Install the QMD CLI separately (`bun install -g https://github.com/tobi/qmd` or grab
|
|
|
|
|
a release) and make sure the `qmd` binary is on the gateway's `PATH`.
|
|
|
|
|
- QMD needs an SQLite build that allows extensions (`brew install sqlite` on
|
|
|
|
|
macOS).
|
|
|
|
|
- QMD runs fully locally via Bun + `node-llama-cpp` and auto-downloads GGUF
|
|
|
|
|
models from HuggingFace on first use (no separate Ollama daemon required).
|
|
|
|
|
- The gateway runs QMD in a self-contained XDG home under
|
|
|
|
|
`~/.openclaw/agents/<agentId>/qmd/` by setting `XDG_CONFIG_HOME` and
|
|
|
|
|
`XDG_CACHE_HOME`.
|
|
|
|
|
- OS support: macOS and Linux work out of the box once Bun + SQLite are
|
|
|
|
|
installed. Windows is best supported via WSL2.
|
|
|
|
|
When `provider` is not set, OpenClaw selects the first available:
|
|
|
|
|
|
|
|
|
|
### How the sidecar runs
|
|
|
|
|
1. `local` -- if `memorySearch.local.modelPath` is configured and the file exists.
|
|
|
|
|
2. `openai` -- if an OpenAI key can be resolved.
|
|
|
|
|
3. `gemini` -- if a Gemini key can be resolved.
|
|
|
|
|
4. `voyage` -- if a Voyage key can be resolved.
|
|
|
|
|
5. `mistral` -- if a Mistral key can be resolved.
|
|
|
|
|
|
|
|
|
|
- The gateway writes a self-contained QMD home under
|
|
|
|
|
`~/.openclaw/agents/<agentId>/qmd/` (config + cache + sqlite DB).
|
|
|
|
|
- Collections are created via `qmd collection add` from `memory.qmd.paths`
|
|
|
|
|
(plus default workspace memory files), then `qmd update` + `qmd embed` run
|
|
|
|
|
on boot and on a configurable interval (`memory.qmd.update.interval`,
|
|
|
|
|
default `5m`).
|
|
|
|
|
- The gateway now initializes the QMD manager on startup, so periodic update
|
|
|
|
|
timers are armed even before the first `memory_search` call.
|
|
|
|
|
- Boot refresh now runs in the background by default so chat startup is not
|
|
|
|
|
blocked; set `memory.qmd.update.waitForBootSync = true` to keep the previous
|
|
|
|
|
blocking behavior.
|
|
|
|
|
- Searches run via `memory.qmd.searchMode` (default `qmd search --json`; also
|
|
|
|
|
supports `vsearch` and `query`). If the selected mode rejects flags on your
|
|
|
|
|
QMD build, OpenClaw retries with `qmd query`. If QMD fails or the binary is
|
|
|
|
|
missing, OpenClaw automatically falls back to the builtin SQLite manager so
|
|
|
|
|
memory tools keep working.
|
|
|
|
|
- OpenClaw does not expose QMD embed batch-size tuning today; batch behavior is
|
|
|
|
|
controlled by QMD itself.
|
|
|
|
|
- **First search may be slow**: QMD may download local GGUF models (reranker/query
|
|
|
|
|
expansion) on the first `qmd query` run.
|
|
|
|
|
- OpenClaw sets `XDG_CONFIG_HOME`/`XDG_CACHE_HOME` automatically when it runs QMD.
|
|
|
|
|
- If you want to pre-download models manually (and warm the same index OpenClaw
|
|
|
|
|
uses), run a one-off query with the agent's XDG dirs.
|
|
|
|
|
`ollama` is supported but not auto-detected (set it explicitly).
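The auto-detection order above can be sketched as follows. This is a minimal illustration, not OpenClaw's actual code: the function name `auto_select_provider` and the `resolved_keys` mapping (adapter id to an already-resolved API key) are hypothetical, and key resolution itself is left out.

```python
import os
from typing import Mapping, Optional

# Order taken from the list above. `ollama` is deliberately absent because it
# is supported but never auto-detected.
REMOTE_ORDER = ("openai", "gemini", "voyage", "mistral")


def auto_select_provider(
    local_model_path: Optional[str],
    resolved_keys: Mapping[str, Optional[str]],
) -> Optional[str]:
    """Return the first usable adapter id, or None if nothing is available."""
    # 1. `local` wins when a model path is configured and the file exists.
    if local_model_path and os.path.isfile(local_model_path):
        return "local"
    # 2-5. Otherwise take the first remote provider with a resolvable key.
    for provider in REMOTE_ORDER:
        if resolved_keys.get(provider):
            return provider
    # No candidate: memory search stays disabled until it is configured.
    return None
```

For example, with no local model and only Gemini and Mistral keys available, this sketch picks `gemini`, because it comes earlier in the order.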
|
|
|
|
|
|
|
|
|
|
OpenClaw's QMD state lives under your **state dir** (defaults to `~/.openclaw`).
|
|
|
|
|
You can point `qmd` at the exact same index by exporting the same XDG vars
|
|
|
|
|
OpenClaw uses:
|
|
|
|
|
### API key resolution
|
|
|
|
|
|
|
|
|
|
```bash
|
|
|
|
|
# Pick the same state dir OpenClaw uses
|
|
|
|
|
STATE_DIR="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}"
|
|
|
|
|
Remote embeddings require an API key. OpenClaw resolves from:
|
|
|
|
|
auth profiles, `models.providers.*.apiKey`, or environment variables.
|
|
|
|
|
|
|
|
|
|
export XDG_CONFIG_HOME="$STATE_DIR/agents/main/qmd/xdg-config"
|
|
|
|
|
export XDG_CACHE_HOME="$STATE_DIR/agents/main/qmd/xdg-cache"
|
|
|
|
|
| Provider | Env var | Config key |
|
|
|
|
|
| -------- | ------------------------------ | --------------------------------- |
|
|
|
|
|
| OpenAI | `OPENAI_API_KEY` | `models.providers.openai.apiKey` |
|
|
|
|
|
| Gemini | `GEMINI_API_KEY` | `models.providers.google.apiKey` |
|
|
|
|
|
| Voyage | `VOYAGE_API_KEY` | `models.providers.voyage.apiKey` |
|
|
|
|
|
| Mistral | `MISTRAL_API_KEY` | `models.providers.mistral.apiKey` |
|
|
|
|
|
| Ollama | `OLLAMA_API_KEY` (placeholder) | -- |
|
|
|
|
|
|
|
|
|
|
# (Optional) force an index refresh + embeddings
|
|
|
|
|
qmd update
|
|
|
|
|
qmd embed
|
|
|
|
|
Codex OAuth covers chat/completions only and does not satisfy embedding
|
|
|
|
|
requests.
|
|
|
|
|
|
|
|
|
|
# Warm up / trigger first-time model downloads
|
|
|
|
|
qmd query "test" -c memory-root --json >/dev/null 2>&1
|
|
|
|
|
```
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
|
### Config surface (`memory.qmd.*`)
|
|
|
|
|
## Remote endpoint config
|
|
|
|
|
|
|
|
|
|
- `command` (default `qmd`): override the executable path.
|
|
|
|
|
- `searchMode` (default `search`): pick which QMD command backs
|
|
|
|
|
`memory_search` (`search`, `vsearch`, `query`).
|
|
|
|
|
- `includeDefaultMemory` (default `true`): auto-index `MEMORY.md` + `memory/**/*.md`.
|
|
|
|
|
- `paths[]`: add extra directories/files (`path`, optional `pattern`, optional
|
|
|
|
|
stable `name`).
|
|
|
|
|
- `sessions`: opt into session JSONL indexing (`enabled`, `retentionDays`,
|
|
|
|
|
`exportDir`).
|
|
|
|
|
- `update`: controls refresh cadence and maintenance execution:
|
|
|
|
|
(`interval`, `debounceMs`, `onBoot`, `waitForBootSync`, `embedInterval`,
|
|
|
|
|
`commandTimeoutMs`, `updateTimeoutMs`, `embedTimeoutMs`).
|
|
|
|
|
- `limits`: clamp recall payload (`maxResults`, `maxSnippetChars`,
|
|
|
|
|
`maxInjectedChars`, `timeoutMs`).
|
|
|
|
|
- `scope`: same schema as [`session.sendPolicy`](/gateway/configuration-reference#session).
|
|
|
|
|
Default is DM-only (`deny` all, `allow` direct chats); loosen it to surface QMD
|
|
|
|
|
hits in groups/channels.
|
|
|
|
|
- `match.keyPrefix` matches the **normalized** session key (lowercased, with any
|
|
|
|
|
leading `agent:<id>:` stripped). Example: `discord:channel:`.
|
|
|
|
|
- `match.rawKeyPrefix` matches the **raw** session key (lowercased), including
|
|
|
|
|
`agent:<id>:`. Example: `agent:main:discord:`.
|
|
|
|
|
- Legacy: `match.keyPrefix: "agent:..."` is still treated as a raw-key prefix,
|
|
|
|
|
but prefer `rawKeyPrefix` for clarity.
|
|
|
|
|
- When `scope` denies a search, OpenClaw logs a warning with the derived
|
|
|
|
|
`channel`/`chatType` so empty results are easier to debug.
|
|
|
|
|
- Snippets sourced outside the workspace show up as
|
|
|
|
|
`qmd/<collection>/<relative-path>` in `memory_search` results; `memory_get`
|
|
|
|
|
understands that prefix and reads from the configured QMD collection root.
|
|
|
|
|
- When `memory.qmd.sessions.enabled = true`, OpenClaw exports sanitized session
|
|
|
|
|
transcripts (User/Assistant turns) into a dedicated QMD collection under
|
|
|
|
|
`~/.openclaw/agents/<id>/qmd/sessions/`, so `memory_search` can recall recent
|
|
|
|
|
conversations without touching the builtin SQLite index.
|
|
|
|
|
- `memory_search` snippets now include a `Source: <path#line>` footer when
|
|
|
|
|
`memory.citations` is `auto`/`on`; set `memory.citations = "off"` to keep
|
|
|
|
|
the path metadata internal (the agent still receives the path for
|
|
|
|
|
`memory_get`, but the snippet text omits the footer and the system prompt
|
|
|
|
|
warns the agent not to cite it).
|
|
|
|
|
For custom OpenAI-compatible endpoints or overriding provider defaults:
|
|
|
|
|
|
|
|
|
|
### QMD example
|
|
|
|
|
| Key | Type | Description |
|
|
|
|
|
| ---------------- | -------- | -------------------------------------------------- |
|
|
|
|
|
| `remote.baseUrl` | `string` | Custom API base URL |
|
|
|
|
|
| `remote.apiKey` | `string` | Override API key |
|
|
|
|
|
| `remote.headers` | `object` | Extra HTTP headers (merged with provider defaults) |
|
|
|
|
|
|
|
|
|
|
```json5
|
|
|
|
|
memory: {
|
|
|
|
|
backend: "qmd",
|
|
|
|
|
citations: "auto",
|
|
|
|
|
qmd: {
|
|
|
|
|
includeDefaultMemory: true,
|
|
|
|
|
update: { interval: "5m", debounceMs: 15000 },
|
|
|
|
|
limits: { maxResults: 6, timeoutMs: 4000 },
|
|
|
|
|
scope: {
|
|
|
|
|
default: "deny",
|
|
|
|
|
rules: [
|
|
|
|
|
{ action: "allow", match: { chatType: "direct" } },
|
|
|
|
|
// Normalized session-key prefix (strips `agent:<id>:`).
|
|
|
|
|
{ action: "deny", match: { keyPrefix: "discord:channel:" } },
|
|
|
|
|
// Raw session-key prefix (includes `agent:<id>:`).
|
|
|
|
|
{ action: "deny", match: { rawKeyPrefix: "agent:main:discord:" } },
|
|
|
|
|
]
|
|
|
|
|
{
|
|
|
|
|
agents: {
|
|
|
|
|
defaults: {
|
|
|
|
|
memorySearch: {
|
|
|
|
|
provider: "openai",
|
|
|
|
|
model: "text-embedding-3-small",
|
|
|
|
|
remote: {
|
|
|
|
|
baseUrl: "https://api.example.com/v1/",
|
|
|
|
|
apiKey: "YOUR_KEY",
|
|
|
|
|
},
|
|
|
|
|
},
|
|
|
|
|
},
|
|
|
|
|
paths: [
|
|
|
|
|
{ name: "docs", path: "~/notes", pattern: "**/*.md" }
|
|
|
|
|
]
|
|
|
|
|
}
|
|
|
|
|
},
|
|
|
|
|
}
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
### Citations and fallback
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
|
- `memory.citations` applies regardless of backend (`auto`/`on`/`off`).
|
|
|
|
|
- When `qmd` runs, we tag `status().backend = "qmd"` so diagnostics show which
|
|
|
|
|
engine served the results. If the QMD subprocess exits or JSON output can't be
|
|
|
|
|
parsed, the search manager logs a warning and returns the builtin provider
|
|
|
|
|
(existing Markdown embeddings) until QMD recovers.
|
|
|
|
|
## Gemini-specific config
|
|
|
|
|
|
|
|
|
|
| Key | Type | Default | Description |
|
|
|
|
|
| ---------------------- | -------- | ---------------------- | ------------------------------------------ |
|
|
|
|
|
| `model` | `string` | `gemini-embedding-001` | Also supports `gemini-embedding-2-preview` |
|
|
|
|
|
| `outputDimensionality` | `number` | `3072` | For Embedding 2: 768, 1536, or 3072 |
|
|
|
|
|
|
|
|
|
|
<Warning>
|
|
|
|
|
Changing model or `outputDimensionality` triggers an automatic full reindex.
|
|
|
|
|
</Warning>
|
|
|
|
|
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
|
## Local embedding config
|
|
|
|
|
|
|
|
|
|
| Key | Type | Default | Description |
|
|
|
|
|
| --------------------- | -------- | ---------------------- | ------------------------------- |
|
|
|
|
|
| `local.modelPath` | `string` | auto-downloaded | Path to GGUF model file |
|
|
|
|
|
| `local.modelCacheDir` | `string` | node-llama-cpp default | Cache dir for downloaded models |
|
|
|
|
|
|
|
|
|
|
Default model: `embeddinggemma-300m-qat-Q8_0.gguf` (~0.6 GB, auto-downloaded).
|
|
|
|
|
Requires native build: `pnpm approve-builds` then `pnpm rebuild node-llama-cpp`.
|
|
|
|
|
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
|
## Hybrid search config
|
|
|
|
|
|
|
|
|
|
All under `memorySearch.query.hybrid`:
|
|
|
|
|
|
|
|
|
|
| Key | Type | Default | Description |
|
|
|
|
|
| --------------------- | --------- | ------- | ---------------------------------- |
|
|
|
|
|
| `enabled` | `boolean` | `true` | Enable hybrid BM25 + vector search |
|
|
|
|
|
| `vectorWeight` | `number` | `0.7` | Weight for vector scores (0-1) |
|
|
|
|
|
| `textWeight` | `number` | `0.3` | Weight for BM25 scores (0-1) |
|
|
|
|
|
| `candidateMultiplier` | `number` | `4` | Candidate pool size multiplier |
|
|
|
|
|
|
|
|
|
|
### MMR (diversity)
|
|
|
|
|
|
|
|
|
|
| Key | Type | Default | Description |
|
|
|
|
|
| ------------- | --------- | ------- | ------------------------------------ |
|
|
|
|
|
| `mmr.enabled` | `boolean` | `false` | Enable MMR re-ranking |
|
|
|
|
|
| `mmr.lambda` | `number` | `0.7` | 0 = max diversity, 1 = max relevance |
|
|
|
|
|
|
|
|
|
|
### Temporal decay (recency)
|
|
|
|
|
|
|
|
|
|
| Key | Type | Default | Description |
|
|
|
|
|
| ---------------------------- | --------- | ------- | ------------------------- |
|
|
|
|
|
| `temporalDecay.enabled` | `boolean` | `false` | Enable recency boost |
|
|
|
|
|
| `temporalDecay.halfLifeDays` | `number` | `30` | Score halves every N days |
|
|
|
|
|
|
|
|
|
|
Evergreen files (`MEMORY.md`, non-dated files in `memory/`) are never decayed.
|
|
|
|
|
|
|
|
|
|
### Full example
|
|
|
|
|
|
|
|
|
|
```json5
|
|
|
|
|
{
|
|
|
|
|
agents: {
|
|
|
|
|
defaults: {
|
|
|
|
|
memorySearch: {
|
|
|
|
|
query: {
|
|
|
|
|
hybrid: {
|
|
|
|
|
vectorWeight: 0.7,
|
|
|
|
|
textWeight: 0.3,
|
|
|
|
|
mmr: { enabled: true, lambda: 0.7 },
|
|
|
|
|
temporalDecay: { enabled: true, halfLifeDays: 30 },
|
|
|
|
|
},
|
|
|
|
|
},
|
|
|
|
|
},
|
|
|
|
|
},
|
|
|
|
|
},
|
|
|
|
|
}
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
|
## Additional memory paths
|
|
|
|
|
|
|
|
|
|
If you want to index Markdown files outside the default workspace layout, add
|
|
|
|
|
explicit paths:
|
|
|
|
|
| Key | Type | Description |
|
|
|
|
|
| ------------ | ---------- | ---------------------------------------- |
|
|
|
|
|
| `extraPaths` | `string[]` | Additional directories or files to index |
|
|
|
|
|
|
|
|
|
|
```json5
|
|
|
|
|
agents: {
|
|
|
|
|
defaults: {
|
|
|
|
|
memorySearch: {
|
|
|
|
|
extraPaths: ["../team-docs", "/srv/shared-notes/overview.md"]
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
Notes:
|
|
|
|
|
|
|
|
|
|
- Paths can be absolute or workspace-relative.
|
|
|
|
|
- Directories are scanned recursively for `.md` files.
|
|
|
|
|
- By default, only Markdown files are indexed.
|
|
|
|
|
- If `memorySearch.multimodal.enabled = true`, OpenClaw also indexes supported image/audio files under `extraPaths` only. Default memory roots (`MEMORY.md`, `memory.md`, `memory/**/*.md`) stay Markdown-only.
|
|
|
|
|
- Symlinks are ignored (files or directories).
|
|
|
|
|
|
|
|
|
|
## Multimodal memory files (Gemini image + audio)
|
|
|
|
|
|
|
|
|
|
OpenClaw can index image and audio files from `memorySearch.extraPaths` when using Gemini embedding 2:
|
|
|
|
|
|
|
|
|
|
```json5
|
|
|
|
|
agents: {
|
|
|
|
|
defaults: {
|
|
|
|
|
memorySearch: {
|
|
|
|
|
provider: "gemini",
|
|
|
|
|
model: "gemini-embedding-2-preview",
|
|
|
|
|
extraPaths: ["assets/reference", "voice-notes"],
|
|
|
|
|
multimodal: {
|
|
|
|
|
enabled: true,
|
|
|
|
|
modalities: ["image", "audio"], // or ["all"]
|
|
|
|
|
maxFileBytes: 10000000
|
|
|
|
|
{
|
|
|
|
|
agents: {
|
|
|
|
|
defaults: {
|
|
|
|
|
memorySearch: {
|
|
|
|
|
extraPaths: ["../team-docs", "/srv/shared-notes"],
|
|
|
|
|
},
|
|
|
|
|
remote: {
|
|
|
|
|
apiKey: "YOUR_GEMINI_API_KEY"
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
},
|
|
|
|
|
},
|
|
|
|
|
}
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
Notes:
|
|
|
|
|
Paths can be absolute or workspace-relative. Directories are scanned
|
|
|
|
|
recursively for `.md` files. Symlinks are ignored.
|
|
|
|
|
|
|
|
|
|
- Multimodal memory is currently supported only for `gemini-embedding-2-preview`.
|
|
|
|
|
- Multimodal indexing applies only to files discovered through `memorySearch.extraPaths`.
|
|
|
|
|
- Supported modalities in this phase: image and audio.
|
|
|
|
|
- `memorySearch.fallback` must stay `"none"` while multimodal memory is enabled.
|
|
|
|
|
- Matching image/audio file bytes are uploaded to the configured Gemini embedding endpoint during indexing.
|
|
|
|
|
- Supported image extensions: `.jpg`, `.jpeg`, `.png`, `.webp`, `.gif`, `.heic`, `.heif`.
|
|
|
|
|
- Supported audio extensions: `.mp3`, `.wav`, `.ogg`, `.opus`, `.m4a`, `.aac`, `.flac`.
|
|
|
|
|
- Search queries remain text, but Gemini can compare those text queries against indexed image/audio embeddings.
|
|
|
|
|
- `memory_get` still reads Markdown only; binary files are searchable but not returned as raw file contents.
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
|
## Gemini embeddings (native)
|
|
|
|
|
## Multimodal memory (Gemini)
|
|
|
|
|
|
|
|
|
|
Set the provider to `gemini` to use the Gemini embeddings API directly:
|
|
|
|
|
Index images and audio alongside Markdown using Gemini Embedding 2:
|
|
|
|
|
|
|
|
|
|
```json5
|
|
|
|
|
agents: {
|
|
|
|
|
defaults: {
|
|
|
|
|
memorySearch: {
|
|
|
|
|
provider: "gemini",
|
|
|
|
|
model: "gemini-embedding-001",
|
|
|
|
|
remote: {
|
|
|
|
|
apiKey: "YOUR_GEMINI_API_KEY"
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
```
|
|
|
|
|
| Key | Type | Default | Description |
|
|
|
|
|
| ------------------------- | ---------- | ---------- | -------------------------------------- |
|
|
|
|
|
| `multimodal.enabled` | `boolean` | `false` | Enable multimodal indexing |
|
|
|
|
|
| `multimodal.modalities` | `string[]` | -- | `["image"]`, `["audio"]`, or `["all"]` |
|
|
|
|
|
| `multimodal.maxFileBytes` | `number` | `10000000` | Max file size for indexing |
|
|
|
|
|
|
|
|
|
|
Notes:
|
|
|
|
|
Only applies to files in `extraPaths`. Default memory roots stay Markdown-only.
|
|
|
|
|
Requires `gemini-embedding-2-preview`. `fallback` must be `"none"`.
|
|
|
|
|
|
|
|
|
|
- `remote.baseUrl` is optional (defaults to the Gemini API base URL).
|
|
|
|
|
- `remote.headers` lets you add extra headers if needed.
|
|
|
|
|
- Default model: `gemini-embedding-001`.
|
|
|
|
|
- `gemini-embedding-2-preview` is also supported: 8192 token limit and configurable dimensions (768 / 1536 / 3072, default 3072).
|
|
|
|
|
Supported formats: `.jpg`, `.jpeg`, `.png`, `.webp`, `.gif`, `.heic`, `.heif`
|
|
|
|
|
(images); `.mp3`, `.wav`, `.ogg`, `.opus`, `.m4a`, `.aac`, `.flac` (audio).
|
|
|
|
|
|
|
|
|
|
### Gemini Embedding 2 (preview)
|
|
|
|
|
|
|
|
|
|
```json5
|
|
|
|
|
agents: {
|
|
|
|
|
defaults: {
|
|
|
|
|
memorySearch: {
|
|
|
|
|
provider: "gemini",
|
|
|
|
|
model: "gemini-embedding-2-preview",
|
|
|
|
|
outputDimensionality: 3072, // optional: 768, 1536, or 3072 (default)
|
|
|
|
|
remote: {
|
|
|
|
|
apiKey: "YOUR_GEMINI_API_KEY"
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
> **Re-index required:** Switching from `gemini-embedding-001` (768 dimensions)
|
|
|
|
|
> to `gemini-embedding-2-preview` (3072 dimensions) changes the vector size. The same is true if you
|
|
|
|
|
> change `outputDimensionality` between 768, 1536, and 3072.
|
|
|
|
|
> OpenClaw will automatically reindex when it detects a model or dimension change.
|
|
|
|
|
|
|
|
|
|
## Custom OpenAI-compatible endpoint
|
|
|
|
|
|
|
|
|
|
If you want to use a custom OpenAI-compatible endpoint (OpenRouter, vLLM, or a proxy),
|
|
|
|
|
you can use the `remote` configuration with the OpenAI provider:
|
|
|
|
|
|
|
|
|
|
```json5
|
|
|
|
|
agents: {
|
|
|
|
|
defaults: {
|
|
|
|
|
memorySearch: {
|
|
|
|
|
provider: "openai",
|
|
|
|
|
model: "text-embedding-3-small",
|
|
|
|
|
remote: {
|
|
|
|
|
baseUrl: "https://api.example.com/v1/",
|
|
|
|
|
apiKey: "YOUR_OPENAI_COMPAT_API_KEY",
|
|
|
|
|
headers: { "X-Custom-Header": "value" }
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
If you don't want to set an API key, use `memorySearch.provider = "local"` or set
|
|
|
|
|
`memorySearch.fallback = "none"`.
|
|
|
|
|
|
|
|
|
|
### Fallbacks
|
|
|
|
|
|
|
|
|
|
- `memorySearch.fallback` can be any registered memory embedding adapter id, or `none`.
|
|
|
|
|
- With the default `memory-core` plugin, valid built-in fallback ids are `openai`, `gemini`, `voyage`, `mistral`, `ollama`, and `local`.
|
|
|
|
|
- The fallback provider is only used when the primary embedding provider fails.
|
|
|
|
|
|
|
|
|
|
### Batch indexing
|
|
|
|
|
|
|
|
|
|
- Disabled by default. Set `agents.defaults.memorySearch.remote.batch.enabled = true` to enable batch indexing for providers whose adapter exposes batch support.
|
|
|
|
|
- Default behavior waits for batch completion; tune `remote.batch.wait`, `remote.batch.pollIntervalMs`, and `remote.batch.timeoutMinutes` if needed.
|
|
|
|
|
- Set `remote.batch.concurrency` to control how many batch jobs we submit in parallel (default: 2).
|
|
|
|
|
- With the default `memory-core` plugin, batch indexing is available for `openai`, `gemini`, and `voyage`.
|
|
|
|
|
- Gemini batch jobs use the async embeddings batch endpoint and require Gemini Batch API availability.
|
|
|
|
|
|
|
|
|
|
Why OpenAI batch is fast and cheap:
|
|
|
|
|
|
|
|
|
|
- For large backfills, OpenAI is typically the fastest option we support because we can submit many embedding requests in a single batch job and let OpenAI process them asynchronously.
|
|
|
|
|
- OpenAI offers discounted pricing for Batch API workloads, so large indexing runs are usually cheaper than sending the same requests synchronously.
|
|
|
|
|
- See the OpenAI Batch API docs and pricing for details:
|
|
|
|
|
- [https://platform.openai.com/docs/api-reference/batch](https://platform.openai.com/docs/api-reference/batch)
|
|
|
|
|
- [https://platform.openai.com/pricing](https://platform.openai.com/pricing)
|
|
|
|
|
|
|
|
|
|
Config example:
|
|
|
|
|
|
|
|
|
|
```json5
|
|
|
|
|
agents: {
|
|
|
|
|
defaults: {
|
|
|
|
|
memorySearch: {
|
|
|
|
|
provider: "openai",
|
|
|
|
|
model: "text-embedding-3-small",
|
|
|
|
|
fallback: "openai",
|
|
|
|
|
remote: {
|
|
|
|
|
batch: { enabled: true, concurrency: 2 }
|
|
|
|
|
},
|
|
|
|
|
sync: { watch: true }
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
}
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
## How the memory tools work
|
|
|
|
|
|
|
|
|
|
- `memory_search` semantically searches Markdown chunks (~400 token target, 80-token overlap) from `MEMORY.md` + `memory/**/*.md`. It returns snippet text (capped ~700 chars), file path, line range, score, provider/model, and whether we fell back from local to remote embeddings. No full file payload is returned.
|
|
|
|
|
- `memory_get` reads a specific memory Markdown file (workspace-relative), optionally from a starting line and for N lines. Paths outside `MEMORY.md` / `memory/` are rejected.
|
|
|
|
|
- Both tools are enabled only when `memorySearch.enabled` resolves true for the agent.
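To make the "~400 token target, 80-token overlap" figures concrete, here is a simplified sliding-window chunker. It is only a sketch under stated assumptions: it works on a pre-tokenized list, whereas the real indexer works on Markdown and will differ in how it counts tokens and where it splits. The name `chunk_tokens` is hypothetical.

```python
from typing import List


def chunk_tokens(
    tokens: List[str], target: int = 400, overlap: int = 80
) -> List[List[str]]:
    """Split tokens into windows of `target` size that share `overlap` tokens."""
    if not 0 <= overlap < target:
        raise ValueError("overlap must be non-negative and smaller than target")
    step = target - overlap  # each new chunk starts this many tokens later
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + target])
        if start + target >= len(tokens):
            break  # this window already reached the end of the input
    return chunks
```

With the defaults, each chunk repeats the last 80 tokens of the previous one, so a sentence that straddles a chunk boundary is still fully contained in at least one chunk.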
|
|
|
|
|
|
|
|
|
|
## What gets indexed (and when)
|
|
|
|
|
|
|
|
|
|
- File type: Markdown only (`MEMORY.md`, `memory/**/*.md`).
|
|
|
|
|
- Index storage: per-agent SQLite at `~/.openclaw/memory/<agentId>.sqlite` (configurable via `agents.defaults.memorySearch.store.path`, supports `{agentId}` token).
|
|
|
|
|
- Freshness: watcher on `MEMORY.md` + `memory/` marks the index dirty (debounce 1.5s). Sync is scheduled on session start, on search, or on an interval and runs asynchronously. Session transcripts use delta thresholds to trigger background sync.
|
|
|
|
|
- Reindex triggers: the index stores the embedding **provider/model + endpoint fingerprint + chunking params**. If any of those change, OpenClaw automatically resets and reindexes the entire store.
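One way to picture the reindex trigger is as a fingerprint comparison. The sketch below is an assumption about the general technique, not a description of OpenClaw's storage format: the function names, the field names, and the use of SHA-256 over canonical JSON are all illustrative.

```python
import hashlib
import json


def index_fingerprint(
    provider: str, model: str, endpoint: str, chunk_target: int, chunk_overlap: int
) -> str:
    """Hash every setting that affects how stored vectors were produced."""
    payload = json.dumps(
        {
            "provider": provider,
            "model": model,
            "endpoint": endpoint,
            "chunking": {"target": chunk_target, "overlap": chunk_overlap},
        },
        sort_keys=True,  # stable key order so equal settings hash equally
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def needs_reindex(stored_fingerprint: str, current_fingerprint: str) -> bool:
    """A mismatch means the existing vectors are not comparable to new ones."""
    return stored_fingerprint != current_fingerprint
```

Comparing a single digest keeps the check cheap at startup while still catching a change to any one of the inputs.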
|
|
|
|
|
|
|
|
|
|
## Hybrid search (BM25 + vector)
|
|
|
|
|
|
|
|
|
|
When enabled, OpenClaw combines:
|
|
|
|
|
|
|
|
|
|
- **Vector similarity** (semantic match, wording can differ)
|
|
|
|
|
- **BM25 keyword relevance** (exact tokens like IDs, env vars, code symbols)
|
|
|
|
|
|
|
|
|
|
If full-text search is unavailable on your platform, OpenClaw falls back to vector-only search.
|
|
|
|
|
|
|
|
|
|
### Why hybrid
|
|
|
|
|
|
|
|
|
|
Vector search is great at "this means the same thing":
|
|
|
|
|
|
|
|
|
|
- "Mac Studio gateway host" vs "the machine running the gateway"
|
|
|
|
|
- "debounce file updates" vs "avoid indexing on every write"
|
|
|
|
|
|
|
|
|
|
But it can be weak at exact, high-signal tokens:
|
|
|
|
|
|
|
|
|
|
- IDs (`a828e60`, `b3b9895a...`)
|
|
|
|
|
- code symbols (`memorySearch.query.hybrid`)
|
|
|
|
|
- error strings ("sqlite-vec unavailable")
|
|
|
|
|
|
|
|
|
|
BM25 (full-text) is the opposite: strong at exact tokens, weaker at paraphrases.
|
|
|
|
|
Hybrid search is the pragmatic middle ground: **use both retrieval signals** so you get
|
|
|
|
|
good results for both "natural language" queries and "needle in a haystack" queries.
|
|
|
|
|
|
|
|
|
|
### How we merge results (the current design)
|
|
|
|
|
|
|
|
|
|
Implementation sketch:
|
|
|
|
|
|
|
|
|
|
1. Retrieve a candidate pool from both sides:
|
|
|
|
|
|
|
|
|
|
- **Vector**: top `maxResults * candidateMultiplier` by cosine similarity.
|
|
|
|
|
- **BM25**: top `maxResults * candidateMultiplier` by FTS5 BM25 rank (lower is better).
|
|
|
|
|
|
|
|
|
|
2. Convert the BM25 rank into a score in the range (0, 1]:
|
|
|
|
|
|
|
|
|
|
- `textScore = 1 / (1 + max(0, bm25Rank))`
|
|
|
|
|
|
|
|
|
|
3. Union candidates by chunk id and compute a weighted score:
|
|
|
|
|
|
|
|
|
|
- `finalScore = vectorWeight * vectorScore + textWeight * textScore`
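The three steps above can be sketched in a few lines of Python. This is an illustration of the formulas only: the function names are hypothetical, candidate retrieval is assumed to have happened already, and treating a chunk that is missing from one side as scoring `0.0` on that side is an assumption of this sketch.

```python
from typing import Dict


def bm25_rank_to_score(bm25_rank: float) -> float:
    """Map a BM25 rank (lower is better) to a score in (0, 1]."""
    return 1.0 / (1.0 + max(0.0, bm25_rank))


def merge_hybrid(
    vector_scores: Dict[str, float],
    bm25_ranks: Dict[str, float],
    vector_weight: float = 0.7,
    text_weight: float = 0.3,
) -> Dict[str, float]:
    """Union candidates by chunk id and compute the weighted final score."""
    # Normalize so the two weights sum to 1.0 (assumes a positive total).
    total = vector_weight + text_weight
    vw, tw = vector_weight / total, text_weight / total
    merged: Dict[str, float] = {}
    for chunk_id in set(vector_scores) | set(bm25_ranks):
        vector_score = vector_scores.get(chunk_id, 0.0)
        if chunk_id in bm25_ranks:
            text_score = bm25_rank_to_score(bm25_ranks[chunk_id])
        else:
            text_score = 0.0  # assumption: no keyword hit contributes nothing
        merged[chunk_id] = vw * vector_score + tw * text_score
    return merged
```

A chunk found by both retrievers accumulates both terms, which is why exact-token matches with decent semantic similarity tend to rise to the top.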
|
|
|
|
|
|
|
|
|
|
Notes:
|
|
|
|
|
|
|
|
|
|
- `vectorWeight` and `textWeight` are normalized to sum to 1.0 during config resolution, so the weights behave as percentages.
|
|
|
|
|
- If embeddings are unavailable (or the provider returns a zero-vector), we still run BM25 and return keyword matches.
|
|
|
|
|
- If FTS5 can't be created, we keep vector-only search (no hard failure).
|
|
|
|
|
- **CJK support**: FTS5 uses configurable trigram tokenization with a short-substring fallback so Chinese, Japanese, and Korean text is searchable without breaking mixed-length queries. CJK-heavy text is also weighted correctly during chunk size estimation, and surrogate-pair characters are preserved during fine splits.
|
|
|
|
|
|
|
|
|
|
This isn't "IR-theory perfect", but it's simple, fast, and tends to improve recall/precision on real notes.
|
|
|
|
|
If we want to get fancier later, common next steps are Reciprocal Rank Fusion (RRF) or score normalization
|
|
|
|
|
(min/max or z-score) before mixing.
|
|
|
|
|
|
|
|
|
|
### Post-processing pipeline
|
|
|
|
|
|
|
|
|
|
After merging vector and keyword scores, two optional post-processing stages
|
|
|
|
|
refine the result list before it reaches the agent:
|
|
|
|
|
|
|
|
|
|
```
|
|
|
|
|
Vector + Keyword -> Weighted Merge -> Temporal Decay -> Sort -> MMR -> Top-K Results
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
Both stages are **off by default** and can be enabled independently.
|
|
|
|
|
|
|
|
|
|
### MMR re-ranking (diversity)
|
|
|
|
|
|
|
|
|
|
When hybrid search returns results, multiple chunks may contain similar or overlapping content.
|
|
|
|
|
For example, searching for "home network setup" might return five nearly identical snippets
|
|
|
|
|
from different daily notes that all mention the same router configuration.
|
|
|
|
|
|
|
|
|
|
**MMR (Maximal Marginal Relevance)** re-ranks the results to balance relevance with diversity,
|
|
|
|
|
ensuring the top results cover different aspects of the query instead of repeating the same information.
|
|
|
|
|
|
|
|
|
|
How it works:
|
|
|
|
|
|
|
|
|
|
1. Results are scored by their original relevance (vector + BM25 weighted score).
|
|
|
|
|
2. MMR iteratively selects results that maximize: `lambda * relevance - (1 - lambda) * max_similarity_to_selected`.
|
|
|
|
|
3. Similarity between results is measured using Jaccard text similarity on tokenized content.
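The selection loop described above can be sketched as a greedy re-ranker. This is a minimal illustration, not the shipped implementation: the names `jaccard` and `mmr_rerank` are hypothetical, and lowercased whitespace splitting stands in for whatever tokenizer the real code uses.

```python
from typing import List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mmr_rerank(
    candidates: List[Tuple[str, float, str]],  # (id, relevance, text)
    lam: float = 0.7,
    top_k: int = 3,
) -> List[str]:
    """Greedily pick results that are relevant but unlike earlier picks."""
    tokens = {cid: set(text.lower().split()) for cid, _, text in candidates}
    relevance = {cid: score for cid, score, _ in candidates}
    remaining = [cid for cid, _, _ in candidates]
    selected: List[str] = []

    def mmr_score(cid: str) -> float:
        # Penalize by the closest match among already-selected results.
        max_sim = max(
            (jaccard(tokens[cid], tokens[s]) for s in selected), default=0.0
        )
        return lam * relevance[cid] - (1.0 - lam) * max_sim

    while remaining and len(selected) < top_k:
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

The first pick is always the most relevant result, because nothing has been selected yet and the diversity penalty is zero.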
|
|
|
|
|
|
|
|
|
|
The `lambda` parameter controls the trade-off:
|
|
|
|
|
|
|
|
|
|
- `lambda = 1.0` -- pure relevance (no diversity penalty)
|
|
|
|
|
- `lambda = 0.0` -- maximum diversity (ignores relevance)
|
|
|
|
|
- Default: `0.7` (balanced, slight relevance bias)
|
|
|
|
|
|
|
|
|
|
**Example -- query: "home network setup"**
|
|
|
|
|
|
|
|
|
|
Given these memory files:
|
|
|
|
|
|
|
|
|
|
```
memory/2026-02-10.md  -> "Configured Omada router, set VLAN 10 for IoT devices"
memory/2026-02-08.md  -> "Configured Omada router, moved IoT to VLAN 10"
memory/2026-02-05.md  -> "Set up AdGuard DNS on 192.168.10.2"
memory/network.md     -> "Router: Omada ER605, AdGuard: 192.168.10.2, VLAN 10: IoT"
```
|
|
|
|
|
|
|
|
|
|
Without MMR -- top 3 results:
|
|
|
|
|
|
|
|
|
|
```
1. memory/2026-02-10.md  (score: 0.92)  <- router + VLAN
2. memory/2026-02-08.md  (score: 0.89)  <- router + VLAN (near-duplicate!)
3. memory/network.md     (score: 0.85)  <- reference doc
```
|
|
|
|
|
|
|
|
|
|
With MMR (lambda=0.7) -- top 3 results:
|
|
|
|
|
|
|
|
|
|
```
1. memory/2026-02-10.md  (score: 0.92)  <- router + VLAN
2. memory/network.md     (score: 0.85)  <- reference doc (diverse!)
3. memory/2026-02-05.md  (score: 0.78)  <- AdGuard DNS (diverse!)
```
|
|
|
|
|
|
|
|
|
|
The near-duplicate from Feb 8 drops out, and the agent gets three distinct pieces of information.
|
|
|
|
|
|
|
|
|
|
**When to enable:** If you notice `memory_search` returning redundant or near-duplicate snippets,
|
|
|
|
|
especially with daily notes that often repeat similar information across days.
|
|
|
|
|
|
|
|
|
|
### Temporal decay (recency boost)
|
|
|
|
|
|
|
|
|
|
Agents with daily notes accumulate hundreds of dated files over time. Without decay,
|
|
|
|
|
a well-worded note from six months ago can outrank yesterday's update on the same topic.
|
|
|
|
|
|
|
|
|
|
**Temporal decay** applies an exponential multiplier to scores based on the age of each result,
|
|
|
|
|
so recent memories naturally rank higher while old ones fade:
|
|
|
|
|
|
|
|
|
|
```
decayedScore = score × e^(-lambda × ageInDays)
```
|
|
|
|
|
|
|
|
|
|
where `lambda = ln(2) / halfLifeDays`.
|
|
|
|
|
|
|
|
|
|
With the default half-life of 30 days:
|
|
|
|
|
|
|
|
|
|
- Today's notes: **100%** of original score
- 7 days ago: **~85%**
- 30 days ago: **50%**
- 90 days ago: **12.5%**
- 180 days ago: **~1.6%**
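
The multipliers in this list follow directly from the formula. A short Python sketch (the function name `decay_multiplier` is made up for illustration) reproduces them for any half-life:

```python
import math


def decay_multiplier(age_days: float, half_life_days: float = 30.0) -> float:
    # lambda = ln(2) / halfLifeDays, so the multiplier halves every half-life.
    lam = math.log(2) / half_life_days
    return math.exp(-lam * age_days)


for age in (0, 7, 30, 90, 180):
    print(f"{age:>3} days: {decay_multiplier(age):.1%}")
```

Passing a different `half_life_days` (for example 90) shows how a longer half-life flattens the curve.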
|
|
|
|
|
|
|
|
|
|
**Evergreen files are never decayed:**
|
|
|
|
|
|
|
|
|
|
- `MEMORY.md` (root memory file)
|
|
|
|
|
- Non-dated files in `memory/` (e.g., `memory/projects.md`, `memory/network.md`)
|
|
|
|
|
- These contain durable reference information that should always rank normally.
|
|
|
|
|
|
|
|
|
|
**Dated daily files** (`memory/YYYY-MM-DD.md`) use the date extracted from the filename.
|
|
|
|
|
Other sources (e.g., session transcripts) fall back to file modification time (`mtime`).
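
The age rules above (filename date for daily notes, no decay for evergreen files, `mtime` for everything else) can be sketched as a single lookup. This is an illustration only: the function name, the path matching, and the choice to return `None` for "never decayed" are assumptions, not OpenClaw's actual code.

```python
import re
from datetime import date, datetime
from typing import Optional

# Assumed pattern for dated daily notes: memory/YYYY-MM-DD.md
DATED_NOTE = re.compile(r"^memory/(\d{4})-(\d{2})-(\d{2})\.md$")


def age_in_days(path: str, today: date, mtime: Optional[datetime] = None) -> Optional[int]:
    """Return the age used for decay, or None for evergreen files."""
    match = DATED_NOTE.match(path)
    if match:
        year, month, day = map(int, match.groups())
        return max((today - date(year, month, day)).days, 0)
    if path == "MEMORY.md" or path.startswith("memory/"):
        return None  # evergreen: root memory file or non-dated file in memory/
    if mtime is not None:
        return max((today - mtime.date()).days, 0)  # e.g. session transcripts
    return None


today = date(2026, 2, 10)
print(age_in_days("memory/2025-09-15.md", today))
print(age_in_days("memory/network.md", today))
print(age_in_days("sessions/example.jsonl", today, mtime=datetime(2026, 2, 3, 9, 30)))
```

Returning `None` rather than `0` keeps "evergreen" distinct from "written today", which makes it easy for a caller to skip the decay step entirely.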
|
|
|
|
|
|
|
|
|
|
**Example -- query: "what's Rod's work schedule?"**
|
|
|
|
|
|
|
|
|
|
Given these memory files (today is Feb 10):
|
|
|
|
|
|
|
|
|
|
```
memory/2025-09-15.md  -> "Rod works Mon-Fri, standup at 10am, pairing at 2pm"  (148 days old)
memory/2026-02-10.md  -> "Rod has standup at 14:15, 1:1 with Zeb at 14:45"     (today)
memory/2026-02-03.md  -> "Rod started new team, standup moved to 14:15"        (7 days old)
```
|
|
|
|
|
|
|
|
|
|
Without decay:
|
|
|
|
|
|
|
|
|
|
```
1. memory/2025-09-15.md  (score: 0.91)  <- best semantic match, but stale!
2. memory/2026-02-10.md  (score: 0.82)
3. memory/2026-02-03.md  (score: 0.80)
```
|
|
|
|
|
|
|
|
|
|
With decay (halfLife=30):
|
|
|
|
|
|
|
|
|
|
```
1. memory/2026-02-10.md  (score: 0.82 × 1.00 = 0.82)  <- today, no decay
2. memory/2026-02-03.md  (score: 0.80 × 0.85 = 0.68)  <- 7 days, mild decay
3. memory/2025-09-15.md  (score: 0.91 × 0.03 = 0.03)  <- 148 days, nearly gone
```
|
|
|
|
|
|
|
|
|
|
The stale September note drops to the bottom despite having the best raw semantic match.
|
|
|
|
|
|
|
|
|
|
**When to enable:** If your agent has months of daily notes and you find that old,
|
|
|
|
|
stale information outranks recent context. A half-life of 30 days works well for
|
|
|
|
|
daily-note-heavy workflows; increase it (e.g., 90 days) if you reference older notes frequently.
|
|
|
|
|
|
|
|
|
|
### Hybrid search configuration
|
|
|
|
|
|
|
|
|
|
Both features are configured under `memorySearch.query.hybrid`:
|
|
|
|
|
|
|
|
|
|
```json5
agents: {
  defaults: {
    memorySearch: {
      query: {
        hybrid: {
          enabled: true,
          vectorWeight: 0.7,
          textWeight: 0.3,
          candidateMultiplier: 4,
          // Diversity: reduce redundant results
          mmr: {
            enabled: true,    // default: false
            lambda: 0.7       // 0 = max diversity, 1 = max relevance
          },
          // Recency: boost newer memories
          temporalDecay: {
            enabled: true,    // default: false
            halfLifeDays: 30  // score halves every 30 days
          }
        }
      }
    }
  }
}
```
|
|
|
|
|
|
|
|
|
|
You can enable either feature independently:
|
|
|
|
|
|
|
|
|
|
- **MMR only** -- useful when you have many similar notes but age doesn't matter.
|
|
|
|
|
- **Temporal decay only** -- useful when recency matters but your results are already diverse.
|
|
|
|
|
- **Both** -- recommended for agents with large, long-running daily note histories.
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
|
## Embedding cache
|
|
|
|
|
|
|
|
|
|
OpenClaw can cache **chunk embeddings** in SQLite so reindexing and frequent updates (especially session transcripts) don't re-embed unchanged text.
|
|
|
|
|
| Key                | Type      | Default | Description                      |
| ------------------ | --------- | ------- | -------------------------------- |
| `cache.enabled`    | `boolean` | `false` | Cache chunk embeddings in SQLite |
| `cache.maxEntries` | `number`  | `50000` | Max cached embeddings            |
|
|
|
|
|
|
|
|
|
|
Config:
|
|
|
|
|
|
|
|
|
|
```json5
agents: {
  defaults: {
    memorySearch: {
      cache: {
        enabled: true,
        maxEntries: 50000
      }
    }
  }
}
```
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
|
## Batch indexing
|
|
|
|
|
|
|
|
|
|
| Key                           | Type      | Default | Description                |
| ----------------------------- | --------- | ------- | -------------------------- |
| `remote.batch.enabled`        | `boolean` | `false` | Enable batch embedding API |
| `remote.batch.concurrency`    | `number`  | `2`     | Parallel batch jobs        |
| `remote.batch.wait`           | `boolean` | `true`  | Wait for batch completion  |
| `remote.batch.pollIntervalMs` | `number`  | --      | Poll interval              |
| `remote.batch.timeoutMinutes` | `number`  | --      | Batch timeout              |
|
|
|
|
|
|
|
|
|
|
Available for `openai`, `gemini`, and `voyage`. OpenAI batch is typically
|
|
|
|
|
fastest and cheapest for large backfills.
|
|
|
|
|
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
|
## Session memory search (experimental)
|
|
|
|
|
|
|
|
|
|
You can optionally index **session transcripts** and surface them via `memory_search`.
This is gated behind an experimental flag:
|
|
|
|
|
|
|
|
|
|
```json5
agents: {
  defaults: {
    memorySearch: {
      experimental: { sessionMemory: true },
      sources: ["memory", "sessions"]
    }
  }
}
```
|
|
|
|
|
| Key                           | Type       | Default      | Description                             |
| ----------------------------- | ---------- | ------------ | --------------------------------------- |
| `experimental.sessionMemory`  | `boolean`  | `false`      | Enable session indexing                 |
| `sources`                     | `string[]` | `["memory"]` | Add `"sessions"` to include transcripts |
| `sync.sessions.deltaBytes`    | `number`   | `100000`     | Byte threshold for reindex              |
| `sync.sessions.deltaMessages` | `number`   | `50`         | Message threshold for reindex           |
|
|
|
|
|
|
|
|
|
|
Notes:
|
|
|
|
|
|
|
|
|
|
- Session indexing is **opt-in** (off by default).
|
|
|
|
|
- Session updates are debounced and **indexed asynchronously** once they cross delta thresholds (best-effort).
|
|
|
|
|
- `memory_search` never blocks on indexing; results can be slightly stale until background sync finishes.
|
|
|
|
|
- Results still include snippets only; `memory_get` remains limited to memory files.
|
|
|
|
|
- Session indexing is isolated per agent (only that agent's session logs are indexed).
|
|
|
|
|
- Session logs live on disk (`~/.openclaw/agents/<agentId>/sessions/*.jsonl`). Any process/user with filesystem access can read them, so treat disk access as the trust boundary. For stricter isolation, run agents under separate OS users or hosts.
|
|
|
|
|
|
|
|
|
|
Delta thresholds (defaults shown):
|
|
|
|
|
|
|
|
|
|
```json5
agents: {
  defaults: {
    memorySearch: {
      sync: {
        sessions: {
          deltaBytes: 100000,   // ~100 KB
          deltaMessages: 50     // JSONL lines
        }
      }
    }
  }
}
```
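
As a rough mental model, the delta thresholds act as a gate on background reindexing. The Python sketch below is hypothetical: the `SessionDelta` type and `should_reindex` function are invented for illustration, and it assumes that crossing *either* threshold is enough to trigger a reindex, which the text above does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class SessionDelta:
    # Growth of a session log since it was last indexed (hypothetical bookkeeping).
    bytes_since_index: int = 0
    messages_since_index: int = 0


def should_reindex(delta: SessionDelta,
                   delta_bytes: int = 100_000,
                   delta_messages: int = 50) -> bool:
    # Assumption: either threshold alone schedules a background reindex.
    return (delta.bytes_since_index >= delta_bytes
            or delta.messages_since_index >= delta_messages)


print(should_reindex(SessionDelta(bytes_since_index=20_000, messages_since_index=12)))
print(should_reindex(SessionDelta(bytes_since_index=20_000, messages_since_index=50)))
```

Lower thresholds keep session results fresher at the cost of more frequent embedding work; higher thresholds do the opposite.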
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
|
## SQLite vector acceleration (sqlite-vec)
|
|
|
|
|
|
|
|
|
|
When the sqlite-vec extension is available, OpenClaw stores embeddings in a
|
|
|
|
|
SQLite virtual table (`vec0`) and performs vector distance queries in the
|
|
|
|
|
database. This keeps search fast without loading every embedding into JS.
|
|
|
|
|
| Key                          | Type      | Default | Description                       |
| ---------------------------- | --------- | ------- | --------------------------------- |
| `store.vector.enabled`       | `boolean` | `true`  | Use sqlite-vec for vector queries |
| `store.vector.extensionPath` | `string`  | bundled | Override sqlite-vec path          |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
When sqlite-vec is unavailable, OpenClaw falls back to in-process cosine
|
|
|
|
|
similarity automatically.
|
|
|
|
|
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
|
## Index storage
|
|
|
|
|
|
|
|
|
|
| Key                   | Type     | Default                               | Description                                 |
| --------------------- | -------- | ------------------------------------- | ------------------------------------------- |
| `store.path`          | `string` | `~/.openclaw/memory/{agentId}.sqlite` | Index location (supports `{agentId}` token) |
| `store.fts.tokenizer` | `string` | `unicode61`                           | FTS5 tokenizer (`unicode61` or `trigram`)   |
|
|
|
|
|
|
|
|
|
|
---
|
|
|
|
|
|
|
|
|
|
## QMD backend config
|
|
|
|
|
|
|
|
|
|
Set `memory.backend = "qmd"` to enable. All QMD settings live under
|
|
|
|
|
`memory.qmd`:
|
|
|
|
|
|
|
|
|
|
| Key                      | Type      | Default  | Description                                  |
| ------------------------ | --------- | -------- | -------------------------------------------- |
| `command`                | `string`  | `qmd`    | QMD executable path                          |
| `searchMode`             | `string`  | `search` | Search command: `search`, `vsearch`, `query` |
| `includeDefaultMemory`   | `boolean` | `true`   | Auto-index `MEMORY.md` + `memory/**/*.md`    |
| `paths[]`                | `array`   | --       | Extra paths: `{ name, path, pattern? }`      |
| `sessions.enabled`       | `boolean` | `false`  | Index session transcripts                    |
| `sessions.retentionDays` | `number`  | --       | Transcript retention                         |
| `sessions.exportDir`     | `string`  | --       | Export directory                             |
|
|
|
|
|
|
|
|
|
|
### Update schedule
|
|
|
|
|
|
|
|
|
|
| Key                       | Type      | Default | Description                           |
| ------------------------- | --------- | ------- | ------------------------------------- |
| `update.interval`         | `string`  | `5m`    | Refresh interval                      |
| `update.debounceMs`       | `number`  | `15000` | Debounce file changes                 |
| `update.onBoot`           | `boolean` | `true`  | Refresh on startup                    |
| `update.waitForBootSync`  | `boolean` | `false` | Block startup until refresh completes |
| `update.embedInterval`    | `string`  | --      | Separate embed cadence                |
| `update.commandTimeoutMs` | `number`  | --      | Timeout for QMD commands              |
|
|
|
|
|
|
|
|
|
|
### Limits
|
|
|
|
|
|
|
|
|
|
| Key                       | Type     | Default | Description                |
| ------------------------- | -------- | ------- | -------------------------- |
| `limits.maxResults`       | `number` | `6`     | Max search results         |
| `limits.maxSnippetChars`  | `number` | --      | Clamp snippet length       |
| `limits.maxInjectedChars` | `number` | --      | Clamp total injected chars |
| `limits.timeoutMs`        | `number` | `4000`  | Search timeout             |
|
|
|
|
|
|
|
|
|
|
### Scope
|
|
|
|
|
|
|
|
|
|
Controls which sessions can receive QMD search results. Same schema as
|
|
|
|
|
[`session.sendPolicy`](/gateway/configuration-reference#session):
|
|
|
|
|
|
|
|
|
|
```json5
{
  memory: {
    qmd: {
      scope: {
        default: "deny",
        rules: [{ action: "allow", match: { chatType: "direct" } }],
      },
    },
  },
}
```

Default is DM-only. `match.keyPrefix` matches the normalized session key;
`match.rawKeyPrefix` matches the raw key including `agent:<id>:`.

The sqlite-vec settings described under "SQLite vector acceleration (sqlite-vec)"
can be set like this (optional):

```json5
agents: {
  defaults: {
    memorySearch: {
      store: {
        vector: {
          enabled: true,
          extensionPath: "/path/to/sqlite-vec"
        }
      }
    }
  }
}
```

Notes:

- `enabled` defaults to true; when disabled, search falls back to in-process
  cosine similarity over stored embeddings.
- If the sqlite-vec extension is missing or fails to load, OpenClaw logs the
  error and continues with the JS fallback (no vector table).
- `extensionPath` overrides the bundled sqlite-vec path (useful for custom builds
  or non-standard install locations).
|
|
|
|
|
### Citations

`memory.citations` applies to all backends:

| Value            | Behavior                                            |
| ---------------- | --------------------------------------------------- |
| `auto` (default) | Include `Source: <path#line>` footer in snippets    |
| `on`             | Always include footer                               |
| `off`            | Omit footer (path still passed to agent internally) |

### Full QMD example

```json5
{
  memory: {
    backend: "qmd",
    citations: "auto",
    qmd: {
      includeDefaultMemory: true,
      update: { interval: "5m", debounceMs: 15000 },
      limits: { maxResults: 6, timeoutMs: 4000 },
      scope: {
        default: "deny",
        rules: [{ action: "allow", match: { chatType: "direct" } }],
      },
      paths: [{ name: "docs", path: "~/notes", pattern: "**/*.md" }],
    },
  },
}
```

## Local embedding auto-download

- Default local embedding model: `hf:ggml-org/embeddinggemma-300m-qat-q8_0-GGUF/embeddinggemma-300m-qat-Q8_0.gguf` (~0.6 GB).
- When `memorySearch.provider = "local"`, `node-llama-cpp` resolves `modelPath`; if the GGUF is missing, it **auto-downloads** to the cache (or `local.modelCacheDir` if set), then loads it. Downloads resume on retry.
- Native build requirement: run `pnpm approve-builds`, pick `node-llama-cpp`, then `pnpm rebuild node-llama-cpp`.
- Fallback: if local setup fails and `memorySearch.fallback = "openai"`, OpenClaw automatically switches to remote embeddings (`openai/text-embedding-3-small` unless overridden) and records the reason.

## Custom OpenAI-compatible endpoint example

```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "openai",
      model: "text-embedding-3-small",
      remote: {
        baseUrl: "https://api.example.com/v1/",
        apiKey: "YOUR_REMOTE_API_KEY",
        headers: {
          "X-Organization": "org-id",
          "X-Project": "project-id"
        }
      }
    }
  }
}
```

Notes:

- `remote.*` takes precedence over `models.providers.openai.*`.
- `remote.headers` merge with OpenAI headers; remote wins on key conflicts. Omit `remote.headers` to use the OpenAI defaults.
|
|
|
|
|
|