Files
openclaw/docs/concepts/memory-search.md

249 lines
7.6 KiB
Markdown

---
title: "Memory Search"
summary: "How OpenClaw memory search works -- embedding providers, hybrid search, MMR, and temporal decay"
read_when:
- You want to understand how memory_search retrieves results
- You want to tune hybrid search, MMR, or temporal decay
- You want to choose an embedding provider
---
# Memory Search
OpenClaw indexes workspace memory files (`MEMORY.md` and `memory/*.md`) into
chunks (~400 tokens, 80-token overlap) and searches them with `memory_search`.
This page explains how the search pipeline works and how to tune it. For the
file layout and memory basics, see [Memory](/concepts/memory).
## Search pipeline
```
Query -> Embedding -> Vector Search ─┐
├─> Weighted Merge -> Temporal Decay -> MMR -> Top-K
Query -> Tokenize -> BM25 Search ──┘
```
Both retrieval paths run in parallel when hybrid search is enabled. If either
path is unavailable (no embeddings or no FTS5), the other runs alone.
## Embedding providers
The default `memory-core` plugin ships built-in adapters for these providers:
| Provider | Adapter ID | Auto-selected | Notes |
| ---------- | ---------- | -------------------- | ----------------------------------- |
| Local GGUF | `local` | Yes (first priority) | node-llama-cpp, ~0.6 GB model |
| OpenAI | `openai` | Yes | `text-embedding-3-small` default |
| Gemini | `gemini` | Yes | Supports multimodal (images, audio) |
| Voyage | `voyage` | Yes | |
| Mistral | `mistral` | Yes | |
| Ollama | `ollama` | No (explicit only) | Local/self-hosted |
Auto-selection picks the first provider whose API key can be resolved. Set
`memorySearch.provider` explicitly to override.
Remote embeddings require an API key for the embedding provider. OpenClaw
resolves keys from auth profiles, `models.providers.*.apiKey`, or environment
variables. Codex OAuth covers chat/completions only and does not satisfy
embedding requests.
### Quick start
Enable memory search with OpenAI embeddings:
```json5
{
agents: {
defaults: {
memorySearch: {
provider: "openai",
model: "text-embedding-3-small",
},
},
},
}
```
Or use local embeddings (no API key needed):
```json5
{
agents: {
defaults: {
memorySearch: {
provider: "local",
},
},
},
}
```
Local mode uses node-llama-cpp and may require `pnpm approve-builds` to build
the native addon.
## Hybrid search (BM25 + vector)
When both FTS5 and embeddings are available, OpenClaw combines two retrieval
signals:
- **Vector similarity** -- semantic matching. Good at paraphrases ("Mac Studio
gateway host" vs "the machine running the gateway").
- **BM25 keyword relevance** -- exact token matching. Good at IDs, code symbols,
error strings, and config keys.
### How scores are merged
1. Retrieve a candidate pool from each side (top
`maxResults x candidateMultiplier`).
2. Convert BM25 rank to a 0-1 score: `textScore = 1 / (1 + max(0, bm25Rank))`.
3. Union candidates by chunk ID and compute:
`finalScore = vectorWeight x vectorScore + textWeight x textScore`.
Weights are normalized to 1.0, so they behave as percentages. If either path is
unavailable, the other runs alone with no hard failure.
### CJK support
FTS5 uses configurable trigram tokenization with a short-substring fallback so
Chinese, Japanese, and Korean text is searchable. CJK-heavy text is weighted
correctly during chunk-size estimation, and surrogate-pair characters are
preserved during fine splits.
## Post-processing
After merging scores, two optional stages refine the result list:
### Temporal decay (recency boost)
Daily notes accumulate over months. Without decay, a well-worded note from six
months ago can outrank yesterday's update on the same topic.
Temporal decay applies an exponential multiplier based on age:
```
decayedScore = score x e^(-lambda x ageInDays)
```
With the default half-life of 30 days:
| Age | Score retained |
| -------- | -------------- |
| Today | 100% |
| 7 days | ~84% |
| 30 days | 50% |
| 90 days | 12.5% |
| 180 days | ~1.6% |
**Evergreen files are never decayed** -- `MEMORY.md` and non-dated files in
`memory/` (like `memory/projects.md`) always rank at full score. Dated daily
files use the date from the filename.
**When to enable:** Your agent has months of daily notes and stale information
outranks recent context.
### MMR re-ranking (diversity)
When search returns results, multiple chunks may contain similar or overlapping
content. MMR (Maximal Marginal Relevance) re-ranks results to balance relevance
with diversity.
How it works:
1. Start with the highest-scoring result.
2. Iteratively select the next result that maximizes:
`lambda x relevance - (1 - lambda) x max_similarity_to_already_selected`.
3. Similarity is measured using Jaccard text similarity on tokenized content.
The `lambda` parameter controls the trade-off:
- `1.0` -- pure relevance (no diversity penalty).
- `0.0` -- maximum diversity (ignores relevance).
- Default: `0.7` (balanced, slight relevance bias).
**When to enable:** `memory_search` returns redundant or near-duplicate
snippets, especially with daily notes that repeat similar information.
## Configuration
Both post-processing features and hybrid search weights are configured under
`memorySearch.query.hybrid`:
```json5
{
agents: {
defaults: {
memorySearch: {
query: {
hybrid: {
enabled: true,
vectorWeight: 0.7,
textWeight: 0.3,
candidateMultiplier: 4,
mmr: {
enabled: true, // default: false
lambda: 0.7,
},
temporalDecay: {
enabled: true, // default: false
halfLifeDays: 30,
},
},
},
},
},
},
}
```
You can enable either feature independently:
- **MMR only** -- many similar notes but age does not matter.
- **Temporal decay only** -- recency matters but results are already diverse.
- **Both** -- recommended for agents with large, long-running daily note
histories.
## Session memory search (experimental)
You can optionally index session transcripts and surface them via
`memory_search`. This is gated behind an experimental flag:
```json5
{
agents: {
defaults: {
memorySearch: {
experimental: { sessionMemory: true },
sources: ["memory", "sessions"],
},
},
},
}
```
Session indexing is opt-in and runs asynchronously. Results can be slightly stale
until background sync finishes. Session logs live on disk, so treat filesystem
access as the trust boundary.
## Troubleshooting
**`memory_search` returns nothing?**
- Check `openclaw memory status` -- is the index populated?
- Verify an embedding provider is configured and has a valid key.
- Run `openclaw memory index --force` to trigger a full reindex.
**Results are all keyword matches, no semantic results?**
- Embeddings may not be configured. Check `openclaw memory status --deep`.
- If using `local`, ensure node-llama-cpp built successfully.
**CJK text not found?**
- FTS5 trigram tokenization handles CJK. If results are missing, run
`openclaw memory index --force` to rebuild the FTS index.
## Further reading
- [Memory](/concepts/memory) -- file layout, backends, tools
- [Memory configuration reference](/reference/memory-config) -- all config knobs
including QMD, batch indexing, embedding cache, sqlite-vec, and multimodal