title: Memory Search
summary: How OpenClaw memory search works -- embedding providers, hybrid search, MMR, and temporal decay
read_when:
  • You want to understand how memory_search retrieves results
  • You want to tune hybrid search, MMR, or temporal decay
  • You want to choose an embedding provider

Memory Search

OpenClaw indexes workspace memory files (MEMORY.md and memory/*.md) into chunks (~400 tokens, 80-token overlap) and searches them with memory_search. This page explains how the search pipeline works and how to tune it. For the file layout and memory basics, see Memory.
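
The chunk size and overlap above can be pictured as a sliding window over a token stream. The following is a minimal sketch under stated assumptions, not OpenClaw's chunker: `chunkTokens` is a hypothetical helper, and the string tokens stand in for whatever tokenizer the indexer actually uses.

```typescript
// Hypothetical sketch: sliding-window chunking with overlap.
// Plain string tokens stand in for real model tokens.
function chunkTokens(tokens: string[], size = 400, overlap = 80): string[][] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("require size > 0 and 0 <= overlap < size");
  }
  const step = size - overlap; // each window starts 320 tokens after the previous one
  const chunks: string[][] = [];
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + size));
    if (start + size >= tokens.length) break; // this window already reached the end
  }
  return chunks;
}

// 1000 synthetic tokens produce windows starting at 0, 320, and 640.
const demoTokens = Array.from({ length: 1000 }, (_, i) => `t${i}`);
const demoChunks = chunkTokens(demoTokens);
console.log(demoChunks.map((c) => c.length)); // [400, 400, 360]
```

Each chunk repeats the last 80 tokens of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk.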

Search pipeline

Query -> Embedding -> Vector Search ─┐
                                     ├─> Weighted Merge -> Temporal Decay -> MMR -> Top-K
Query -> Tokenize  -> BM25 Search  ──┘

Both retrieval paths run in parallel when hybrid search is enabled. If either path is unavailable (no embeddings or no FTS5), the other runs alone.

Embedding providers

The default memory-core plugin ships built-in adapters for these providers:

| Provider | Adapter ID | Auto-selected | Notes |
| --- | --- | --- | --- |
| Local GGUF | local | Yes (first priority) | node-llama-cpp, ~0.6 GB model |
| OpenAI | openai | Yes | text-embedding-3-small default |
| Gemini | gemini | Yes | Supports multimodal (images, audio) |
| Voyage | voyage | Yes | |
| Mistral | mistral | Yes | |
| Ollama | ollama | No (explicit only) | Local/self-hosted |

Auto-selection picks the first provider whose API key can be resolved. Set memorySearch.provider explicitly to override.

Remote embeddings require an API key for the embedding provider. OpenClaw resolves keys from auth profiles, models.providers.*.apiKey, or environment variables. Codex OAuth covers chat/completions only and does not satisfy embedding requests.

Quick start

Enable memory search with OpenAI embeddings:

{
  agents: {
    defaults: {
      memorySearch: {
        provider: "openai",
        model: "text-embedding-3-small",
      },
    },
  },
}

Or use local embeddings (no API key needed):

{
  agents: {
    defaults: {
      memorySearch: {
        provider: "local",
      },
    },
  },
}

Local mode uses node-llama-cpp and may require pnpm approve-builds to build the native addon.

Hybrid search (BM25 + vector)

When both FTS5 and embeddings are available, OpenClaw combines two retrieval signals:

  • Vector similarity -- semantic matching. Good at paraphrases ("Mac Studio gateway host" vs "the machine running the gateway").
  • BM25 keyword relevance -- exact token matching. Good at IDs, code symbols, error strings, and config keys.

How scores are merged

  1. Retrieve a candidate pool from each side (top maxResults x candidateMultiplier).
  2. Convert BM25 rank to a 0-1 score: textScore = 1 / (1 + max(0, bm25Rank)).
  3. Union candidates by chunk ID and compute: finalScore = vectorWeight x vectorScore + textWeight x textScore.

Weights are normalized so that they sum to 1.0, which lets you read them as percentages. If either path is unavailable, the other runs alone rather than failing.
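
The merge steps above can be sketched as follows. This is an illustration under stated assumptions, not OpenClaw's code: `mergeHybrid`, `VectorHit`, and `TextHit` are hypothetical names, and a chunk found by only one side is assumed to score 0 on the other.

```typescript
// Hypothetical sketch of the weighted merge; all names are illustrative.
interface VectorHit { id: string; vectorScore: number } // similarity in 0..1
interface TextHit { id: string; bm25Rank: number }      // rank value fed to the formula above

function mergeHybrid(
  vectorHits: VectorHit[],
  textHits: TextHit[],
  vectorWeight = 0.7,
  textWeight = 0.3,
): Array<{ id: string; finalScore: number }> {
  // Normalize the weights so they sum to 1.0.
  const total = vectorWeight + textWeight;
  const vw = vectorWeight / total;
  const tw = textWeight / total;

  // Union candidates by chunk ID. Assumption: a missing side contributes 0.
  const scores = new Map<string, { vector: number; text: number }>();
  for (const hit of vectorHits) {
    scores.set(hit.id, { vector: hit.vectorScore, text: 0 });
  }
  for (const hit of textHits) {
    const textScore = 1 / (1 + Math.max(0, hit.bm25Rank));
    const entry = scores.get(hit.id) ?? { vector: 0, text: 0 };
    entry.text = textScore;
    scores.set(hit.id, entry);
  }

  return Array.from(scores.entries())
    .map(([id, s]) => ({ id, finalScore: vw * s.vector + tw * s.text }))
    .sort((a, b) => b.finalScore - a.finalScore);
}

const mergedDemo = mergeHybrid(
  [{ id: "a", vectorScore: 0.9 }, { id: "b", vectorScore: 0.6 }],
  [{ id: "b", bm25Rank: 0 }, { id: "c", bm25Rank: 1 }],
);
console.log(mergedDemo.map((r) => r.id)); // ["b", "a", "c"]
```

Chunk "b" wins because it is found by both paths (0.7 x 0.6 + 0.3 x 1.0 = 0.72), even though "a" has the higher vector score on its own (0.7 x 0.9 = 0.63).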

CJK support

FTS5 uses configurable trigram tokenization with a short-substring fallback so Chinese, Japanese, and Korean text is searchable. CJK-heavy text is weighted correctly during chunk-size estimation, and surrogate-pair characters are preserved during fine splits.
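
To illustrate why trigrams make unsegmented CJK text searchable, here is a small sketch that splits a string into overlapping three-character windows by code point. It is not OpenClaw's tokenizer or the FTS5 trigram implementation: `trigrams` is a hypothetical helper, and returning the whole string for inputs shorter than three characters is only one assumed reading of "short-substring fallback".

```typescript
// Illustrative sketch only: code-point-aware trigrams.
function trigrams(text: string): string[] {
  // Array.from iterates by code point, so surrogate pairs stay intact.
  const chars = Array.from(text);
  if (chars.length < 3) {
    // Assumed fallback: index a short string as a single term.
    return chars.length > 0 ? [chars.join("")] : [];
  }
  const grams: string[] = [];
  for (let i = 0; i + 3 <= chars.length; i++) {
    grams.push(chars.slice(i, i + 3).join(""));
  }
  return grams;
}

console.log(trigrams("東京都庁")); // ["東京都", "京都庁"]
console.log(trigrams("𠮷野家"));   // ["𠮷野家"] ("𠮷" is a surrogate pair in UTF-16)
```

Because every three-character window is indexed, a query matches without needing word boundaries, which Chinese and Japanese text does not mark with spaces.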

Post-processing

After merging scores, two optional stages refine the result list:

Temporal decay (recency boost)

Daily notes accumulate over months. Without decay, a well-worded note from six months ago can outrank yesterday's update on the same topic.

Temporal decay applies an exponential multiplier based on age:

decayedScore = score x e^(-lambda x ageInDays)

where lambda = ln(2) / halfLifeDays, so a result's score halves every halfLifeDays days.

With the default half-life of 30 days:

| Age | Score retained |
| --- | --- |
| Today | 100% |
| 7 days | ~85% |
| 30 days | 50% |
| 90 days | 12.5% |
| 180 days | ~1.6% |

Evergreen files are never decayed -- MEMORY.md and non-dated files in memory/ (like memory/projects.md) always rank at full score. Dated daily files use the date from the filename.
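
The decay formula can be sketched in a few lines. This is a minimal illustration, not OpenClaw's implementation: `decayMultiplier` and `applyDecay` are hypothetical names, and passing `null` as the age is just this sketch's way of modeling an evergreen file.

```typescript
// Sketch of the decay multiplier; function names and shapes are illustrative.
function decayMultiplier(ageInDays: number, halfLifeDays = 30): number {
  const lambda = Math.LN2 / halfLifeDays; // the score halves every halfLifeDays
  return Math.exp(-lambda * Math.max(0, ageInDays));
}

function applyDecay(score: number, ageInDays: number | null, halfLifeDays = 30): number {
  // Assumption of this sketch: null marks an evergreen file, which is never decayed.
  return ageInDays === null ? score : score * decayMultiplier(ageInDays, halfLifeDays);
}

// Print the retained percentage for the ages listed in the table above.
for (const age of [0, 7, 30, 90, 180]) {
  console.log(`${age} days: ${(decayMultiplier(age) * 100).toFixed(1)}%`);
}
```

A shorter halfLifeDays makes old notes fade faster; a longer one keeps them competitive for longer.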

When to enable: Your agent has months of daily notes and stale information outranks recent context.

MMR re-ranking (diversity)

When search returns results, multiple chunks may contain similar or overlapping content. MMR (Maximal Marginal Relevance) re-ranks results to balance relevance with diversity.

How it works:

  1. Start with the highest-scoring result.
  2. Iteratively select the next result that maximizes: lambda x relevance - (1 - lambda) x max_similarity_to_already_selected.
  3. Similarity is measured using Jaccard text similarity on tokenized content.

The lambda parameter (mmr.lambda in the config, unrelated to the decay constant in the temporal decay formula) controls the trade-off:

  • 1.0 -- pure relevance (no diversity penalty).
  • 0.0 -- maximum diversity (ignores relevance).
  • Default: 0.7 (balanced, slight relevance bias).
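
The selection loop described above can be sketched as a greedy re-ranker. This is an illustration under stated assumptions rather than OpenClaw's implementation: `mmrRerank`, `jaccard`, and `tokenizeWords` are hypothetical names, tokenization is a plain lowercase whitespace split, and scores are assumed to lie in 0..1.

```typescript
// Hypothetical sketch of greedy MMR with Jaccard similarity.
interface Candidate { id: string; text: string; score: number }

function tokenizeWords(text: string): Set<string> {
  return new Set(text.toLowerCase().split(/\s+/).filter((t) => t.length > 0));
}

function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 1;
  let intersection = 0;
  a.forEach((t) => { if (b.has(t)) intersection++; });
  return intersection / (a.size + b.size - intersection);
}

function mmrRerank(candidates: Candidate[], lambda = 0.7, topK = candidates.length): Candidate[] {
  // Sort by relevance so ties resolve in favor of the higher-scoring result.
  const remaining = candidates
    .map((c) => ({ c, tokens: tokenizeWords(c.text) }))
    .sort((x, y) => y.c.score - x.c.score);
  const selected: typeof remaining = [];

  while (remaining.length > 0 && selected.length < topK) {
    let bestIndex = 0;
    let bestValue = -Infinity;
    remaining.forEach((item, i) => {
      // Penalize similarity to the closest already-selected result.
      let maxSim = 0;
      for (const s of selected) maxSim = Math.max(maxSim, jaccard(item.tokens, s.tokens));
      const value = lambda * item.c.score - (1 - lambda) * maxSim;
      if (value > bestValue) { bestValue = value; bestIndex = i; }
    });
    selected.push(...remaining.splice(bestIndex, 1));
  }
  return selected.map((s) => s.c);
}

const mmrDemo = mmrRerank([
  { id: "a", text: "gateway runs on mac studio", score: 0.9 },
  { id: "b", text: "gateway runs on the mac studio", score: 0.85 },
  { id: "c", text: "backup job runs nightly", score: 0.7 },
]);
console.log(mmrDemo.map((r) => r.id)); // ["a", "c", "b"]
```

With lambda = 0.7 the near-duplicate "b" drops below the less relevant but distinct "c"; with lambda = 1.0 the order is plain relevance ("a", "b", "c").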

When to enable: memory_search returns redundant or near-duplicate snippets, especially with daily notes that repeat similar information.

Configuration

Both post-processing features and hybrid search weights are configured under memorySearch.query.hybrid:

{
  agents: {
    defaults: {
      memorySearch: {
        query: {
          hybrid: {
            enabled: true,
            vectorWeight: 0.7,
            textWeight: 0.3,
            candidateMultiplier: 4,
            mmr: {
              enabled: true, // default: false
              lambda: 0.7,
            },
            temporalDecay: {
              enabled: true, // default: false
              halfLifeDays: 30,
            },
          },
        },
      },
    },
  },
}

You can enable either feature independently:

  • MMR only -- many similar notes but age does not matter.
  • Temporal decay only -- recency matters but results are already diverse.
  • Both -- recommended for agents with large, long-running daily note histories.

Session memory search (experimental)

You can optionally index session transcripts and surface them via memory_search. This is gated behind an experimental flag:

{
  agents: {
    defaults: {
      memorySearch: {
        experimental: { sessionMemory: true },
        sources: ["memory", "sessions"],
      },
    },
  },
}

Session indexing is opt-in and runs asynchronously. Results can be slightly stale until background sync finishes. Session logs live on disk, so treat filesystem access as the trust boundary.

Troubleshooting

memory_search returns nothing?

  • Check openclaw memory status -- is the index populated?
  • Verify an embedding provider is configured and has a valid key.
  • Run openclaw memory index --force to trigger a full reindex.

Results are all keyword matches, no semantic results?

  • Embeddings may not be configured. Check openclaw memory status --deep.
  • If using local, ensure node-llama-cpp built successfully.

CJK text not found?

  • FTS5 trigram tokenization handles CJK. If results are missing, run openclaw memory index --force to rebuild the FTS index.

Further reading