mirror of https://github.com/openclaw/openclaw.git synced 2026-04-12 17:51:22 +00:00

Tak Hoffman b83726d13e Feat: Add Active Memory recall plugin (#63286)

2026-04-09 11:27:37 -05:00

title: Active Memory
summary: A plugin-owned blocking memory sub-agent that injects relevant memory into interactive chat sessions
read_when:
  - You want to understand what active memory is for
  - You want to turn active memory on for a conversational agent
  - You want to tune active memory behavior without enabling it everywhere

Active Memory

Active memory is an optional plugin-owned blocking memory sub-agent that runs before the main reply for eligible conversational sessions.

It exists because most memory systems are capable but reactive. They rely on the main agent to decide when to search memory, or on the user to say things like "remember this" or "search memory." By then, the moment where memory would have made the reply feel natural has already passed.

Active memory gives the system one bounded chance to surface relevant memory before the main reply is generated.

Paste This Into Your Agent

Paste this into your agent if you want it to enable Active Memory with a self-contained, safe-default setup:

{
  plugins: {
    entries: {
      "active-memory": {
        enabled: true,
        config: {
          enabled: true,
          agents: ["main"],
          allowedChatTypes: ["direct"],
          modelFallbackPolicy: "default-remote",
          queryMode: "recent",
          promptStyle: "balanced",
          timeoutMs: 15000,
          maxSummaryChars: 220,
          persistTranscripts: false,
          logging: true,
        },
      },
    },
  },
}

This turns the plugin on for the main agent, keeps it limited to direct-message style sessions by default, lets it inherit the current session model first, and still allows the built-in remote fallback if no explicit or inherited model is available.

After that, restart the gateway:

node scripts/run-node.mjs gateway --profile dev

To inspect it live in a conversation:

/verbose on

Turn active memory on

The safest setup is:

enable the plugin
target one conversational agent
keep logging on only while tuning

Start with this in openclaw.json:

{
  plugins: {
    entries: {
      "active-memory": {
        enabled: true,
        config: {
          agents: ["main"],
          allowedChatTypes: ["direct"],
          modelFallbackPolicy: "default-remote",
          queryMode: "recent",
          promptStyle: "balanced",
          timeoutMs: 15000,
          maxSummaryChars: 220,
          persistTranscripts: false,
          logging: true,
        },
      },
    },
  },
}

Then restart the gateway:

node scripts/run-node.mjs gateway --profile dev

What this means:

plugins.entries.active-memory.enabled: true turns the plugin on
config.agents: ["main"] opts only the main agent into active memory
config.allowedChatTypes: ["direct"] limits active memory to direct-message style sessions, which is also the default
if config.model is unset, active memory inherits the current session model first
config.modelFallbackPolicy: "default-remote" keeps the built-in remote fallback as the default when no explicit or inherited model is available
config.promptStyle: "balanced" uses the default general-purpose prompt style for recent mode
active memory still runs only on eligible interactive persistent chat sessions

How to see it

Active memory injects hidden system context for the model. It does not expose raw <active_memory_plugin>...</active_memory_plugin> tags to the client.

Session toggle

Use the plugin command when you want to pause or resume active memory for the current chat session without editing config:

/active-memory status
/active-memory off
/active-memory on

This is session-scoped. It does not change plugins.entries.active-memory.enabled, agent targeting, or other global configuration.

If you want the command to write config and pause or resume active memory for all sessions, use the explicit global form:

/active-memory status --global
/active-memory off --global
/active-memory on --global

The global form writes plugins.entries.active-memory.config.enabled. It leaves plugins.entries.active-memory.enabled on so the command remains available to turn active memory back on later.
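The difference between the session-scoped and global forms can be sketched as follows. This is a minimal illustration, not the plugin's actual code: `ToggleState`, `applyToggle`, `isActiveForSession`, and the per-session pause set are hypothetical names, and the only behavior taken from the text above is that the global form writes config.enabled and leaves the plugin entry's enabled flag alone.

```typescript
// Hypothetical state holder. Only the two config paths in the comments
// come from the documentation; the pause set is an assumption.
interface ToggleState {
  pluginEntryEnabled: boolean; // plugins.entries.active-memory.enabled
  configEnabled: boolean; // plugins.entries.active-memory.config.enabled
  pausedSessions: Set<string>; // assumed per-session pause list
}

function applyToggle(
  state: ToggleState,
  sessionId: string,
  action: "on" | "off",
  global: boolean,
): void {
  if (global) {
    // Global form: flip config.enabled only, so the command stays available.
    state.configEnabled = action === "on";
    return;
  }
  // Session form: touch only this session, never the config.
  if (action === "off") {
    state.pausedSessions.add(sessionId);
  } else {
    state.pausedSessions.delete(sessionId);
  }
}

function isActiveForSession(state: ToggleState, sessionId: string): boolean {
  return (
    state.pluginEntryEnabled &&
    state.configEnabled &&
    !state.pausedSessions.has(sessionId)
  );
}
```

Under these assumptions, /active-memory off pauses one session while every other session keeps running, and /active-memory off --global pauses all sessions without unloading the plugin.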

If you want to see what active memory is doing in a live session, turn verbose mode on for that session:

/verbose on

With verbose enabled, OpenClaw can show:

an active memory status line such as Active Memory: ok 842ms recent 34 chars
a readable debug summary such as Active Memory Debug: Lemon pepper wings with blue cheese.

Those lines are derived from the same active memory pass that feeds the hidden system context, but they are formatted for humans instead of exposing raw prompt markup.

By default, the blocking memory sub-agent transcript is temporary and deleted after the run completes.

Example flow:

/verbose on
what wings should i order?

Expected visible reply shape:

...normal assistant reply...

🧩 Active Memory: ok 842ms recent 34 chars
🔎 Active Memory Debug: Lemon pepper wings with blue cheese.

When it runs

Active memory uses two gates:

Config opt-in: the plugin must be enabled, and the current agent id must appear in plugins.entries.active-memory.config.agents.
Strict runtime eligibility: even when enabled and targeted, active memory only runs for eligible interactive persistent chat sessions.

The actual rule is:

plugin enabled
+
agent id targeted
+
allowed chat type
+
eligible interactive persistent chat session
=
active memory runs

If any one of those conditions fails, active memory does not run.
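The rule above can be sketched as a single predicate. This is an illustration under stated assumptions, not the plugin's implementation: the config field names mirror the documented keys, while `SessionInfo`, its `interactivePersistent` flag, and `shouldRunActiveMemory` are hypothetical.

```typescript
type ChatType = "direct" | "group" | "channel";

// Field names follow the documented config keys.
interface ActiveMemoryConfig {
  enabled?: boolean; // config.enabled, assumed to default to true
  agents: string[]; // config.agents
  allowedChatTypes?: ChatType[]; // config.allowedChatTypes
}

// Hypothetical session shape used only for this sketch.
interface SessionInfo {
  agentId: string;
  chatType: ChatType;
  interactivePersistent: boolean; // false for headless, heartbeat, and helper runs
}

function shouldRunActiveMemory(
  pluginEnabled: boolean,
  config: ActiveMemoryConfig,
  session: SessionInfo,
): boolean {
  // The documented default is direct-message style sessions only.
  const allowed: ChatType[] = config.allowedChatTypes ?? ["direct"];
  return (
    pluginEnabled &&
    config.enabled !== false &&
    config.agents.includes(session.agentId) &&
    allowed.includes(session.chatType) &&
    session.interactivePersistent
  );
}
```

Every condition is joined with a logical AND, so a single failing gate is enough to skip the memory pass.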

Session types

config.allowedChatTypes controls which kinds of conversations may run Active Memory at all.

The default is:

allowedChatTypes: ["direct"]

That means Active Memory runs by default in direct-message style sessions, but not in group or channel sessions unless you opt them in explicitly.

Examples:

allowedChatTypes: ["direct"]

allowedChatTypes: ["direct", "group"]

allowedChatTypes: ["direct", "group", "channel"]

Where it runs

Active memory is a conversational enrichment feature, not a platform-wide inference feature.

Surface	Runs active memory?
Control UI / web chat persistent sessions	Yes, if the plugin is enabled and the agent is targeted
Other interactive channel sessions on the same persistent chat path	Yes, if the plugin is enabled and the agent is targeted
Headless one-shot runs	No
Heartbeat/background runs	No
Generic internal `agent-command` paths	No
Sub-agent/internal helper execution	No

Why use it

Use active memory when:

the session is persistent and user-facing
the agent has meaningful long-term memory to search
continuity and personalization matter more than raw prompt determinism

It works especially well for:

stable preferences
recurring habits
long-term user context that should surface naturally

It is a poor fit for:

automation
internal workers
one-shot API tasks
places where hidden personalization would be surprising

How it works

The runtime shape is:

flowchart LR
  U["User Message"] --> Q["Build Memory Query"]
  Q --> R["Active Memory Blocking Memory Sub-Agent"]
  R -->|NONE or empty| M["Main Reply"]
  R -->|relevant summary| I["Append Hidden active_memory_plugin System Context"]
  I --> M["Main Reply"]

The blocking memory sub-agent can use only:

memory_search
memory_get

If the connection between the retrieved memory and the current conversation is weak, it should return NONE.
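The flow above can be sketched in a few lines. This is a simplified, synchronous illustration: `RecallFn` and `buildSystemContext` are hypothetical names, the real sub-agent call is asynchronous and bounded by a timeout, and the exact wrapper format of the hidden context is an assumption based on the active_memory_plugin tag name mentioned earlier.

```typescript
// Hypothetical stand-in for the blocking memory sub-agent, which per the
// text above may only call memory_search and memory_get.
type RecallFn = (query: string) => string;

function buildSystemContext(query: string, recall: RecallFn): string | null {
  const summary = recall(query).trim();
  // NONE or an empty result: the main reply proceeds with no injected context.
  if (summary === "" || summary === "NONE") {
    return null;
  }
  // Assumed wrapper format; the document only names the tag.
  return `<active_memory_plugin>${summary}</active_memory_plugin>`;
}
```

A `null` result corresponds to the "NONE or empty" edge in the diagram, and a string result corresponds to the "relevant summary" edge.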

Query modes

config.queryMode controls how much conversation the blocking memory sub-agent sees. It accepts message, recent, or full; each mode is described in its own section further down.

Prompt styles

config.promptStyle controls how eager or strict the blocking memory sub-agent is when deciding whether to return memory.

Available styles:

balanced: general-purpose default for recent mode
strict: least eager; best when you want very little bleed from nearby context
contextual: most continuity-friendly; best when conversation history should matter more
recall-heavy: more willing to surface memory on softer but still plausible matches
precision-heavy: aggressively prefers NONE unless the match is obvious
preference-only: optimized for favorites, habits, routines, taste, and recurring personal facts

Default mapping when config.promptStyle is unset:

message -> strict
recent -> balanced
full -> contextual

If you set config.promptStyle explicitly, that override wins.

Example:

promptStyle: "preference-only"
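The default mapping and the explicit override can be sketched as a small lookup. The type and function names here are hypothetical; the values come from the lists above.

```typescript
type QueryMode = "message" | "recent" | "full";
type PromptStyle =
  | "balanced"
  | "strict"
  | "contextual"
  | "recall-heavy"
  | "precision-heavy"
  | "preference-only";

// Defaults used when config.promptStyle is unset.
const DEFAULT_PROMPT_STYLE: Record<QueryMode, PromptStyle> = {
  message: "strict",
  recent: "balanced",
  full: "contextual",
};

function resolvePromptStyle(
  queryMode: QueryMode,
  promptStyle?: PromptStyle,
): PromptStyle {
  // An explicit config.promptStyle always wins over the per-mode default.
  return promptStyle ?? DEFAULT_PROMPT_STYLE[queryMode];
}
```

For example, `resolvePromptStyle("full")` yields `"contextual"`, while `resolvePromptStyle("full", "preference-only")` yields `"preference-only"`.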

Model fallback policy

Active Memory resolves the model for the blocking memory sub-agent in this order. When config.model is unset, the first step is skipped:

explicit plugin model
-> current session model
-> agent primary model
-> optional built-in remote fallback

config.modelFallbackPolicy controls the last step.

Default:

modelFallbackPolicy: "default-remote"

Other option:

modelFallbackPolicy: "resolved-only"

Use resolved-only if you want Active Memory to skip recall instead of falling back to the built-in remote default when no explicit or inherited model is available.
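The resolution order and the two policies can be sketched as follows. `ModelSources`, `resolveRecallModel`, and the placeholder fallback string are hypothetical; this document does not name the built-in remote default, so the sketch does not guess it.

```typescript
type ModelFallbackPolicy = "default-remote" | "resolved-only";

// Hypothetical container for the three documented model sources.
interface ModelSources {
  pluginModel?: string; // config.model
  sessionModel?: string; // current session model
  agentPrimaryModel?: string; // agent primary model
}

// Placeholder only: the actual built-in remote default is not named here.
const BUILT_IN_REMOTE_FALLBACK = "<built-in-remote-default>";

// Returns the model to use, or null when recall should be skipped.
function resolveRecallModel(
  sources: ModelSources,
  policy: ModelFallbackPolicy = "default-remote",
): string | null {
  const resolved =
    sources.pluginModel ?? sources.sessionModel ?? sources.agentPrimaryModel;
  if (resolved !== undefined) {
    return resolved;
  }
  // Only the last step depends on config.modelFallbackPolicy.
  return policy === "default-remote" ? BUILT_IN_REMOTE_FALLBACK : null;
}
```

With `resolved-only`, a `null` result means the memory pass is skipped for that turn rather than routed to the remote default.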

Advanced escape hatches

These options are intentionally not part of the recommended setup.

config.thinking can override the blocking memory sub-agent thinking level:

thinking: "medium"

Default:

thinking: "off"

Do not enable this by default. Active Memory runs in the reply path, so extra thinking time directly increases user-visible latency.

config.promptAppend adds extra operator instructions after the default Active Memory prompt and before the conversation context:

promptAppend: "Prefer stable long-term preferences over one-off events."

config.promptOverride replaces the default Active Memory prompt. OpenClaw still appends the conversation context afterward:

promptOverride: "You are a memory search agent. Return NONE or one compact user fact."

Prompt customization is not recommended unless you are deliberately testing a different recall contract. The default prompt is tuned to return either NONE or compact user-fact context for the main model.

`message`

Only the latest user message is sent.

Latest user message only

Use this when:

you want the fastest behavior
you want the strongest bias toward stable preference recall
follow-up turns do not need conversational context

Recommended timeout:

start around 3000 to 5000 ms

`recent`

The latest user message plus a small recent conversational tail is sent.

Recent conversation tail:
user: ...
assistant: ...
user: ...

Latest user message:
...

Use this when:

you want a better balance of speed and conversational grounding
follow-up questions often depend on the last few turns

Recommended timeout:

start around 15000 ms

`full`

The full conversation is sent to the blocking memory sub-agent.

Full conversation context:
user: ...
assistant: ...
user: ...
...

Use this when:

the strongest recall quality matters more than latency
the conversation contains important setup far back in the thread

Recommended timeout:

increase it substantially compared with message or recent
start around 15000 ms or higher depending on thread size

In general, timeout should increase with context size:

message < recent < full
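The three query modes can be sketched as one function that assembles the text sent to the blocking memory sub-agent. This is a minimal illustration: `Turn`, `formatTurns`, `buildMemoryQuery`, and the `tailTurns` parameter are hypothetical, the section labels copy the shapes shown above, and including the latest message as the final turn in full mode is an assumption.

```typescript
type QueryMode = "message" | "recent" | "full";

interface Turn {
  role: "user" | "assistant";
  text: string;
}

function formatTurns(turns: Turn[]): string {
  return turns.map((t) => `${t.role}: ${t.text}`).join("\n");
}

// `history` holds prior turns, oldest first, excluding the latest user message.
function buildMemoryQuery(
  mode: QueryMode,
  history: Turn[],
  latest: string,
  tailTurns: number = 4, // hypothetical size of the recent tail
): string {
  switch (mode) {
    case "message":
      // Only the latest user message is sent.
      return latest;
    case "recent": {
      // Guard against slice(-0), which would return the whole history.
      const tail = tailTurns > 0 ? history.slice(-tailTurns) : [];
      return `Recent conversation tail:\n${formatTurns(tail)}\n\nLatest user message:\n${latest}`;
    }
    case "full":
      return `Full conversation context:\n${formatTurns([
        ...history,
        { role: "user", text: latest },
      ])}`;
  }
}
```

The amount of text grows from message to recent to full, which is why the recommended timeout grows in the same order.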

Transcript persistence

Each blocking memory sub-agent run creates a real session.jsonl transcript for the duration of the call.

By default, that transcript is temporary:

it is written to a temp directory
it is used only for the blocking memory sub-agent run
it is deleted immediately after the run finishes

If you want to keep those blocking memory sub-agent transcripts on disk for debugging or inspection, turn persistence on explicitly:

{
  plugins: {
    entries: {
      "active-memory": {
        enabled: true,
        config: {
          agents: ["main"],
          persistTranscripts: true,
          transcriptDir: "active-memory",
        },
      },
    },
  },
}

When enabled, active memory stores transcripts in a separate directory under the target agent's sessions folder, not in the main user conversation transcript path.

The default layout is conceptually:

agents/<agent>/sessions/active-memory/<blocking-memory-sub-agent-session-id>.jsonl

You can change the relative subdirectory with config.transcriptDir.

Use this carefully:

blocking memory sub-agent transcripts can accumulate quickly on busy sessions
full query mode can duplicate a lot of conversation context
these transcripts contain hidden prompt context and recalled memories
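The persisted layout can be sketched as a path builder. This is an illustration only: `TranscriptConfig` and `persistedTranscriptPath` are hypothetical names, the path is relative and mirrors the conceptual layout shown above, and the temp-directory case is represented by `null` because its location is not documented here.

```typescript
interface TranscriptConfig {
  persistTranscripts?: boolean; // config.persistTranscripts, default false
  transcriptDir?: string; // config.transcriptDir, default "active-memory"
}

// Returns the relative transcript path when persistence is on, or null when
// the transcript lives in a temp directory and is deleted after the run.
function persistedTranscriptPath(
  agentId: string,
  runSessionId: string,
  config: TranscriptConfig,
): string | null {
  if (!config.persistTranscripts) {
    return null;
  }
  const dir = config.transcriptDir ?? "active-memory";
  return ["agents", agentId, "sessions", dir, `${runSessionId}.jsonl`].join("/");
}
```

Because the directory sits under the target agent's sessions folder, persisted transcripts stay separate from the main user conversation transcript.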

Configuration

All active memory configuration lives under:

plugins.entries.active-memory

The most important fields are:

Key	Type	Meaning
`enabled`	`boolean`	Enables the plugin itself
`config.agents`	`string[]`	Agent ids that may use active memory
`config.model`	`string`	Optional blocking memory sub-agent model ref; when unset, active memory inherits the current session model first and then follows the model fallback order
`config.queryMode`	`"message" \| "recent" \| "full"`	Controls how much conversation the blocking memory sub-agent sees
`config.promptStyle`	`"balanced" \| "strict" \| "contextual" \| "recall-heavy" \| "precision-heavy" \| "preference-only"`	Controls how eager or strict the blocking memory sub-agent is when deciding whether to return memory
`config.thinking`	`"off" \| "minimal" \| "low" \| "medium" \| "high" \| "xhigh" \| "adaptive"`	Advanced thinking override for the blocking memory sub-agent; default `off` for speed
`config.promptOverride`	`string`	Advanced full prompt replacement; not recommended for normal use
`config.promptAppend`	`string`	Advanced extra instructions appended to the default or overridden prompt
`config.timeoutMs`	`number`	Hard timeout for the blocking memory sub-agent
`config.maxSummaryChars`	`number`	Maximum total characters allowed in the active-memory summary
`config.logging`	`boolean`	Emits active memory logs while tuning
`config.persistTranscripts`	`boolean`	Keeps blocking memory sub-agent transcripts on disk instead of deleting temp files
`config.transcriptDir`	`string`	Relative blocking memory sub-agent transcript directory under the agent sessions folder

Useful tuning fields:

Key	Type	Meaning
`config.maxSummaryChars`	`number`	Maximum total characters allowed in the active-memory summary
`config.recentUserTurns`	`number`	Prior user turns to include when `queryMode` is `recent`
`config.recentAssistantTurns`	`number`	Prior assistant turns to include when `queryMode` is `recent`
`config.recentUserChars`	`number`	Max chars per recent user turn
`config.recentAssistantChars`	`number`	Max chars per recent assistant turn
`config.cacheTtlMs`	`number`	Cache reuse for repeated identical queries
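The cache field can be illustrated with a small time-to-live cache keyed by the exact query text. This is a sketch under assumptions: `RecallCache` is a hypothetical class, and keying on the full query string is a guess at what "repeated identical queries" means in practice.

```typescript
interface CacheEntry {
  summary: string;
  expiresAt: number;
}

class RecallCache {
  private entries = new Map<string, CacheEntry>();
  private ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs; // corresponds to config.cacheTtlMs
  }

  // `now` is passed in (for example Date.now()) to keep the sketch testable.
  get(query: string, now: number): string | undefined {
    const hit = this.entries.get(query);
    if (hit === undefined) {
      return undefined;
    }
    if (now >= hit.expiresAt) {
      // Expired: drop the entry so the next call runs a fresh recall.
      this.entries.delete(query);
      return undefined;
    }
    return hit.summary;
  }

  set(query: string, summary: string, now: number): void {
    this.entries.set(query, { summary, expiresAt: now + this.ttlMs });
  }
}
```

A cache hit skips the blocking memory sub-agent call entirely, so a longer TTL trades freshness for lower reply latency.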

Recommended setup

Start with recent.

{
  plugins: {
    entries: {
      "active-memory": {
        enabled: true,
        config: {
          agents: ["main"],
          queryMode: "recent",
          promptStyle: "balanced",
          timeoutMs: 15000,
          maxSummaryChars: 220,
          logging: true,
        },
      },
    },
  },
}

If you want to inspect live behavior while tuning, use /verbose on in the session instead of looking for a separate active-memory debug command.

Then move to:

message if you want lower latency
full if you decide extra context is worth the slower blocking memory sub-agent

Debugging

If active memory is not showing up where you expect:

Confirm the plugin is enabled under plugins.entries.active-memory.enabled.
Confirm the current agent id is listed in config.agents.
Confirm you are testing through an interactive persistent chat session.
Turn on config.logging: true and watch the gateway logs.
Verify memory search itself works with openclaw memory status --deep.

If memory hits are noisy, tighten:

maxSummaryChars
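To show what tightening maxSummaryChars does, here is a minimal sketch of a length cap. `trimSummary` is a hypothetical helper, and cutting at a word boundary rather than mid-word is an assumption about the behavior, not a documented guarantee.

```typescript
// Caps a summary at maxChars, backing up to the previous space so the
// result does not end in the middle of a word.
function trimSummary(summary: string, maxChars: number): string {
  const text = summary.trim();
  if (text.length <= maxChars) {
    return text;
  }
  const cut = text.slice(0, maxChars);
  // If the next character is whitespace, the cut already ends on a word.
  if (/\s/.test(text.charAt(maxChars))) {
    return cut.trim();
  }
  const lastSpace = cut.lastIndexOf(" ");
  // With no space to back up to, fall back to the hard cut.
  return (lastSpace > 0 ? cut.slice(0, lastSpace) : cut).trim();
}
```

A smaller cap keeps the injected context short, which limits how much loosely related memory can reach the main model.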

If active memory is too slow:

switch queryMode to a lighter mode (full to recent, or recent to message)
lower timeoutMs
reduce recent turn counts
reduce per-turn char caps
