fix(docs): add active memory speed recommendations

This commit is contained in:
Tak Hoffman
2026-04-16 00:55:04 -05:00
parent 6f5459364a
commit b10ae0bf13


@@ -116,6 +116,91 @@ What this means:
- `config.promptStyle: "balanced"` uses the default general-purpose prompt style for `recent` mode
- active memory still runs only on eligible interactive persistent chat sessions

## Speed recommendations

The simplest setup is to leave `config.model` unset and let Active Memory use
the same model you already use for normal replies. That is the safest default
because it follows your existing provider, auth, and model preferences.

If you want Active Memory to feel faster, use a dedicated inference model
instead of borrowing the main chat model.

Example fast-provider setup:
```json5
models: {
  providers: {
    cerebras: {
      baseUrl: "https://api.cerebras.ai/v1",
      apiKey: "${CEREBRAS_API_KEY}",
      api: "openai-completions",
      models: [{ id: "gpt-oss-120b", name: "GPT OSS 120B (Cerebras)" }],
    },
  },
},
plugins: {
  entries: {
    "active-memory": {
      enabled: true,
      config: {
        model: "cerebras/gpt-oss-120b",
      },
    },
  },
}
```

Fast-model options worth considering:

- `cerebras/gpt-oss-120b` for a fast dedicated recall model with a narrow tool surface
- your normal session model, by leaving `config.model` unset
- a low-latency fallback model such as `google/gemini-3-flash` when you want a separate recall model without changing your primary chat model

Why Cerebras is a strong speed-oriented option for Active Memory:
- the Active Memory tool surface is narrow: it only calls `memory_search` and `memory_get`
- recall quality matters, but latency matters more here than it does on the main answer path
- a dedicated fast provider avoids tying memory recall latency to your primary chat provider

If you do not want a separate speed-optimized model, leave `config.model` unset
and let Active Memory inherit the current session model.
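
A minimal sketch of that default, mirroring the plugin keys used above: only `enabled` is set, and `config.model` is omitted so recall follows the session model.

```json5
plugins: {
  entries: {
    "active-memory": {
      enabled: true,
      // no config.model: recall inherits the current session model
    },
  },
}
```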

### Cerebras setup

Add a provider entry like this:
```json5
models: {
  providers: {
    cerebras: {
      baseUrl: "https://api.cerebras.ai/v1",
      apiKey: "${CEREBRAS_API_KEY}",
      api: "openai-completions",
      models: [{ id: "gpt-oss-120b", name: "GPT OSS 120B (Cerebras)" }],
    },
  },
}
```

Then point Active Memory at it:

```json5
plugins: {
  entries: {
    "active-memory": {
      enabled: true,
      config: {
        model: "cerebras/gpt-oss-120b",
      },
    },
  },
}
```

Caveat:

- make sure the Cerebras API key actually has model access for the model you choose, because `/v1/models` visibility alone does not guarantee `chat/completions` access
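
One way to sanity-check this from the command line, assuming the standard OpenAI-compatible endpoints shown above (this is a sketch; swap in whatever model id you configured):

```shell
# Lists models the key can see. Visibility here does not prove access.
curl -s https://api.cerebras.ai/v1/models \
  -H "Authorization: Bearer $CEREBRAS_API_KEY"

# Confirms the key can actually run completions against that model.
curl -s https://api.cerebras.ai/v1/chat/completions \
  -H "Authorization: Bearer $CEREBRAS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-oss-120b", "messages": [{"role": "user", "content": "ping"}]}'
```

If the second call returns an authorization or model-access error while the first lists the model, the key needs model access granted before Active Memory can use it.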

## How to see it

Active memory injects a hidden untrusted prompt prefix for the model. It does