mirror of
https://github.com/openclaw/openclaw.git
synced 2026-05-06 05:20:43 +00:00
fix(docs): add active memory speed recommendations
This commit is contained in:
@@ -116,6 +116,91 @@ What this means:
- `config.promptStyle: "balanced"` uses the default general-purpose prompt style for `recent` mode
- active memory still runs only on eligible interactive persistent chat sessions
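A minimal sketch of the option above, assuming `promptStyle` sits under the plugin's `config` block as the `config.promptStyle` notation suggests:

```json5
plugins: {
  entries: {
    "active-memory": {
      enabled: true,
      config: {
        // "balanced" is the default general-purpose prompt style for `recent` mode
        promptStyle: "balanced",
      },
    },
  },
}
```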
## Speed recommendations
The simplest setup is to leave `config.model` unset and let Active Memory use the same model you already use for normal replies. That is the safest default because it follows your existing provider, auth, and model preferences.
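Concretely, the inherit-the-session-model default is just the plugin enabled with no `model` key (a sketch, not a required shape):

```json5
plugins: {
  entries: {
    "active-memory": {
      enabled: true,
      // no config.model: Active Memory follows the current session model
    },
  },
}
```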
If you want Active Memory to feel faster, use a dedicated inference model instead of borrowing the main chat model.
Example fast-provider setup:

```json5
models: {
  providers: {
    cerebras: {
      baseUrl: "https://api.cerebras.ai/v1",
      apiKey: "${CEREBRAS_API_KEY}",
      api: "openai-completions",
      models: [{ id: "gpt-oss-120b", name: "GPT OSS 120B (Cerebras)" }],
    },
  },
},
plugins: {
  entries: {
    "active-memory": {
      enabled: true,
      config: {
        model: "cerebras/gpt-oss-120b",
      },
    },
  },
}
```
Fast-model options worth considering:

- `cerebras/gpt-oss-120b` for a fast dedicated recall model with a narrow tool surface
- your normal session model, by leaving `config.model` unset
- a low-latency fallback model such as `google/gemini-3-flash` when you want a separate recall model without changing your primary chat model
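For the fallback option, only the plugin's `config.model` changes (a sketch; `google/gemini-3-flash` stands in for whatever low-latency model your providers expose):

```json5
plugins: {
  entries: {
    "active-memory": {
      enabled: true,
      config: {
        // any provider/model pair already defined under models.providers works here
        model: "google/gemini-3-flash",
      },
    },
  },
}
```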
Why Cerebras is a strong speed-oriented option for Active Memory:

- the Active Memory tool surface is narrow: it only calls `memory_search` and `memory_get`
- recall quality matters, but latency matters more than for the main answer path
- a dedicated fast provider avoids tying memory recall latency to your primary chat provider
If you do not want a separate speed-optimized model, leave `config.model` unset and let Active Memory inherit the current session model.
### Cerebras setup

Add a provider entry like this:
```json5
models: {
  providers: {
    cerebras: {
      baseUrl: "https://api.cerebras.ai/v1",
      apiKey: "${CEREBRAS_API_KEY}",
      api: "openai-completions",
      models: [{ id: "gpt-oss-120b", name: "GPT OSS 120B (Cerebras)" }],
    },
  },
}
```
Then point Active Memory at it:
```json5
plugins: {
  entries: {
    "active-memory": {
      enabled: true,
      config: {
        model: "cerebras/gpt-oss-120b",
      },
    },
  },
}
```
Caveat:

- make sure the Cerebras API key actually has model access for the model you choose, because `/v1/models` visibility alone does not guarantee `chat/completions` access
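One way to check (a sketch against the OpenAI-compatible endpoints the provider config above assumes; requires a real `CEREBRAS_API_KEY` and network access):

```shell
# Listing models only proves the key can *see* the model:
curl -s https://api.cerebras.ai/v1/models \
  -H "Authorization: Bearer $CEREBRAS_API_KEY"

# A minimal completion proves actual chat/completions access for the chosen model:
curl -s https://api.cerebras.ai/v1/chat/completions \
  -H "Authorization: Bearer $CEREBRAS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-oss-120b","messages":[{"role":"user","content":"ping"}],"max_tokens":1}'
```

If the second call returns an authorization or model-access error while the first succeeds, the key needs model access granted before Active Memory can use it.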
## How to see it

Active memory injects a hidden untrusted prompt prefix for the model. It does