mirror of
https://github.com/openclaw/openclaw.git
synced 2026-05-07 11:10:42 +00:00
124 lines
6.5 KiB
Markdown
124 lines
6.5 KiB
Markdown
---
|
|
summary: "Auto-reply queue modes, defaults, and per-session overrides"
|
|
read_when:
|
|
- Changing auto-reply execution or concurrency
|
|
- Explaining /queue modes or message steering behavior
|
|
title: "Command queue"
|
|
---
|
|
|
|
We serialize inbound auto-reply runs (all channels) through a tiny in-process queue to prevent multiple agent runs from colliding, while still allowing safe parallelism across sessions.
|
|
|
|
## Why
|
|
|
|
- Auto-reply runs can be expensive (LLM calls) and can collide when multiple inbound messages arrive close together.
|
|
- Serializing avoids competing for shared resources (session files, logs, CLI stdin) and reduces the chance of upstream rate limits.
|
|
|
|
## How it works
|
|
|
|
- A lane-aware FIFO queue drains each lane with a configurable concurrency cap (default 1 for unconfigured lanes; main defaults to 4, subagent to 8).
|
|
- `runEmbeddedPiAgent` enqueues by **session key** (lane `session:<key>`) to guarantee only one active run per session.
|
|
- Each session run is then queued into a **global lane** (`main` by default) so overall parallelism is capped by `agents.defaults.maxConcurrent`.
|
|
- When verbose logging is enabled, queued runs emit a short notice if they waited more than ~2s before starting.
|
|
- Typing indicators still fire immediately on enqueue (when supported by the channel) so user experience is unchanged while we wait our turn.
|
|
|
|
## Defaults
|
|
|
|
When unset, all inbound channel surfaces use:
|
|
|
|
- `mode: "steer"`
|
|
- `debounceMs: 500`
|
|
- `cap: 20`
|
|
- `drop: "summarize"`
|
|
|
|
`steer` is the default because it keeps the active model turn responsive without
|
|
starting a second session run. It drains all steering messages that arrived
|
|
before the next model boundary. If the current run cannot accept steering,
|
|
OpenClaw falls back to a followup queue entry.
|
|
|
|
## Queue modes
|
|
|
|
Inbound messages can steer the current run, wait for a followup turn, or do both:
|
|
|
|
- `steer`: queue steering messages into the active runtime. Pi delivers all pending steering messages **after the current assistant turn finishes executing its tool calls**, before the next LLM call; Codex app-server receives one batched `turn/steer`. If the run is not actively streaming or steering is unavailable, OpenClaw falls back to a followup queue entry.
|
|
- `queue` (legacy): old one-at-a-time steering. Pi delivers one queued steering message at each model boundary; Codex app-server receives separate `turn/steer` requests. Prefer `steer` unless you need the previous serialized behavior.
|
|
- `followup`: enqueue each message for a later agent turn after the current run ends.
|
|
- `collect`: coalesce queued messages into a **single** followup turn after the quiet window. If messages target different channels/threads, they drain individually to preserve routing.
|
|
- `steer-backlog` (aka `steer+backlog`): steer now **and** preserve the same message for a followup turn.
|
|
- `interrupt` (legacy): abort the active run for that session, then run the newest message.
|
|
|
|
Steer-backlog means you can get a followup response after the steered run, so
|
|
streaming surfaces can look like duplicates. Prefer `collect`/`steer` if you want
|
|
one response per inbound message.
|
|
|
|
For runtime-specific timing and dependency behavior, see
|
|
[Steering queue](/concepts/queue-steering).
|
|
|
|
Configure globally or per channel via `messages.queue`:
|
|
|
|
```json5
|
|
{
|
|
messages: {
|
|
queue: {
|
|
mode: "steer",
|
|
debounceMs: 500,
|
|
cap: 20,
|
|
drop: "summarize",
|
|
byChannel: { discord: "collect" },
|
|
},
|
|
},
|
|
}
|
|
```
|
|
|
|
## Queue options
|
|
|
|
Options apply to `followup`, `collect`, and `steer-backlog` (and to `steer` or legacy `queue` when steering falls back to followup):
|
|
|
|
- `debounceMs`: quiet window before draining queued followups. Bare numbers are milliseconds; units `ms`, `s`, `m`, `h`, and `d` are accepted by `/queue` options.
|
|
- `cap`: max queued messages per session. Values below `1` are ignored.
|
|
- `drop: "summarize"`: default. Drop the oldest queued entries as needed, keep compact summaries, and inject them as a synthetic followup prompt.
|
|
- `drop: "old"`: drop the oldest queued entries as needed, without preserving summaries.
|
|
- `drop: "new"`: reject the newest message when the queue is already full.
|
|
|
|
Defaults: `debounceMs: 500`, `cap: 20`, `drop: summarize`.
|
|
|
|
## Precedence
|
|
|
|
For mode selection, OpenClaw resolves:
|
|
|
|
1. Inline or stored per-session `/queue` override.
|
|
2. `messages.queue.byChannel.<channel>`.
|
|
3. `messages.queue.mode`.
|
|
4. Default `steer`.
|
|
|
|
For options, inline or stored `/queue` options win over config. Then
|
|
channel-specific debounce (`messages.queue.debounceMsByChannel`), plugin
|
|
debounce defaults, global `messages.queue` options, and built-in defaults are
|
|
applied. `cap` and `drop` are global/session options, not per-channel config
|
|
keys.
|
|
|
|
## Per-session overrides
|
|
|
|
- Send `/queue <mode>` as a standalone command to store the mode for the current session.
|
|
- Options can be combined: `/queue collect debounce:0.5s cap:25 drop:summarize`
|
|
- `/queue default` or `/queue reset` clears the session override.
|
|
|
|
## Scope and guarantees
|
|
|
|
- Applies to auto-reply agent runs across all inbound channels that use the gateway reply pipeline (WhatsApp web, Telegram, Slack, Discord, Signal, iMessage, webchat, etc.).
|
|
- Default lane (`main`) is process-wide for inbound + main heartbeats; set `agents.defaults.maxConcurrent` to allow multiple sessions in parallel.
|
|
- Additional lanes may exist (e.g. `cron`, `cron-nested`, `nested`, `subagent`) so background jobs can run in parallel without blocking inbound replies. Isolated cron agent turns hold a `cron` slot while their inner agent execution uses `cron-nested`; both use `cron.maxConcurrentRuns`. Shared non-cron `nested` flows keep their own lane behavior. These detached runs are tracked as [background tasks](/automation/tasks).
|
|
- Per-session lanes guarantee that only one agent run touches a given session at a time.
|
|
- No external dependencies or background worker threads; pure TypeScript + promises.
|
|
|
|
## Troubleshooting
|
|
|
|
- If commands seem stuck, enable verbose logs and look for “queued for …ms” lines to confirm the queue is draining.
|
|
- If you need queue depth, enable verbose logs and watch for queue timing lines.
|
|
- When diagnostics are enabled, sessions that remain in `processing` past `diagnostics.stuckSessionWarnMs` log a stuck-session warning. Active embedded runs, active reply operations, and active lane tasks remain warning-only by default; stale startup bookkeeping with no active session work can release the affected session lane so queued work drains.
|
|
|
|
## Related
|
|
|
|
- [Session management](/concepts/session)
|
|
- [Steering queue](/concepts/queue-steering)
|
|
- [Retry policy](/concepts/retry)
|