openclaw/docs/concepts/streaming.md
Gustavo Madeira Santana 82c06e5604 Docs: clarify Slack streaming thread behavior
Clarify the canonical Slack streaming config keys and legacy migration notes
across the Slack docs and shared streaming concept docs.

Document that native Slack streaming and assistant thread status require a
reply thread, and call out the top-level DM fallback behavior.
2026-04-08 00:58:00 -04:00


summary: Streaming + chunking behavior (block replies, channel preview streaming, mode mapping)
read_when:
  • Explaining how streaming or chunking works on channels
  • Changing block streaming or channel chunking behavior
  • Debugging duplicate/early block replies or channel preview streaming
title: Streaming and Chunking

Streaming + chunking

OpenClaw has two separate streaming layers:

  • Block streaming (channels): emit completed blocks as the assistant writes. These are normal channel messages (not token deltas).
  • Preview streaming (Telegram/Discord/Slack): update a temporary preview message while generating.

There is no true token-delta streaming to channel messages today. Preview streaming is message-based (send + edits/appends).

Block streaming (channel messages)

Block streaming sends assistant output in coarse chunks as it becomes available.

Model output
  └─ text_delta/events
       ├─ (blockStreamingBreak=text_end)
       │    └─ chunker emits blocks as buffer grows
       └─ (blockStreamingBreak=message_end)
            └─ chunker flushes at message_end
                   └─ channel send (block replies)

Legend:

  • text_delta/events: model stream events (may be sparse for non-streaming models).
  • chunker: EmbeddedBlockChunker applying min/max bounds + break preference.
  • channel send: actual outbound messages (block replies).

Controls:

  • agents.defaults.blockStreamingDefault: "on"/"off" (default off).
  • Channel overrides: *.blockStreaming (and per-account variants) to force "on"/"off" per channel.
  • agents.defaults.blockStreamingBreak: "text_end" or "message_end".
  • agents.defaults.blockStreamingChunk: { minChars, maxChars, breakPreference? }.
  • agents.defaults.blockStreamingCoalesce: { minChars?, maxChars?, idleMs? } (merge streamed blocks before send).
  • Channel hard cap: *.textChunkLimit (e.g., channels.whatsapp.textChunkLimit).
  • Channel chunk mode: *.chunkMode: "length" (default) or "newline", which splits on blank lines (paragraph boundaries) before length chunking.
  • Discord soft cap: channels.discord.maxLinesPerMessage (default 17) splits tall replies to avoid UI clipping.
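
For orientation, the controls above might sit together in a config roughly as follows. This is only a sketch, written as a TypeScript object literal: the key paths come from the list above, every value is an arbitrary example chosen for illustration, and the channels.discord.blockStreaming entry assumes the boolean form of the channel override.

```typescript
// Illustrative config shape. Key paths follow the controls listed above;
// all values are example assumptions, not documented defaults
// (except maxLinesPerMessage: 17, which the list gives as the default).
const exampleStreamingConfig = {
  agents: {
    defaults: {
      blockStreamingDefault: "on",
      blockStreamingBreak: "text_end",
      blockStreamingChunk: { minChars: 200, maxChars: 1200, breakPreference: "paragraph" },
      blockStreamingCoalesce: { minChars: 400, maxChars: 1200, idleMs: 1000 },
    },
  },
  channels: {
    whatsapp: { textChunkLimit: 4000, chunkMode: "newline" },
    discord: { blockStreaming: true, maxLinesPerMessage: 17 },
  },
};
```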

Boundary semantics:

  • text_end: stream blocks as soon as chunker emits; flush on each text_end.
  • message_end: wait until assistant message finishes, then flush buffered output.

message_end still uses the chunker if the buffered text exceeds maxChars, so it can emit multiple chunks at the end.
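
The difference between the two boundaries can be shown with a small simulation. This is a sketch under stated assumptions, not the actual implementation: ModelEvent, splitByMaxChars, and simulateBlockStreaming are hypothetical names, and the fixed-size splitter stands in for the real chunker, which also honors minChars and break preferences.

```typescript
type BreakMode = "text_end" | "message_end";
type ModelEvent =
  | { type: "text_delta"; text: string }
  | { type: "text_end" }
  | { type: "message_end" };
interface SentBlock {
  eventIndex: number; // index of the event that triggered the send
  block: string;
}

// Hypothetical stand-in for the chunker: fixed-size pieces of at most maxChars.
function splitByMaxChars(text: string, maxChars: number): string[] {
  const pieces: string[] = [];
  for (let i = 0; i < text.length; i += maxChars) pieces.push(text.slice(i, i + maxChars));
  return pieces;
}

// maxChars must be positive.
function simulateBlockStreaming(events: ModelEvent[], mode: BreakMode, maxChars: number): SentBlock[] {
  const sent: SentBlock[] = [];
  let buffer = "";
  events.forEach((ev, eventIndex) => {
    if (ev.type === "text_delta") {
      buffer += ev.text;
      // text_end mode: emit full blocks while the buffer is still growing.
      while (mode === "text_end" && buffer.length >= maxChars) {
        sent.push({ eventIndex, block: buffer.slice(0, maxChars) });
        buffer = buffer.slice(maxChars);
      }
      return;
    }
    // text_end mode flushes on every text_end; both modes flush at message_end.
    if (ev.type === "message_end" || mode === "text_end") {
      for (const block of splitByMaxChars(buffer, maxChars)) sent.push({ eventIndex, block });
      buffer = "";
    }
  });
  return sent;
}
```

With the same event sequence, text_end mode produces sends spread across several events, while message_end mode produces all of its sends at the final event.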

Chunking algorithm (low/high bounds)

Block chunking is implemented by EmbeddedBlockChunker:

  • Low bound: don't emit until buffer >= minChars (unless forced).
  • High bound: prefer splits before maxChars; if forced, split at maxChars.
  • Break preference: paragraph → newline → sentence → whitespace → hard break.
  • Code fences: never split inside fences; when forced at maxChars, close + reopen the fence to keep Markdown valid.

maxChars is clamped to the channel textChunkLimit, so you can't exceed per-channel caps.
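
A minimal sketch of the low/high bound logic, assuming a simplified reading of the rules above. It is not EmbeddedBlockChunker itself: findCut, drainChunks, and clampBounds are hypothetical names, ". " is used as a rough sentence boundary, and code-fence handling is left out entirely.

```typescript
interface ChunkBounds {
  minChars: number;
  maxChars: number;
}

// Break candidates in preference order: paragraph, newline, sentence, whitespace.
// Treating ". " as a sentence end is a simplification.
const BREAK_SEPARATORS: string[] = ["\n\n", "\n", ". ", " "];

// maxChars never exceeds the channel cap (textChunkLimit).
function clampBounds(bounds: ChunkBounds, textChunkLimit: number): ChunkBounds {
  return { minChars: bounds.minChars, maxChars: Math.min(bounds.maxChars, textChunkLimit) };
}

// Returns the index at which to cut the buffer, or -1 to keep buffering.
function findCut(buffer: string, bounds: ChunkBounds): number {
  if (buffer.length < bounds.minChars) return -1; // low bound not reached yet
  const head = buffer.slice(0, bounds.maxChars);
  for (const sep of BREAK_SEPARATORS) {
    const at = head.lastIndexOf(sep);
    const cut = at + sep.length; // the separator stays with the preceding chunk
    if (at !== -1 && cut >= bounds.minChars) return cut;
  }
  // No natural break: hard break only once the high bound is reached.
  return buffer.length >= bounds.maxChars ? bounds.maxChars : -1;
}

// Emits every chunk that is ready and returns the text still being buffered.
function drainChunks(buffer: string, bounds: ChunkBounds): { chunks: string[]; rest: string } {
  const chunks: string[] = [];
  let rest = buffer;
  for (let cut = findCut(rest, bounds); cut > 0; cut = findCut(rest, bounds)) {
    chunks.push(rest.slice(0, cut));
    rest = rest.slice(cut);
  }
  return { chunks, rest };
}
```

The sketch is lossless: joining the emitted chunks and the remaining buffer reproduces the input, so a final forced flush only has to send whatever is left in rest.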

Coalescing (merge streamed blocks)

When block streaming is enabled, OpenClaw can merge consecutive block chunks before sending them out. This reduces “single-line spam” while still providing progressive output.

  • Coalescing waits for idle gaps (idleMs) before flushing.
  • Buffers are capped by maxChars and will flush if they exceed it.
  • minChars prevents tiny fragments from sending until enough text accumulates (final flush always sends remaining text).
  • Joiner is derived from blockStreamingChunk.breakPreference (paragraph → \n\n, newline → \n, sentence → space).
  • Channel overrides are available via *.blockStreamingCoalesce (including per-account configs).
  • Default coalesce minChars is bumped to 1500 for Signal/Slack/Discord unless overridden.
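
The coalescing rules can be sketched with an explicit clock so the behavior is easy to follow. This is an illustration under assumptions, not the real implementation: createCoalescer and its push/tick/end methods are hypothetical, timers are replaced by manual tick calls, and the choice to flush the existing buffer before adding a block that would exceed maxChars is one plausible reading of the cap.

```typescript
interface CoalesceSettings {
  minChars: number;
  maxChars: number;
  idleMs: number;
  joiner: string; // e.g. "\n\n" when breakPreference is "paragraph"
}

function createCoalescer(settings: CoalesceSettings) {
  let parts: string[] = [];
  let lastPushAt = 0;

  // Buffered text, optionally with extra blocks appended for a size check.
  const joined = (extra: string[] = []): string => [...parts, ...extra].join(settings.joiner);

  const flush = (): string[] => {
    if (parts.length === 0) return [];
    const out = joined();
    parts = [];
    return [out];
  };

  return {
    // Add a block at time `now` (ms). Returns messages that must be sent right away.
    push(block: string, now: number): string[] {
      // Assumption: flush first when adding the block would exceed maxChars.
      const out = joined([block]).length > settings.maxChars ? flush() : [];
      parts.push(block);
      lastPushAt = now;
      return out;
    },
    // Idle check: flush after an idle gap, but only once minChars has accumulated.
    tick(now: number): string[] {
      const idle = now - lastPushAt >= settings.idleMs;
      return idle && joined().length >= settings.minChars ? flush() : [];
    },
    // Final flush always sends whatever remains.
    end(): string[] {
      return flush();
    },
  };
}
```

In this sketch a short fragment stays buffered across idle gaps until later blocks push the total past minChars, while end sends the remainder regardless of size.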

Human-like pacing between blocks

When block streaming is enabled, you can add a randomized pause between block replies (after the first block). This makes multi-bubble responses feel more natural.

  • Config: agents.defaults.humanDelay (override per agent via agents.list[].humanDelay).
  • Modes: off (default), natural (800-2500ms), custom (minMs/maxMs).
  • Applies only to block replies, not final replies or tool summaries.
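
As a rough sketch, the pause for a given block could be chosen as shown here. The HumanDelaySetting shape (a mode field alongside minMs/maxMs) and the pickDelayMs helper are assumptions made for illustration; only the mode names and the 800-2500ms natural range come from the list above.

```typescript
// Hypothetical config shape; the `mode` field name is an assumption.
type HumanDelaySetting =
  | { mode: "off" }
  | { mode: "natural" }
  | { mode: "custom"; minMs: number; maxMs: number };

// Returns the pause in milliseconds before sending block number `blockIndex`.
// `rand` is injectable so the result can be made deterministic in tests.
function pickDelayMs(setting: HumanDelaySetting, blockIndex: number, rand: () => number = Math.random): number {
  // No pause when disabled, and none before the first block.
  if (setting.mode === "off" || blockIndex === 0) return 0;
  let minMs: number;
  let maxMs: number;
  if (setting.mode === "natural") {
    minMs = 800;
    maxMs = 2500;
  } else {
    minMs = setting.minMs;
    maxMs = setting.maxMs;
  }
  return Math.round(minMs + rand() * (maxMs - minMs));
}
```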

"Stream chunks or everything"

This maps to:

  • Stream chunks: blockStreamingDefault: "on" + blockStreamingBreak: "text_end" (emit as you go). Non-Telegram channels also need *.blockStreaming: true.
  • Stream everything at end: blockStreamingBreak: "message_end" (flush once, possibly multiple chunks if very long).
  • No block streaming: blockStreamingDefault: "off" (only final reply).

Channel note: Block streaming is off unless *.blockStreaming is explicitly set to true. Channels can stream a live preview (channels.<channel>.streaming) without block replies.

Config location reminder: the blockStreaming* defaults live under agents.defaults, not the root config.

Preview streaming modes

Canonical key: channels.<channel>.streaming

Modes:

  • off: disable preview streaming.
  • partial: single preview that is replaced with latest text.
  • block: preview updates in chunked/appended steps.
  • progress: progress/status preview during generation, final answer at completion.

Channel mapping

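| Channel  | off | partial | block | progress        |
| -------- | --- | ------- | ----- | --------------- |
| Telegram | yes | yes     | yes   | maps to partial |
| Discord  | yes | yes     | yes   | maps to partial |
| Slack    | yes | yes     | yes   | yes             |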
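
A small sketch of resolving the effective mode per channel, reading the "maps to partial" entries as applying to progress mode on Telegram and Discord. That reading, along with the names PreviewChannel and effectivePreviewMode, is an assumption for illustration.

```typescript
type PreviewChannel = "telegram" | "discord" | "slack";
type PreviewModeName = "off" | "partial" | "block" | "progress";

function effectivePreviewMode(channel: PreviewChannel, requested: PreviewModeName): PreviewModeName {
  // Assumption: only Slack has a distinct progress preview;
  // Telegram and Discord fall back to partial.
  if (requested === "progress" && channel !== "slack") return "partial";
  return requested;
}
```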

Slack-only:

  • channels.slack.streaming.nativeTransport toggles Slack native streaming API calls when channels.slack.streaming.mode="partial" (default: true).
  • Slack native streaming and Slack assistant thread status require a reply thread target; top-level DMs do not show that thread-style preview.
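
The two Slack conditions above combine into a simple eligibility check. This is a sketch only: SlackStreamingSettings and canUseSlackNativeStreaming are hypothetical names, and hasReplyThread stands for whatever signal indicates that the reply targets a thread.

```typescript
interface SlackStreamingSettings {
  mode: "off" | "partial" | "block" | "progress";
  nativeTransport?: boolean; // treated as true when omitted
}

function canUseSlackNativeStreaming(settings: SlackStreamingSettings, hasReplyThread: boolean): boolean {
  const nativeTransport = settings.nativeTransport ?? true;
  // Native streaming needs partial mode, the transport toggle, and a reply thread target.
  return settings.mode === "partial" && nativeTransport && hasReplyThread;
}
```

For a top-level DM, hasReplyThread would be false, so the check fails even with mode "partial" and nativeTransport left at its default.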

Legacy key migration:

  • Telegram: streamMode + boolean streaming auto-migrate to streaming enum.
  • Discord: streamMode + boolean streaming auto-migrate to streaming enum.
  • Slack: streamMode auto-migrates to streaming.mode; boolean streaming auto-migrates to streaming.mode plus streaming.nativeTransport; legacy nativeStreaming auto-migrates to streaming.nativeTransport.

Runtime behavior

Telegram:

  • Uses sendMessage + editMessageText preview updates across DMs and group/topics.
  • Preview streaming is skipped when Telegram block streaming is explicitly enabled (to avoid double-streaming).
  • /reasoning stream can write reasoning to preview.

Discord:

  • Uses send + edit preview messages.
  • block mode uses draft chunking (draftChunk).
  • Preview streaming is skipped when Discord block streaming is explicitly enabled.
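
The skip rule shared by Telegram and Discord can be sketched as a single predicate. The names are hypothetical, and blockStreamingOverride stands for the channel-level *.blockStreaming value, with undefined meaning it was not set explicitly.

```typescript
type PreviewSetting = "off" | "partial" | "block" | "progress";

function previewStreamingActive(preview: PreviewSetting, blockStreamingOverride: boolean | undefined): boolean {
  if (preview === "off") return false;
  // Explicitly enabled block streaming wins, which avoids double-streaming.
  return blockStreamingOverride !== true;
}
```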

Slack:

  • partial can use Slack native streaming (chat.startStream/append/stop) when available.
  • block uses append-style draft previews.
  • progress uses status preview text, then final answer.

Related:

  • Messages: message lifecycle and delivery
  • Retry: retry behavior on delivery failure
  • Channels: per-channel streaming support