mirror of
https://github.com/openclaw/openclaw.git
synced 2026-06-28 01:53:33 +00:00
* fix(llm): collapse cumulative openai-responses message snapshots instead of concatenating

Some openai-responses providers (observed: Bedrock Mantle with GPT-5.x reasoning enabled, confirmed server-side via raw curl) re-emit the assistant message as many cumulative snapshot items, each a prefix-superset of the previous one, instead of a single final message item. Both stream consumers appended one text block per item, so the final visible reply, transcript, and replay context repeated the answer once per snapshot (observed 49-80x).

Treat a same-phase message item whose text extends the immediately preceding text block as a replacement: the prior block takes the longer text, the duplicate block is dropped, and the first item's signature is kept so replay and stream-item identity stay stable. Shrinking or identical adjacent snapshots are dropped. Any non-message output item (reasoning, tool call) is a real boundary that resets the collapse, so distinct post-tool messages and reasoning replay pairing are untouched, as are different-phase (commentary/final_answer) items.

Applies to the agent transport stream, the shared LLM consumer, and completed-response backfill.

Fixes #91959. Reported by @phoenixyy with server-side evidence from @DaiMingNJ.

* test(llm): drop redundant stream drains from responses snapshot tests

* fix(llm): collapse only strict snapshot extensions and keep newest item signature

Address ClawSweeper P1 review findings on #92399: the text-prefix relation alone was broader than the observed corruption. Equal or shrinking adjacent same-phase message items are now always kept as distinct blocks. The Responses protocol allows multiple message items per response; this was verified against the sibling Codex parser, codex-rs/codex-api/src/sse/responses.rs, which emits every output_item.done message as an independent item. With extension-only collapse, a false positive can only merge the rendering of two messages; it can never remove text.

The merged block now carries the newest item's signature instead of the first one's, so replay associates the final content with the item that actually produced it.

* fix(llm): defer snapshot-candidate message blocks to keep the event lifecycle balanced

Address the remaining ClawSweeper P1 on #92399: collapsing a snapshot used to pop a block whose text_start had already been emitted, leaving per-index stream subscribers tracking a phantom block.

A message item that follows a finalized text block now defers its public block: no text_start is emitted and deltas are withheld until the item either diverges from the prior text (then the block opens and the withheld prefix replays as one delta) or completes. A collapsed snapshot therefore never starts a block. It only re-ends the prior index with grown content, which is the documented resend shape, and a distinct deferred item opens and closes its own block normally. No block is ever removed, so every text_start has exactly one matching text_end at a live index.

Tests now assert the complete ordered event sequence for the collapse, distinct-item, and divergence cases in both consumers.

* fix(llm): treat any non-message item as a collapse boundary in completed-response backfill

The streaming consumer resets the snapshot-collapse anchor on every non-message output item ("any other item is a real boundary"), but the transport's completed-response backfill only dispatched message and function_call items. A reasoning item between two strict-prefix message items therefore did not reset the anchor, and the later message could collapse across it, an asymmetry with the streaming path's documented invariant.

Reset lastTextBlock for every non-message item in the backfill loop (one canonical place; the per-tool-call reset is now redundant and removed). Covered by a backfill reasoning-boundary regression test.
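
For readers skimming the series, the end-state collapse rule for the completed-response backfill path can be sketched roughly as below. This is an illustrative sketch only, not the code in this change: the `OutputItem` and `TextBlock` shapes, the `collapseSnapshots` function name, and the use of the item id as the "signature" are all assumptions made for the example. Only the `lastTextBlock` anchor name and the rule itself (strict extension in the same phase collapses, anything else stays distinct, any non-message item resets the anchor) come from the commit messages above.

```typescript
// Hypothetical shapes for illustration; the real types in the repository may differ.
type OutputItem =
  | { type: "message"; id: string; phase?: string; text: string }
  | { type: "reasoning"; id: string }
  | { type: "function_call"; id: string; name: string };

interface TextBlock {
  text: string;
  phase: string | undefined;
  signature: string; // assumption: modelled here as the id of the producing item
}

// Walk the output items of a completed response and build text blocks,
// collapsing cumulative snapshots instead of appending one block per item.
function collapseSnapshots(items: OutputItem[]): TextBlock[] {
  const blocks: TextBlock[] = [];
  let lastTextBlock: TextBlock | undefined;

  for (const item of items) {
    if (item.type !== "message") {
      // Any non-message item (reasoning, tool call) is a real boundary.
      lastTextBlock = undefined;
      continue;
    }

    // Collapse only when the new text strictly extends the previous block
    // in the same phase. Equal or shorter text is never a collapse.
    const isStrictExtension =
      lastTextBlock !== undefined &&
      lastTextBlock.phase === item.phase &&
      item.text.length > lastTextBlock.text.length &&
      item.text.startsWith(lastTextBlock.text);

    if (isStrictExtension && lastTextBlock !== undefined) {
      // The prior block takes the longer text and the newest item's signature.
      lastTextBlock.text = item.text;
      lastTextBlock.signature = item.id;
      continue;
    }

    // Otherwise the item is a distinct message and gets its own block.
    const block: TextBlock = { text: item.text, phase: item.phase, signature: item.id };
    blocks.push(block);
    lastTextBlock = block;
  }

  return blocks;
}
```

Under these assumptions, the items `"Hel"`, `"Hello"`, a reasoning item, `"Hello world"`, `"Hello world"` (all in the same phase) yield three blocks: the first two messages merge into one block carrying the second item's signature, the reasoning item resets the anchor so the third message is not merged across it, and the identical fourth message is kept as its own block. The sketch does not model the streaming path's deferred text_start handling.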