Two distinct sends from the same user in a DM (a command followed by a
pasted URL that iMessage renders as a standalone URL-balloon message) are
delivered by Apple/BlueBubbles as two separate webhooks ~0.8-2.0 s apart.
Without coalescing, the agent replies to the command alone before the URL
arrives (the dump skill sees an empty payload), and the URL lands as a
second queued turn.
The feature is opt-in behind channels.bluebubbles.coalesceSameSenderDms
(default false). When set, DM messages with no associatedMessageGuid hash
to dm:<chat>:<sender>, so the debounce window merges the text + URL-balloon
pair into a single agent turn. Group chats and legacy text+balloon pairs
linked via associatedMessageGuid keep per-message keys. When the flag is on
and no explicit messages.inbound.byChannel.bluebubbles window is set, the
default inbound debounce window widens from 500 ms to 2500 ms, covering
Apple's observed split-send cadence.
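The key-selection rule above can be sketched as follows. This is a hedged illustration, not the actual implementation: the `InboundMessage` shape, field names, and `debounceKey` helper are assumptions made for the example; only the `dm:<chat>:<sender>` key format and the group/associatedMessageGuid carve-outs come from the description.

```typescript
// Hypothetical message shape for illustration only.
type InboundMessage = {
  chatGuid: string;
  senderId: string;
  messageGuid: string;
  isGroup: boolean;
  associatedMessageGuid?: string;
};

function debounceKey(msg: InboundMessage, coalesceSameSenderDms: boolean): string {
  // Group chats, balloon messages linked via associatedMessageGuid,
  // and the flag-off path all keep per-message keys so unrelated
  // messages are never merged.
  if (!coalesceSameSenderDms || msg.isGroup || msg.associatedMessageGuid) {
    return `msg:${msg.messageGuid}`;
  }
  // Standalone DM messages from one sender share a key, so the
  // debounce window can merge the split text + URL-balloon pair.
  return `dm:${msg.chatGuid}:${msg.senderId}`;
}
```

Sharing a key is what lets the existing debounce machinery do the merging: both webhooks land in the same bucket and flush as one turn.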
Merged output is bounded: <=4000 chars of text with an explicit truncation
marker, <=20 attachments, and first-plus-latest sampling beyond 10 source
entries. Every source messageId folded into the merged view is committed
to the inbound dedupe store after processing, so a later BlueBubbles
MessagePoller replay of any individual source event is recognized as a
duplicate.
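The bounding rules can be sketched as below. This is a minimal sketch under stated assumptions: the constants mirror the numbers above, but the helper names, the exact truncation marker, and the reading of "first-plus-latest" as "first entry plus the most recent nine" are illustrative guesses, not the real code.

```typescript
const MAX_TEXT_CHARS = 4000;
const MAX_SOURCE_ENTRIES = 10;

// Beyond 10 source entries, keep the first message plus the most
// recent nine, so the opening command and the latest context survive.
function sampleSources<T>(sources: T[]): T[] {
  if (sources.length <= MAX_SOURCE_ENTRIES) return sources;
  return [sources[0], ...sources.slice(sources.length - (MAX_SOURCE_ENTRIES - 1))];
}

// Cap merged text and make the cut visible with an explicit marker
// (marker text is a placeholder, not the real one).
function boundText(text: string): string {
  if (text.length <= MAX_TEXT_CHARS) return text;
  const marker = "\n[truncated]";
  return text.slice(0, MAX_TEXT_CHARS - marker.length) + marker;
}
```

The point of the explicit marker is that a bounded merge is distinguishable from a naturally short one, both to the agent and in logs.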
Tests: all 506 BlueBubbles tests pass, including multiple new cases
covering the on/off matrix, control-command override, orphan URL-balloon
handling, group-chat preservation, default-window widening,
replay-of-any-source-id dedupe, and the 25-message flood bound.
Smoke tested end-to-end against live BlueBubbles webhook traffic:
Dump+URL composed as one iMessage -> BB dispatches two webhooks ~1 s
apart -> debouncer coalesces both under dm:<chat>:<sender> -> dump skill
runs on merged payload in one turn.
Also updates docs/channels/bluebubbles.md with a scenarios table,
enablement guide, trade-offs, and three-layer troubleshooting checklist;
docs/concepts/messages.md cross-references the control-command debounce
exception.
Regression: `costUsageCache` in `src/gateway/server-methods/usage.ts` had no
delete/prune/evict path. The TTL check at L310 only gates stale reads — on a
miss after expiry, `set()` overwrites the same key but never removes stale
keys. `parseDateRange` derives cacheKey from `getTodayStartMs`, so cacheKey
rolls at every UTC 00:00, and additional axes (days / startDate / endDate /
utcOffset) multiply cardinality. The macOS menu polls `usage.cost` every ~45s
with no params, exercising `parseDateRange`'s default branch every day. Over
gateway uptime the map grows monotonically.
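The daily key roll can be made concrete with a small sketch. This is illustrative only: `getTodayStartMs` and `cacheKeyFor` here are stand-ins for the real `usage.ts` helpers, and the key format is invented for the example; what it demonstrates is just that a key derived from the UTC day start changes at every UTC midnight.

```typescript
// Stand-in for the real getTodayStartMs: UTC midnight of the given instant.
function getTodayStartMs(now: number): number {
  const d = new Date(now);
  return Date.UTC(d.getUTCFullYear(), d.getUTCMonth(), d.getUTCDate());
}

// Stand-in cache key: stable within a UTC day, new every midnight.
// Each extra axis (days / startDate / endDate / utcOffset) multiplies
// the number of distinct keys a boundless map can accumulate.
function cacheKeyFor(now: number, days = 1): string {
  return `${getTodayStartMs(now)}:${days}`;
}
```

A default-params poller like the macOS menu therefore mints at least one new key per day of uptime, and without eviction none of the old ones are ever reclaimed.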
Three sibling caches in the same subsystem already implement MAX + FIFO
eviction (resolvedSessionKeyByRunId, TRANSCRIPT_SESSION_KEY_CACHE,
sessionTitleFieldsCache). This change mirrors their pattern:
- `COST_USAGE_CACHE_MAX = 256` (matches RUN_LOOKUP_CACHE_LIMIT and
TRANSCRIPT_SESSION_KEY_CACHE_MAX).
- New `setCostUsageCache(cacheKey, entry)` helper checks the size and
  evicts the oldest key (`keys().next().value`) when adding a new key
  would exceed the cap.
- The three existing `costUsageCache.set(...)` call sites now route through
the helper. TTL-on-read, in-flight dedup, and overwrite-on-same-key
semantics are preserved.
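The helper's shape can be sketched as below. A hedged sketch, not the actual patch: the `CostUsageEntry` type is assumed, and only the cap, the FIFO eviction via `keys().next().value`, and overwrite-on-same-key semantics come from the description.

```typescript
// Assumed entry shape for illustration.
type CostUsageEntry = { fetchedAtMs: number; summary: unknown };

const COST_USAGE_CACHE_MAX = 256;
const costUsageCache = new Map<string, CostUsageEntry>();

function setCostUsageCache(cacheKey: string, entry: CostUsageEntry): void {
  // Overwriting an existing key must not evict: same-key refreshes
  // (the TTL-expiry re-fetch path) keep the map size unchanged.
  if (!costUsageCache.has(cacheKey) && costUsageCache.size >= COST_USAGE_CACHE_MAX) {
    // Map iterates in insertion order, so the first key is the oldest (FIFO).
    const oldest = costUsageCache.keys().next().value;
    if (oldest !== undefined) costUsageCache.delete(oldest);
  }
  costUsageCache.set(cacheKey, entry);
}
```

FIFO (rather than LRU) is the cheap option here and matches the sibling caches: no bookkeeping on reads, and with daily key rollover the oldest-inserted keys are also the ones least likely to be read again.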
Adds `src/gateway/server-methods/usage.cost-usage-cache.test.ts` which drives
growth through `__test.loadCostUsageSummaryCached` with 600 distinct
(startMs, endMs) pairs (mirroring day rollover and range switches). Pre-fix,
the Map grows to 600 entries; post-fix it plateaus at the cap, the newest
key is retained, and the oldest key is evicted (FIFO).
AI-assisted (fully tested). All 432 server-methods tests pass; pnpm check
and pnpm build are clean.