* fix(reply): deliver final reply when queued follow-up claims session; scope dedupe to routed thread
Two core bugs caused composed replies to be silently dropped (no delivery,
no error) when a second message arrived in the same thread mid-run:
1. dispatch-from-config: ensureDispatchReplyOperation only kept the
dispatch-owned operation authoritative while it had no result. Once
runReplyAgent completed the operation to drain queued follow-ups, a
second same-thread inbound could claim the session and the first final
reply would try to re-acquire the lane instead of finishing delivery,
deadlocking behind the queued work. Keep the dispatch-owned operation
authoritative through final delivery.
2. reply-payloads-dedupe: messaging-tool reply dedupe compared only the
channel target, not the routed thread, so a send in one thread could
suppress a later reply in a different thread. Thread the routed thread
id through buildReplyPayloads + follow-up delivery and only fall back to
channel-only matching for providers without a thread-aware suppression
matcher when neither side carries thread evidence.
Adds regression tests; existing Telegram topic-suppression behavior is
preserved by gating the thread guard to providers lacking a plugin matcher.
* fix(reply): preserve threaded message delivery evidence
* fix(reply): dedupe final payloads by delivery route
* fix(slack): preserve native send thread evidence
* fix(reply): preserve explicit reply thread evidence
* fix(reply): align explicit reply route dedupe
* fix(reply): preserve delivery lane through final dispatch
* fix(mattermost): preserve threaded tool send routes
* chore(plugin-sdk): refresh API baseline
* fix(reply): align final delivery route dedupe
* fix(reply): gate followups on final delivery
* fix(reply): keep send receipts private
* fix(reply): infer implicit message provider
* fix(reply): align routed threading policy
* fix(reply): preserve queued delivery context
* fix(reply): hydrate queued system event routes
* fix(reply): hydrate queued execution routes
* fix(reply): scope final delivery barriers
* fix(slack): preserve DM target aliases
* fix(reply): mirror resolved source thread routes
* fix(mattermost): retain delayed delivery barrier
* fix(codex): separate message routing from tool policy
* fix(reply): consume normalized Slack DM targets once
* fix(slack): remove stale target alias
* style(reply): satisfy changed lint gates
* fix(mattermost): preserve explicit reply targets
* test: align Slack reply branch checks
* fix(reply): persist overflow summaries to admitted session
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
diagnostics.otel.captureContent.{toolInputs,toolOutputs} were documented
and config-wired but never produced any span content. Emit tool args and
results over the trusted private-data diagnostic channel (mirroring the
model-content path), and have the OTel exporter bound/redact/truncate them
before span export. Raw tool content never rides the public event bus.
Scope: core embedded-runner tool path (canonical producer). Codex
(async-batched) and Claude CLI remain follow-ups tracked by the issue.
Refs #77391
* feat(imessage): always-on inbound recovery, deprecate catchup
Replaces the opt-in catchup subsystem with always-on inbound replay
protection that brings iMessage in line with the other channels, and
fixes#89237 (stale backlog dispatched as fresh after bridge recovery).
- New inbound-dedupe.ts: persistent claimable GUID dedupe (claim/commit/
release) plus a stale-backlog age fence that suppresses live rows whose
send date is materially older than arrival (logged, never silent).
- monitor-provider: claim at ingestion, carry the exact claimed key on the
debouncer entry, commit on successful flush / release on dispatch failure
(per-unit so a coalesced bucket cannot strand a sibling claim). Keeps the
local startup since_rowid watermark so startup-window rows are not skipped.
- Deprecate catchup: delete catchup.ts + catchup-bridge.ts, remove the
channels.imessage.catchup schema, cursor migration, and config-guard nag.
Back-compat: strip the retired key before validation; new imessage doctor
contract reports + removes it on doctor --fix.
- Docs updated for the new recovery model.
Net -947 prod LOC.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(imessage): recover downtime messages via since_rowid replay
Builds downtime recovery on the new inbound dedupe instead of restoring the
old catchup subsystem. On startup the monitor passes the last dispatched rowid
(a persisted per-account cursor) to imsg watch.subscribe as since_rowid, so imsg
replays the messages that landed while the gateway was down, then tails live.
The GUID dedupe drops anything already handled, so no cursor/retry bookkeeping
is needed.
- recovery-cursor.ts: minimal persisted per-account lastDispatchedRowid.
- monitor-provider: since_rowid = cursor (capped to the most recent
IMESSAGE_RECOVERY_MAX_ROWS); split the age fence on the startup rowid boundary
so replayed rows (<= boundary) use the wider recovery window and live rows
(> boundary) keep the tight #89237 fence; advance the cursor on commit.
- Local only: remote SSH cliPath cannot read chat.db, so it tails from the
current rowid (suppress-and-move-on) as before.
Restores missed-message recovery that the catchup removal dropped, with no
config and a fraction of the old LOC.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(imessage): make recovery cursor advance failure- and suppression-safe
Addresses two cursor-state regressions in the downtime-recovery path:
- Failed replay rows could be skipped forever: a released (failed) row keeps
its dedupe claim for retry, but a later successful row in the same flush
advanced the cursor past it, so the next startup's since_rowid skipped it.
Hold a per-session floor at the lowest released rowid and never advance the
cursor past it.
- Suppressed live backlog could be re-delivered after a restart: a live row
suppressed under the tight live fence was not recorded, so after a restart it
fell under the wider recovery window (its rowid now below the new boundary)
and was delivered. Commit its dedupe key on suppression so the recovery
replay treats it as already handled.
Both caught by Codex autoreview. Adds regression tests for the floor and the
suppression record.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(imessage): bound the GUID-less replay key length
Hash the composite fallback key's variable parts (conversation, sender,
created_at, text) so the key is length-bounded regardless of message text.
The persistent dedupe store already hashes keys internally, so this was not a
live overflow, but the bounded key removes the dependency on that and keeps the
fallback fail-open. Flagged by autoreview.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(imessage): recover downtime messages on remote cliPath setups too
The since_rowid replay runs over the imsg RPC client, so driving it from the
persisted recovery cursor (not the local chat.db boundary) makes downtime
recovery work for remote SSH cliPath gateways — the topology the old RPC-based
catchup served and that the rowid-boundary-only version regressed. Local setups
keep the wider, capped recovery window via the chat.db boundary; remote uses the
live age-fence window. Flagged by autoreview.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(imessage): seed recovery cursor from retired catchup cursor on upgrade
A one-time, self-cleaning migration: when the recovery cursor is empty on the
first startup after upgrade, seed it from the retired imessage.catchup-cursors
lastSeenRowid and consume the legacy entry. Without this a user who had catchup
enabled would not replay messages missed across the upgrade restart. Flagged by
autoreview.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(imessage): preserve catchup recovery on upgrade
---------
Co-authored-by: Omar Shahine <10343873+omarshahine@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
Move Zalo hosted outbound media metadata and expiry into plugin state, add SDK chunked hosted media storage, and keep CI/type/lint gates green after rebase.
Allow persisted provider model entries to carry strict thinkingLevelMap values so Microsoft Foundry Entra onboarding can save generated reasoning model config. Closes#91011.
Add memory.qmd.rerank as an opt-out for QMD query reranking when searchMode is query.
When set to false, direct QMD query calls pass --no-rerank and the mcporter unified query tool receives rerank:false. Search and vsearch modes keep their existing behavior.
Refs #61834.