fix(delivery-queue): increment retryCount on deferred entries when time budget exceeded

When delivery recovery exhausted its 60s time budget, the remaining pending
entries were deferred to the next restart without a retryCount increment.
If the budget ran out on every restart, those entries were deferred
indefinitely, never reaching MAX_RETRIES and never moving to failed/.

Fix: call failDelivery() on each remaining entry before breaking out of
the recovery loop (both the deadline check and the backoff-exceeds-deadline
check). This increments retryCount so that entries eventually exhaust
MAX_RETRIES and are permanently skipped.
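
The effect on the counter can be sketched in isolation. This is a minimal
in-memory model, not the code from this commit: PendingEntry, deferRemaining,
simulateRestarts and the MAX_RETRIES value of 5 are assumptions made for
illustration, and the real failDelivery() persists state under opts.stateDir
rather than mutating an object.

```typescript
// Hypothetical in-memory stand-in for a persisted queue entry.
interface PendingEntry {
  id: string;
  retryCount: number;
}

// Assumed value for illustration; the real constant lives in the delivery-queue module.
const MAX_RETRIES = 5;

// Models the deadline branch: everything from `startIndex` onward is deferred.
// Mirrors `pending.slice(pending.indexOf(entry))` from the diff, with the index passed in.
// With `incrementOnDefer` set, each deferred entry's retryCount is bumped,
// which is the only part of failDelivery() this sketch reproduces.
function deferRemaining(
  pending: PendingEntry[],
  startIndex: number,
  incrementOnDefer: boolean,
): number {
  const remaining = pending.slice(startIndex);
  if (incrementOnDefer) {
    for (const r of remaining) {
      r.retryCount += 1;
    }
  }
  return remaining.length;
}

// Worst case: the time budget is already spent when the loop reaches the
// first entry, on every restart.
function simulateRestarts(restarts: number, incrementOnDefer: boolean): PendingEntry[] {
  const pending: PendingEntry[] = [
    { id: "a", retryCount: 0 },
    { id: "b", retryCount: 0 },
  ];
  for (let i = 0; i < restarts; i++) {
    deferRemaining(pending, 0, incrementOnDefer);
  }
  return pending;
}

// Old behaviour: counters never move, so MAX_RETRIES is never reached.
const withoutIncrement = simulateRestarts(MAX_RETRIES, false);
// New behaviour: counters reach MAX_RETRIES after MAX_RETRIES restarts.
const withIncrement = simulateRestarts(MAX_RETRIES, true);
```

In this model the old behaviour leaves every retryCount at 0 no matter how
many restarts occur, while the new behaviour brings each entry to MAX_RETRIES
after MAX_RETRIES deferrals, at which point the existing
entry.retryCount >= MAX_RETRIES check can take over.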

Fixes #24353
Stephen Schoettler
2026-03-01 18:42:02 -08:00
committed by Peter Steinberger
parent 5e64265537
commit 4e92807f10

@@ -344,8 +344,19 @@ export async function recoverPendingDeliveries(opts: {
   for (const entry of pending) {
     const now = Date.now();
     if (now >= deadline) {
-      const deferred = pending.length - recovered - failed - skippedMaxRetries - deferredBackoff;
-      opts.log.warn(`Recovery time budget exceeded — ${deferred} entries deferred to next restart`);
+      // Increment retryCount on remaining entries so they eventually hit MAX_RETRIES
+      const remaining = pending.slice(pending.indexOf(entry));
+      for (const r of remaining) {
+        try {
+          await failDelivery(r.id, "Recovery time budget exceeded — deferred", opts.stateDir);
+        } catch {
+          /* best-effort */
+        }
+      }
+      const deferred = remaining.length;
+      opts.log.warn(
+        `Recovery time budget exceeded — ${deferred} entries deferred (retryCount incremented)`,
+      );
       break;
     }
     if (entry.retryCount >= MAX_RETRIES) {