After a SIGUSR1 in-process restart following an npm upgrade from v2026.4.2
to v2026.4.5, the globalThis singleton created by the old code version
lacks the activeTaskWaiters field added in v2026.4.5. resolveGlobalSingleton
returns the stale object as-is, causing notifyActiveTaskWaiters() to call
Array.from(undefined) and crash the gateway in a loop.
Add a schema migration step in getQueueState() that patches the missing
field on legacy singleton objects. Add a regression test that plants a
v2026.4.2-shaped state object and verifies resetAllLanes() and
waitForActiveTasks() succeed without throwing.
Fixes#61905
- reject new lane enqueues once gateway drain begins
- always reset lane draining state and isolate onWait callback failures
- persist per-session abort cutoff and skip stale queued messages
- avoid false 600s agentTurn timeout in isolated cron jobs
Fixes#27407Fixes#27332Fixes#27427
Co-authored-by: Kevin Shenghui <shenghuikevin@github.com>
Co-authored-by: zjmy <zhangjunmengyang@gmail.com>
Co-authored-by: suko <miha.sukic@gmail.com>
Replace bare `new Error("Command lane cleared")` with a dedicated
`CommandLaneClearedError` class so callers that fire-and-forget
enqueued tasks can catch this specific type and avoid surfacing
unhandled rejection warnings.
clearCommandLane() was truncating the queue array without calling
resolve/reject on pending entries, causing never-settling promises
and memory leaks when upstream callers await enqueueCommandInLane().
Splice entries and reject each before clearing so callers can handle
the cancellation gracefully.
* fix(gateway): drain active turns before restart to prevent message loss
On SIGUSR1 restart, the gateway now waits up to 30s for in-flight agent
turns to complete before tearing down the server. This prevents buffered
messages from being dropped when config.patch or update triggers a restart
while agents are mid-turn.
Changes:
- command-queue.ts: add getActiveTaskCount() and waitForActiveTasks()
helpers to track and wait on active lane tasks
- run-loop.ts: on restart signal, drain active tasks before server.close()
with a 30s timeout; extend force-exit timer accordingly
- command-queue.test.ts: update imports for new exports
Fixes#13883
* fix(queue): snapshot active tasks for restart drain
---------
Co-authored-by: Elonito <0xRaini@users.noreply.github.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>