Commit Graph

25 Commits

Author SHA1 Message Date
openperf
e777a2b230 fix(process ): migrate legacy command-queue singleton missing activeTaskWaiters
After a SIGUSR1 in-process restart following an npm upgrade from v2026.4.2
to v2026.4.5, the globalThis singleton created by the old code version
lacks the activeTaskWaiters field added in v2026.4.5.  resolveGlobalSingleton
returns the stale object as-is, causing notifyActiveTaskWaiters() to call
Array.from(undefined) and crash the gateway in a loop.

Add a schema migration step in getQueueState() that patches the missing
field on legacy singleton objects.  Add a regression test that plants a
v2026.4.2-shaped state object and verifies resetAllLanes() and
waitForActiveTasks() succeed without throwing.

Fixes #61905
2026-04-06 15:41:14 +01:00
Peter Steinberger
e0580e6863 test: harden shared-worker runtime setup 2026-04-03 18:18:56 +01:00
Shakker
2fa3a09137 test: harden command queue timer cleanup 2026-04-04 01:07:28 +09:00
Peter Steinberger
6b6ddcd2a6 test: speed up core runtime suites 2026-03-31 02:25:02 +01:00
Vignesh Natarajan
4d54376483 Tests: stabilize shard-2 queue and channel state 2026-03-29 01:12:58 -07:00
Vignesh Natarajan
4b137da582 Test: harden command queue test isolation (CI chore) 2026-03-28 23:43:03 -07:00
Peter Steinberger
23f0486810 fix: stabilize plugin startup boundaries 2026-03-28 05:22:26 +00:00
Peter Steinberger
5fb7a1363f fix: stabilize full gate 2026-03-17 07:06:25 +00:00
Vincent Koc
4ca84acf24 fix(runtime): duplicate messages, share singleton state across bundled chunks (#43683)
* Tests: add fresh module import helper

* Process: share command queue runtime state

* Agents: share embedded run runtime state

* Reply: share followup queue runtime state

* Reply: share followup drain callback state

* Reply: share queued message dedupe state

* Reply: share inbound dedupe state

* Tests: cover shared command queue runtime state

* Tests: cover shared embedded run runtime state

* Tests: cover shared followup queue runtime state

* Tests: cover shared inbound dedupe state

* Tests: cover shared Slack thread participation state

* Slack: share sent thread participation state

* Tests: document fresh import helper

* Telegram: share draft stream runtime state

* Tests: cover shared Telegram draft stream state

* Telegram: share sent message cache state

* Tests: cover shared Telegram sent message cache

* Telegram: share thread binding runtime state

* Tests: cover shared Telegram thread binding state

* Tests: avoid duplicate shared queue reset

* refactor(runtime): centralize global singleton access

* refactor(runtime): preserve undefined global singleton values

* test(runtime): cover undefined global singleton values

---------

Co-authored-by: Nimrod Gutman <nimrod.gutman@gmail.com>
2026-03-12 14:59:27 -04:00
Peter Steinberger
c397a02c9a fix(queue): harden drain/abort/timeout race handling
- reject new lane enqueues once gateway drain begins
- always reset lane draining state and isolate onWait callback failures
- persist per-session abort cutoff and skip stale queued messages
- avoid false 600s agentTurn timeout in isolated cron jobs

Fixes #27407
Fixes #27332
Fixes #27427

Co-authored-by: Kevin Shenghui <shenghuikevin@github.com>
Co-authored-by: zjmy <zhangjunmengyang@gmail.com>
Co-authored-by: suko <miha.sukic@gmail.com>
2026-02-26 13:43:39 +01:00
Peter Steinberger
185fba1d22 refactor(agents): dedupe plugin hooks and test helpers 2026-02-22 07:44:57 +00:00
Peter Steinberger
a011361784 perf(test): remove timer callbacks in command queue tests 2026-02-18 21:53:57 +00:00
Peter Steinberger
8f079afb38 perf(test): remove timer usage in command queue ordering test 2026-02-18 17:46:39 +00:00
Peter Steinberger
c7bc94436b perf(test): fake queue timers and merge telegram reply-mode checks 2026-02-18 16:01:20 +00:00
Peter Steinberger
46278e22cf perf(test): trim telegram duplicates and queue wait delays 2026-02-18 04:22:59 +00:00
cpojer
03e6acd051 chore: Fix types in tests 28/N. 2026-02-17 14:32:18 +09:00
Peter Steinberger
5caf829d28 perf(test): trim duplicate gateway and auto-reply test overhead 2026-02-13 23:40:38 +00:00
Joseph Krug
4e9f933e88 fix: reset stale execution state after SIGUSR1 in-process restart (#15195)
Merged via /review-pr -> /prepare-pr -> /merge-pr.

Prepared head SHA: 676f9ec451
Co-authored-by: joeykrug <5925937+joeykrug@users.noreply.github.com>
Co-authored-by: gumadeiras <5599352+gumadeiras@users.noreply.github.com>
Reviewed-by: @gumadeiras
2026-02-13 15:30:09 -05:00
Yi LIU
a5ccfa57a8 refactor(process): use dedicated CommandLaneClearedError in clearCommandLane
Replace bare `new Error("Command lane cleared")` with a dedicated
`CommandLaneClearedError` class so callers that fire-and-forget
enqueued tasks can catch this specific type and avoid surfacing
unhandled rejection warnings.
2026-02-13 19:43:20 +01:00
Yi LIU
a49dd83b14 fix(process): reject pending promises when clearing command lane
clearCommandLane() was truncating the queue array without calling
resolve/reject on pending entries, causing never-settling promises
and memory leaks when upstream callers await enqueueCommandInLane().

Splice entries and reject each before clearing so callers can handle
the cancellation gracefully.
2026-02-13 19:43:20 +01:00
0xRain
acb9cbb898 fix(gateway): drain active turns before restart to prevent message loss (#13931)
* fix(gateway): drain active turns before restart to prevent message loss

On SIGUSR1 restart, the gateway now waits up to 30s for in-flight agent
turns to complete before tearing down the server. This prevents buffered
messages from being dropped when config.patch or update triggers a restart
while agents are mid-turn.

Changes:
- command-queue.ts: add getActiveTaskCount() and waitForActiveTasks()
  helpers to track and wait on active lane tasks
- run-loop.ts: on restart signal, drain active tasks before server.close()
  with a 30s timeout; extend force-exit timer accordingly
- command-queue.test.ts: update imports for new exports

Fixes #13883

* fix(queue): snapshot active tasks for restart drain

---------

Co-authored-by: Elonito <0xRaini@users.noreply.github.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-02-12 07:55:19 -06:00
Peter Steinberger
6734f2d71c fix: wire OTLP logs for diagnostics 2026-01-20 22:51:47 +00:00
Peter Steinberger
e5f677803f chore: format to 2-space and bump changelog 2025-11-26 00:53:53 +01:00
Peter Steinberger
38659f5d3e test: sync updated specs 2025-11-25 12:12:29 +01:00
Peter Steinberger
13be898c07 feat: serialize command auto-replies with queue 2025-11-25 04:40:49 +01:00