* fix(gateway): preserve runtime-backed health state
* fix(clownfish): address review for ghcrawl-207035-agentic-merge (1)
* fix(gateway): harden health snapshot exposure
Prune stale gateway control-plane rate-limit buckets, bound transcript-session lookup caching, clear agent event sequence state with run contexts, and clear node wake/nudge state on disconnect.\n\nVerified locally after rebasing onto main:\n\n- pnpm test src/gateway/control-plane-rate-limit.test.ts src/gateway/session-transcript-key.test.ts src/infra/agent-events.test.ts src/gateway/server-methods/nodes.invoke-wake.test.ts\n- pnpm check\n\nCo-authored-by: lml2468 <39320777+lml2468@users.noreply.github.com>
emitChatFinal frees buffers on clean run completion, and the
maintenance timer sweeps abortedRuns after ABORTED_RUN_TTL_MS. But
runs that get stuck (e.g. LLM timeout without triggering clean
lifecycle end) are never aborted and their string buffers persist
indefinitely. This is the direct trigger for the StringAdd_CheckNone
OOM crash reported in the issue.
Add a stale buffer sweep in the maintenance timer that cleans up
buffers, deltaSentAt, and deltaLastBroadcastLen for any run not
updated within ABORTED_RUN_TTL_MS, regardless of abort status.
Closes#51821