Files
openclaw/test/scripts
Masato Hoshino e845a26fd6 fix(agents): fail fast with attributable reason after MCP stdio session dies mid-run (#98738)
* fix(agents): fail fast with attributable reason after MCP stdio session dies mid-run

Wires MCP Client onclose/onerror during bundle-mcp session creation so a
crashed/exited server flips session.connected instead of staying stale.
Next tool/resource/prompt call throws a domain-specific 'is disconnected'
error immediately instead of surfacing the SDK's generic 'Not connected'.
A disconnected reused session is retired and rebuilt fresh on the next
catalog pass rather than reused, since the SDK chains onclose/onerror
cumulatively on repeat connect() and the stdio transport never clears its
read buffer on an unexpected exit.

* fix(agents): retire a reused MCP session that dies mid-refresh, not just pre-refresh

codex review found: the catch-path retirement in getCatalog()'s per-server
task only covered two cases (fresh session that never connected this pass,
and a non-reused session that failed for any reason) - a reused session
that was healthy when this pass started but disconnects mid-refresh (child
process dies between ensureSessionConnected() returning and
listAllToolsBestEffort() finishing) hit neither branch, so onclose flipped
connected=false but the dead session object stayed in the map until the
next catalog rebuild happened to notice it. Added the missing branch plus
a regression test that kills the child process mid tools/list on a reused
session and asserts the session is purged from the map within the same
refresh (not just fail-fast on tool calls, which already worked).

* fix(agents): retire a dead reused MCP session even across overlapping catalog generations

codex round-2 review found: the mid-refresh retirement branch still
skipped when sharedWithNewerGeneration was true, which protects a
still-alive session another generation is actively using - but once
onclose flips session.connected=false, the transport is dead for every
generation sharing that object, so that guard no longer applies.
Simplified to a single !session.connected branch that always retires
(retireSessionIfCurrent already no-ops safely if a newer generation
replaced the map entry), and dropped the now-write-only
connectedForCatalog local it replaced.

* test(scripts): allow MCP SDK callback suppressions

* fix(agents): retire closed MCP sessions

---------

Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
2026-07-01 22:55:15 -07:00
..
2026-06-04 20:49:50 -04:00
2026-06-04 20:49:50 -04:00
2026-06-04 20:49:50 -04:00
2026-06-04 20:49:50 -04:00