Add isTransientSqliteError() covering SQLITE_CANTOPEN, SQLITE_BUSY,
SQLITE_LOCKED, and SQLITE_IOERR via named codes, numeric errcodes
(node:sqlite), and message-string fallback. Combine with existing
network transient check so both families are treated as non-fatal
in the global unhandled rejection handler.
Prevents crash loop under launchd on macOS when SQLite files are
temporarily unavailable.
Fixes#34678
* fix: canonicalize session keys at write time to prevent orphaned sessions (#29683)
resolveSessionKey() uses hardcoded DEFAULT_AGENT_ID="main", but all read
paths canonicalize via cfg. When the configured default agent differs
(e.g. "ops" with mainKey "work"), writes produce "agent:main:main" while
reads look up "agent:ops:work", orphaning transcripts on every restart.
Fix all three write-path call sites by wrapping with
canonicalizeMainSessionAlias:
- initSessionState (auto-reply/reply/session.ts)
- runWebHeartbeatOnce (web/auto-reply/heartbeat-runner.ts)
- resolveCronAgentSessionKey (cron/isolated-agent/session-key.ts)
Add startup migration (migrateOrphanedSessionKeys) to rename existing
orphaned keys to canonical form, merging by most-recent updatedAt.
* fix: address review — track agent IDs in migration map, align snapshot key
P1: migrateOrphanedSessionKeys now tracks agentId alongside each store
path in a Map instead of inferring from the filesystem path. This
correctly handles custom session.store templates outside the default
agents/<id>/ layout.
P2: Pass the already-canonicalized sessionKey to getSessionSnapshot so
the heartbeat snapshot reads/restores use the same key as the write path.
* fix: log migration results at all early return points
migrateOrphanedSessionKeys runs before detectLegacyStateMigrations, so
it can canonicalize legacy keys (e.g. "main" → "agent:main:main") before
the legacy detector sees them. This caused the early return path to skip
logging, breaking doctor-state-migrations tests that assert log.info was
called.
Extract logMigrationResults helper and call it at every return point.
* fix: handle shared stores and ~ expansion in migration
P1: When session.store has no {agentId}, all agents resolve to the same
file. Track all agentIds per store path (Map<path, Set<id>>) and run
canonicalization once per agent. Skip cross-agent "agent:main:*"
remapping when "main" is a legitimate configured agent sharing the store,
to avoid merging its data into another agent's namespace.
P2: Use expandHomePrefix (environment-aware ~ resolution) instead of
os.homedir() in resolveStorePathFromTemplate, matching the runtime
resolveStorePath behavior for OPENCLAW_HOME/HOME overrides.
* fix: narrow cross-agent remap to provable orphan aliases only
Only remap agent:main:* keys where the suffix is a main session alias
("main" or the configured mainKey). Other agent:main:* keys — hooks,
subagents, cron sessions, per-sender keys — may be intentional
cross-agent references and must not be silently moved into another
agent's namespace.
* fix: run orphan-key session migration at gateway startup (#29683)
* fix: canonicalize cross-agent legacy main aliases in session keys (#29683)
* fix: guard shared-store migration against cross-agent legacy alias remap (#29683)
* refactor: split session-key migration out of pr 30654
---------
Co-authored-by: Your Name <your_email@example.com>
Co-authored-by: Ayaan Zaidi <hi@obviy.us>
Add shared isSilentReplyPayloadText() detector that catches both bare
NO_REPLY tokens and JSON {"action":"NO_REPLY"} envelopes. Apply at the
reply directive parser, reply normalizer, and embedded agent payload
builder so the control payload is stripped before any channel sees it.
Preserves media when text is only a silent control envelope.
Fixes#37727