47013 Commits

Author SHA1 Message Date
Peter Steinberger
1117964ed6 fix: keep music reference fetch timeout scoped 2026-05-11 17:55:56 +01:00
Peter Steinberger
234ebf434f fix: honor configured media timeouts 2026-05-11 17:55:56 +01:00
Shakker
d9491a8a98 test: check auto-reply missing files 2026-05-11 17:54:19 +01:00
Peter Steinberger
177ff5baba test: wait for agent dispatch by assertion 2026-05-11 17:53:47 +01:00
Shakker
f2ab8ef2f2 test: verify infra outbound payloads 2026-05-11 17:52:15 +01:00
Peter Steinberger
d092bf0b17 test: remove discord model picker polling loop 2026-05-11 17:51:37 +01:00
Shakker
cb0c757c83 test: spell out gateway warning output 2026-05-11 17:50:28 +01:00
Peter Steinberger
7b8125989b test: remove qa lab lazy catalog sleep 2026-05-11 17:50:02 +01:00
Peter Steinberger
d65153988d test: remove tool search pending sleep 2026-05-11 17:48:49 +01:00
Peter Steinberger
098469a2a1 test: remove codex pending-state sleeps 2026-05-11 17:47:41 +01:00
Shakker
5b2abf4fba test: spell out doctor preview warnings 2026-05-11 17:46:38 +01:00
Shakker
8412d89544 test: check daemon env warnings 2026-05-11 17:45:31 +01:00
Shakker
a66531e1c2 test: require channel command messages 2026-05-11 17:44:37 +01:00
Shakker
0977067614 test: spell out doctor repair output 2026-05-11 17:43:06 +01:00
Peter Steinberger
cde42bb49c test: remove managed image cache sleep 2026-05-11 17:42:48 +01:00
Peter Steinberger
85255d9906 test: remove ui zero-delay timers 2026-05-11 17:40:33 +01:00
Shakker
bb8a379c55 test: name gateway lifecycle payloads 2026-05-11 17:40:16 +01:00
Shakker
53b1b145f7 test: spell out channel status payloads 2026-05-11 17:38:42 +01:00
Shakker
f95b38baec test: describe cron wake errors 2026-05-11 17:37:35 +01:00
Shakker
cfb516b666 test: verify cli and daemon call shapes 2026-05-11 17:36:20 +01:00
Peter Steinberger
c0b088c08a test: clean up context warmup timing 2026-05-11 17:35:11 +01:00
Shakker
47d02e9bfa test: verify sdk and tool call values 2026-05-11 17:31:59 +01:00
Peter Steinberger
236a36847e test: remove google meet idle pull sleeps 2026-05-11 17:29:59 +01:00
Shakker
cff8e32aa4 test: verify doctor and identity output 2026-05-11 17:29:32 +01:00
Frank Yang
678b2510b2 fix: abort generic no-progress tool loops
Abort generic repeated no-progress tool loops at the configured critical threshold when identical calls keep returning identical outcomes.

Prepared head SHA: 7fa287cd0f
2026-05-12 00:29:10 +08:00
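The no-progress abort described in 678b2510b2 can be pictured roughly as follows. This is a minimal sketch, not the project's code: `ToolCallRecord`, `trailingIdenticalCount`, and `shouldAbortNoProgressLoop` are hypothetical names, and comparing serialized arguments and outcomes is an assumption about how "identical" might be decided.

```typescript
// Hypothetical record shape; the real project may track different fields.
interface ToolCallRecord {
  toolName: string;
  argsJson: string; // serialized call arguments
  outcomeJson: string; // serialized result or error
}

// Count how many records at the end of the history repeat the latest call
// with the same arguments and the same outcome.
function trailingIdenticalCount(history: ToolCallRecord[]): number {
  const last = history[history.length - 1];
  if (!last) return 0;
  let count = 0;
  for (let i = history.length - 1; i >= 0; i--) {
    const record = history[i];
    if (
      !record ||
      record.toolName !== last.toolName ||
      record.argsJson !== last.argsJson ||
      record.outcomeJson !== last.outcomeJson
    ) {
      break;
    }
    count += 1;
  }
  return count;
}

// Abort once the identical streak reaches the configured critical threshold.
function shouldAbortNoProgressLoop(
  history: ToolCallRecord[],
  criticalThreshold: number,
): boolean {
  return trailingIdenticalCount(history) >= criticalThreshold;
}
```

A call that returns a different outcome resets the streak, so only genuinely stuck loops reach the threshold in this sketch.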
Peter Steinberger
af0b775274 test(gateway): canonicalize reset hook archive path 2026-05-11 17:28:23 +01:00
Peter Steinberger
980dfeaf02 fix(gateway): start shutdown session end emissions concurrently 2026-05-11 17:28:23 +01:00
Peter Steinberger
4f7606f2cc fix(gateway): narrow reply session end reason 2026-05-11 17:28:23 +01:00
pandadev66
376c7aea7f fix(gateway): await session_end during shutdown drain and track channel + compaction lifecycle paths (#57790)
Tighten the shutdown finalizer so it actually waits for plugin handlers
under its bounded budget and so it covers every session lifecycle path,
not just the centralized emitters in `session-reset-service.ts`.

- `drainActiveSessionsForShutdown` previously called
  `emitGatewaySessionEndPluginHook`, which fires `runSessionEnd` as
  fire-and-forget (`void hookRunner.runSessionEnd(...)`). The bounded
  2 s timeout then raced only the synchronous for-loop, so the close
  handler could proceed to subsystem teardown while a database-writing
  `session_end` plugin was still in flight -- the exact ghost-session
  failure this PR is supposed to fix. Inline the emit path: build the
  `buildSessionEndHookPayload` + `resolveStableSessionEndTranscript`
  payload directly in the drain and `await hookRunner.runSessionEnd(...)`
  under the bounded race. A never-resolving handler now surfaces as
  `timedOut=true`: the close handler records `session-end-drain` as a
  warning but is never blocked.
- The channel reply path in `src/auto-reply/reply/session.ts` and the
  compaction lifecycle helper in `src/auto-reply/reply/session-updates.ts`
  emit `session_start` / `session_end` directly through the global hook
  runner without going through `emitGatewaySessionStartPluginHook`, so
  the shutdown tracker never saw normal channel sessions or rolled-over
  compacted sessions. Wire the tracker `note` / `forget` calls into both
  paths so every public lifecycle emitter participates in the same
  tracker, and so a compacted session is both forgotten (previous id)
  and re-noted (new id) on rollover.

Tests:

- `src/gateway/drain-active-sessions-for-shutdown.test.ts` gains two
  cases: one proves the drain genuinely waits for an in-flight handler
  to settle before returning, the other proves a never-resolving handler
  is cut off at the configured budget with `timedOut=true`.

Refs #57790.
2026-05-11 17:28:23 +01:00
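The difference between racing a fire-and-forget loop and racing the awaited handlers (376c7aea7f) can be sketched as below. This is an illustration under assumptions: `drainWithBudget`, `SessionEndHandler`, and `DrainResult` are hypothetical names, the handler signature is simplified, and starting the handlers concurrently is a choice made for the sketch.

```typescript
type SessionEndReason = "shutdown" | "restart";

// Hypothetical, simplified stand-in for a plugin's session_end handler.
type SessionEndHandler = (
  sessionId: string,
  reason: SessionEndReason,
) => Promise<void>;

interface DrainResult {
  emitted: number; // handlers that settled successfully before the deadline
  timedOut: boolean; // true when the budget expired first
}

async function drainWithBudget(
  sessionIds: string[],
  reason: SessionEndReason,
  runSessionEnd: SessionEndHandler,
  budgetMs: number,
): Promise<DrainResult> {
  let emitted = 0;

  // Keep the handler promises (no `void`), so the race below covers the
  // handlers themselves rather than only the loop that starts them.
  const work = Promise.all(
    sessionIds.map((id) =>
      runSessionEnd(id, reason).then(
        () => {
          emitted += 1;
        },
        () => {
          // A failing handler must not fail the whole drain.
        },
      ),
    ),
  );

  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<"timeout">((resolve) => {
    timer = setTimeout(() => resolve("timeout"), budgetMs);
  });

  const winner = await Promise.race([work.then(() => "done" as const), deadline]);
  if (timer !== undefined) clearTimeout(timer);

  return { emitted, timedOut: winner === "timeout" };
}
```

With a handler that never resolves, the function still returns after roughly `budgetMs` and reports `timedOut: true`, which the caller could record as a warning instead of blocking teardown.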
pandadev66
dfa1a11676 fix(gateway): fire typed session_end on shutdown/restart for active sessions (#57790)
`session_end` was only fired when a session was replaced, reset, deleted, or
compacted -- the gateway shutdown/restart paths closed the process without
enumerating active sessions, so downstream `session_end` plugins
(e.g. claude-mem) accumulated ghost rows in `active` state across restarts.
The issue reporter saw 11 orphaned sessions cause 63 timeouts/day through
agent pool exhaustion.

Add an in-memory active-session tracker
(`src/gateway/active-sessions-shutdown-tracker.ts`) populated by
`emitGatewaySessionStartPluginHook` and forgotten unconditionally by
`emitGatewaySessionEndPluginHook` (even when no plugin listens), so any
session that has already been finalized through the normal lifecycle is
never re-fired by the shutdown drain. The close handler then calls a new
`drainActiveSessionsForShutdown({ reason })` in `session-reset-service.ts`
between the `gateway:shutdown`/`gateway:pre-restart` lifecycle hooks and
the subsystem teardown steps; the drain races a bounded 2 s total timeout
so a slow plugin cannot block SIGTERM/SIGINT, surfacing the timeout as a
`session-end-drain` warning on the shutdown result.

Extend `PluginHookSessionEndReason` with `"shutdown"` and `"restart"` so
plugins can distinguish a graceful close from a planned restart; the close
handler picks `restart` when `restartExpectedMs` is set and `shutdown`
otherwise. Update `emitGatewaySessionStartPluginHook` to also accept
`storePath`, `sessionFile`, and `agentId` so the shutdown drain can build
the same `session_end` payload shape the normal lifecycle path emits, and
update the existing call sites in `session-reset-service.ts` and
`server-methods/sessions.ts` to pass those fields through.

Tests:

- `src/gateway/active-sessions-shutdown-tracker.test.ts` (new) -- tracker
  insert/forget/clear semantics, idempotent re-noting, empty-id guard,
  snapshot isolation.
- `src/gateway/drain-active-sessions-for-shutdown.test.ts` (new) -- drain
  fires `session_end` with the right reason for every tracked session,
  skips sessions already finalized via reset/delete/compaction, and still
  forgets sessions even when no `session_end` plugin is registered.
- `src/gateway/server-close.test.ts` -- four new cases covering the
  shutdown/restart drain wiring, the bounded timeout warning, and the
  drain-skipped-when-no-helper case.

Docs:

- `docs/plugins/hooks.md` documents the new `shutdown`/`restart` values
  on `PluginHookSessionEndReason`.
- `docs/automation/hooks.md` documents the post-`gateway:shutdown`
  `session_end` drain step and its bounded execution guarantee.

Fixes #57790.
2026-05-11 17:28:23 +01:00
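The tracker semantics listed in dfa1a11676 (insert/forget/clear, idempotent re-noting, empty-id guard, snapshot isolation) and the reason selection can be approximated like this. It is a sketch under assumptions: the class and function names are hypothetical, the stored fields are inferred from the commit text, and treating "`restartExpectedMs` is set" as "is a number" is a guess.

```typescript
// Fields the commit says are passed through so the drain can rebuild the payload.
interface TrackedSession {
  sessionId: string;
  storePath?: string;
  sessionFile?: string;
  agentId?: string;
}

class ActiveSessionTracker {
  private readonly sessions = new Map<string, TrackedSession>();

  // Re-noting the same id overwrites the entry, so the call is idempotent.
  note(session: TrackedSession): void {
    if (session.sessionId === "") return; // empty-id guard
    this.sessions.set(session.sessionId, { ...session });
  }

  // Called on every normal session end, whether or not a plugin listens.
  forget(sessionId: string): boolean {
    return this.sessions.delete(sessionId);
  }

  clear(): void {
    this.sessions.clear();
  }

  // Return copies so callers cannot mutate tracker state.
  snapshot(): TrackedSession[] {
    return Array.from(this.sessions.values(), (s) => ({ ...s }));
  }
}

// Assumption: "set" means a numeric value was provided.
function pickSessionEndReason(
  restartExpectedMs: number | undefined,
): "shutdown" | "restart" {
  return typeof restartExpectedMs === "number" ? "restart" : "shutdown";
}
```

Because `forget` runs on every normal lifecycle end, a shutdown drain that iterates `snapshot()` would only see sessions that were never finalized.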
Shakker
8a8cb6fb30 test: verify command output text 2026-05-11 17:27:29 +01:00
Peter Steinberger
d0347f961c test: wait for qqbot queue drain 2026-05-11 17:26:41 +01:00
Peter Steinberger
c10bbddc2b test(tui): type session key assertion 2026-05-11 17:26:12 +01:00
leonaIee
503cb8d59f fix(auto-reply): keep silent turn errors visible 2026-05-11 17:26:12 +01:00
Peter Steinberger
2a6dfbd034 test: remove discord voice timer flushes 2026-05-11 17:24:33 +01:00
Peter Steinberger
15fa1e546f fix(cron): centralize wake session target resolution 2026-05-11 17:24:30 +01:00
Kaspre
f142bb0d6b test(extensions): type mocked calls explicitly 2026-05-11 17:24:30 +01:00
Kaspre
528ab7ed4d fix(wake): reject subagent session targets 2026-05-11 17:24:30 +01:00
Kaspre
5971f74bf1 fix(cron): strengthen targeted wake routing proof 2026-05-11 17:24:30 +01:00
Kaspre
db4f1fd778 fix(cron): preserve untargeted wake fanout 2026-05-11 17:24:30 +01:00
Kaspre
764bb7fbf7 test(gateway/cron): assert symmetric agentId derivation across enqueue and wake
When `cron.wake` is called with only an agent-prefixed `sessionKey` (no
explicit `agentId`), the gateway cron adapter must derive the same agentId
on both `enqueueSystemEvent` and `requestHeartbeat` so events land in (and
heartbeats fire on) the same agent target. Pre-PR, only `requestHeartbeat`
derived agentId from the key; `enqueueSystemEvent` ran through
`resolveCronSessionKey` with the configured-default agent and was rerouted
to that agent's main session under multi-agent deployments where `main`
exists but is not the default.

The new test exercises the cron-adapter directly via `state.cron.state.deps`
with a multi-agent config (`primary` default + `ops` non-default) and a
`agent:ops:cron:nightly:run:abc-123` foreign-agent session key, asserting
that both call sites resolve the agent target to "ops" rather than falling
back to "primary".

Refs #78687.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 17:24:30 +01:00
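The symmetry asserted in 764bb7fbf7 amounts to both call sites sharing one derivation of the agent target. A minimal sketch, assuming a hypothetical `resolveAgentTarget` helper and a simplified `agent:<agentId>:...` key layout inferred from the example key in the commit:

```typescript
interface AgentConfig {
  defaultAgentId: string; // e.g. "primary" in the test described above
}

// Derive the agent target from a session key. Agent-prefixed keys name their
// agent explicitly; relative keys fall back to the configured default.
function resolveAgentTarget(sessionKey: string, config: AgentConfig): string {
  const parts = sessionKey.split(":");
  const candidate = parts[1];
  if (parts[0] === "agent" && parts.length >= 3 && candidate) {
    return candidate;
  }
  return config.defaultAgentId;
}

// Both paths go through the same resolver, so they cannot disagree.
function enqueueTarget(sessionKey: string, config: AgentConfig): string {
  return resolveAgentTarget(sessionKey, config);
}

function heartbeatTarget(sessionKey: string, config: AgentConfig): string {
  return resolveAgentTarget(sessionKey, config);
}
```

For the key `agent:ops:cron:nightly:run:abc-123` with `primary` as the default, both functions return `"ops"` in this sketch; a relative key resolves to `"primary"`.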
Kaspre
8399ff888f docs(system event): document --session-key timing exception
Codex review on PR #78687 [P3] flagged that the docs say next-heartbeat
"waits for the next scheduled tick" while the patched timer collapses
next-heartbeat+sessionKey to an immediate targeted wake. Add a callout
describing the exception and pointing callers who want delayed delivery
back at the no-session-key path.

Refs #78687.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 17:24:30 +01:00
Kaspre
fd8d58af05 fix(test): drop unused non-null assertions on mock.calls[0]
Caught by oxlint typescript-eslint(no-unnecessary-type-assertion) in CI.
mock.calls is typed as any[][], so the trailing `!` adds nothing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 17:24:30 +01:00
Kaspre
072fa9b174 fix(wake): handle relative + agent-prefixed session keys consistently in cron adapter
Address review findings from successive codex rounds:

1. next-heartbeat + sessionKey now fires a targeted immediate wake.
   The regularly-scheduled heartbeat fires for the agent's main session,
   not the supplied sessionKey, so an event queued for a non-main session
   would sit stranded indefinitely; an "event"-intent wake is also
   deferred as not-due by the heartbeat runner and not retried, so
   neither path delivers without an explicit immediate wake.

2. resolveCronWakeTarget now always runs through resolveCronAgent, both
   for agent-prefixed session keys (so non-default agents are honored)
   and relative keys (so the configured default agent is used instead
   of the hardcoded "main" returned by resolveAgentIdFromSessionKey).
   Mirrors the matching fix in the enqueueSystemEvent adapter so wake
   and enqueue resolve to the same target.

3. Generated Swift `WakeParams` models now expose the new optional
   `sessionkey` field (codingKey "sessionKey") in both the macOS and
   shared OpenClawKit copies. Regenerating locally from agent.ts via
   protocol:gen + protocol:gen:swift would have produced this, but the
   environment couldn't run the generators (fs-safe transitive
   typecheck errors), so the diff was applied by hand to match what
   pnpm protocol:check would output.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 17:24:30 +01:00
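Point 1 of 072fa9b174 (next-heartbeat plus a session key collapses to an immediate targeted wake) can be read as a small decision function. The sketch below is illustrative only: `planWake`, `WakeRequest`, and `WakePlan` are hypothetical, and the `"now"` mode name is an assumption, since only `next-heartbeat` appears in the commit text.

```typescript
type WakeMode = "now" | "next-heartbeat"; // "now" is an assumed name

interface WakeRequest {
  mode: WakeMode;
  sessionKey?: string;
}

type WakePlan =
  | { kind: "immediate"; sessionKey: string | undefined }
  | { kind: "deferred" };

function planWake(request: WakeRequest): WakePlan {
  if (request.mode === "now") {
    return { kind: "immediate", sessionKey: request.sessionKey };
  }
  // The scheduled heartbeat fires for the agent's main session, so an event
  // queued for another session would never be picked up by waiting.
  if (request.sessionKey !== undefined) {
    return { kind: "immediate", sessionKey: request.sessionKey };
  }
  // No session key: keep the delayed "wait for the next tick" behavior.
  return { kind: "deferred" };
}
```

Callers who want delayed delivery would omit the session key, which matches the docs callout described in 8399ff888f.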
Kaspre
4ddd942f5f feat(wake): expose typed sessionKey on wake protocol + system event CLI
Adds an optional sessionKey to the WakeParamsSchema and threads it through
the gateway wake handler, CronService.wake(), and the underlying timer.wake()
ops so callers can target a specific session for async-task completion
relays instead of always hitting the agent's main session.

Also adds --session-key to `openclaw system event`.

The schema rejects empty/non-string sessionKey at the gateway boundary;
mismatched session keys (a key that does not belong to the resolving agent)
fall back to the agent's main session inside resolveCronSessionKey, which
is the existing safety path.

Refs #52305 (companion to PR #50818, which closes the related cron-run
remap slice at internal enqueue sites). Doesn't depend on #50818.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 17:24:30 +01:00
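The boundary rule in 4ddd942f5f (an optional `sessionKey` that is rejected when empty or not a string) could be expressed as a plain validator. This is a sketch, not the project's `WakeParamsSchema`: `parseOptionalSessionKey` is a hypothetical helper and the error messages are invented for illustration.

```typescript
// Returns the key when present and valid, undefined when omitted,
// and throws for anything the boundary should reject.
function parseOptionalSessionKey(raw: unknown): string | undefined {
  if (raw === undefined) return undefined; // the field is optional
  if (typeof raw !== "string") {
    throw new TypeError("sessionKey must be a string");
  }
  if (raw.length === 0) {
    throw new TypeError("sessionKey must not be empty");
  }
  return raw;
}
```

Validation here only covers shape. Per the commit text, a well-formed key that does not belong to the resolving agent is handled later by falling back to the agent's main session.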
Shakker
13bc7037b1 test: verify infra generated values 2026-05-11 17:23:49 +01:00
Peter Steinberger
f57afbbd16 test: remove agent async timer flushes 2026-05-11 17:21:53 +01:00
Shakker
ac478b2c6a test: verify diagnostics and session callbacks 2026-05-11 17:21:03 +01:00
Peter Steinberger
1f43e79a58 test: remove plugin contract timer flushes 2026-05-11 17:19:44 +01:00
Shakker
d4f3d4edad test: verify schema and timing messages 2026-05-11 17:18:19 +01:00