Commit Graph

17835 Commits

Author SHA1 Message Date
Peter Steinberger
2b8828ea46 test: fix windows runtime and restart loop harnesses 2026-03-09 10:58:04 -07:00
Peter Steinberger
dbabaa0fb2 chore: update appcast for 2026.3.8-beta.1 2026-03-09 10:58:04 -07:00
Peter Steinberger
daf0ade96b chore: prepare 2026.3.8-beta.1 release 2026-03-09 10:58:04 -07:00
Peter Steinberger
6c11b4378a fix: normalize windows runtime shim executables 2026-03-09 10:58:04 -07:00
Peter Steinberger
ddc3a3fc71 test: fix Windows fake runtime bin fixtures 2026-03-09 10:58:04 -07:00
Peter Steinberger
56218dcc21 test: fix Node 24+ test runner and subagent registry mocks 2026-03-09 10:58:04 -07:00
Peter Steinberger
38068de8e9 docs: move 2026.3.8 entries back to unreleased 2026-03-09 10:58:04 -07:00
Peter Steinberger
83b453f48a chore: refresh secrets baseline 2026-03-09 10:58:04 -07:00
Peter Steinberger
98d52062e7 build: sync pnpm lockfile 2026-03-09 10:58:04 -07:00
Peter Steinberger
64910240b9 docs: reorder 2026.3.8 changelog by impact 2026-03-09 10:58:04 -07:00
Peter Steinberger
bd4d7f6137 refactor: flatten supervisor marker hints 2026-03-09 10:58:04 -07:00
Peter Steinberger
1b8f800487 refactor: split cron startup catch-up flow 2026-03-09 10:58:04 -07:00
Peter Steinberger
8cb688c44d refactor: extract telegram polling session 2026-03-09 10:58:04 -07:00
Peter Steinberger
9bde7ef39f build: update app deps except carbon 2026-03-09 10:58:04 -07:00
Peter Steinberger
3c4377651e fix: stagger missed cron jobs on restart (#18925) (thanks @rexlunae) 2026-03-09 10:58:04 -07:00
rexlunae
41a39085d3 fix(cron): stagger missed jobs on restart to prevent gateway overload
When the gateway restarts with many overdue cron jobs, they are now
executed with staggered delays to prevent overwhelming the gateway.

- Add missedJobStaggerMs config (default 5s between jobs)
- Add maxMissedJobsPerRestart limit (default 5 jobs immediately)
- Prioritize most overdue jobs by sorting by nextRunAtMs
- Reschedule deferred jobs to fire gradually via normal timer

Fixes #18892
2026-03-09 10:58:04 -07:00
Peter Steinberger
2220a58ff7 fix: abort telegram getupdates on shutdown (#23950) (thanks @Gkinthecodeland) 2026-03-09 10:58:04 -07:00
George Kalogirou
e383257552 fix(telegram): use manual signal forwarding to avoid cross-realm AbortSignal
AbortSignal.any() fails in Node.js when signals come from different module
contexts (grammY's internal signal vs local AbortController), producing:
"The signals[0] argument must be an instance of AbortSignal. Received an
instance of AbortSignal".

Replace with manual event forwarding that works across all realms.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 10:58:04 -07:00
George Kalogirou
fa6436eaf3 fix(telegram): abort in-flight getUpdates fetch on shutdown
When the gateway receives SIGTERM, runner.stop() stops the grammY polling
loop but does not abort the in-flight getUpdates HTTP request. That request
hangs for up to 30 seconds (the Telegram API timeout). If a new gateway
instance starts polling during that window, Telegram returns a 409 Conflict
error, causing message loss and requiring exponential backoff recovery.

This is especially problematic with service managers (launchd, systemd)
that restart the process immediately after SIGTERM.

Wire an AbortController into the fetch layer so every Telegram API request
(especially the long-polling getUpdates) aborts immediately on shutdown:

- bot.ts: Accept optional fetchAbortSignal in TelegramBotOptions; wrap
  the grammY fetch with AbortSignal.any() to merge the shutdown signal.
- monitor.ts: Create a per-iteration AbortController, pass its signal to
  createTelegramBot, and abort it from the SIGTERM handler, force-restart
  path, and finally block.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 10:58:04 -07:00
Peter Steinberger
1809b92f15 fix(skills): pin validated download roots 2026-03-09 10:58:04 -07:00
Peter Steinberger
4006e388e7 fix(node-host): bind bun and deno approval scripts 2026-03-09 10:58:04 -07:00
Peter Steinberger
b22d596e59 fix: detect launchd supervision via xpc service name (#20555) (thanks @dimat) 2026-03-09 10:58:04 -07:00
dimatu
38f192a29c fix(gateway): detect launchd supervision via XPC_SERVICE_NAME
On macOS, launchd sets XPC_SERVICE_NAME on managed processes but does
not set LAUNCH_JOB_LABEL or LAUNCH_JOB_NAME. Without checking
XPC_SERVICE_NAME, isLikelySupervisedProcess() returns false for
launchd-managed gateways, causing restartGatewayProcessWithFreshPid()
to fork a detached child instead of returning "supervised". The
detached child holds the gateway lock while launchd simultaneously
respawns the original process (KeepAlive=true), leading to an infinite
lock-timeout / restart loop.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 10:58:04 -07:00
merlin
95ed5781f6 fix: release gateway lock on restart failure + reply to Codex reviews
- Release gateway lock when in-process restart fails, so daemon
  restart/stop can still manage the process (Codex P2)
- P1 (env mismatch) already addressed: best-effort by design, documented
  in JSDoc
2026-03-09 10:58:04 -07:00
merlin
d275df82a2 fix: move config pre-flight before onNotLoaded in runServiceRestart (Codex P2)
The config check was positioned after onNotLoaded, which could send
SIGUSR1 to an unmanaged process before config was validated.
2026-03-09 10:58:04 -07:00
merlin
32f6501dbd fix: address bot review feedback on #35862
- Remove dead 'return false' in runServiceStart (Greptile)
- Include stack trace in run-loop crash guard error log (Greptile)
- Only catch startup errors on subsequent restarts, not initial start (Codex P1)
- Add JSDoc note about env var false positive edge case (Codex P1)
2026-03-09 10:58:03 -07:00
merlin
ce44b366db test: add runServiceStart config pre-flight tests (#35862)
Address Greptile review: add test coverage for runServiceStart path.
The error message copy-paste issue was already fixed in the DRY refactor
(uses params.serviceNoun instead of hardcoded 'restart').
2026-03-09 10:58:03 -07:00
merlin
2cb063b72e fix(gateway): catch startup failure in run loop to prevent process exit (#35862)
When an in-process restart (SIGUSR1) triggers a config-triggered restart
and the new config is invalid, params.start() throws and the while loop
exits, killing the process. On macOS this loses TCC permissions.

Wrap params.start() in try/catch: on failure, set server=null, log the
error, and wait for the next SIGUSR1 instead of crashing.
2026-03-09 10:58:03 -07:00
merlin
5d82c2cc89 fix(gateway): validate config before restart to prevent crash + macOS permission loss (#35862)
When 'openclaw gateway restart' is run with an invalid config, the new
process crashes on startup due to config validation failure. On macOS,
this causes Full Disk Access (TCC) permissions to be lost because the
respawned process has a different PID.

Add getConfigValidationError() helper and pre-flight config validation
in both runServiceRestart() and runServiceStart(). If config is invalid,
abort with a clear error message instead of crashing.

The config watcher's hot-reload path already had this guard
(handleInvalidSnapshot), but the CLI restart/start commands did not.

AI-assisted (OpenClaw agent, fully tested)
2026-03-09 10:58:03 -07:00
Peter Steinberger
78a1644a8a fix(msteams): enforce sender allowlists with route allowlists 2026-03-09 10:58:03 -07:00
Peter Steinberger
d6b26e22d5 test(cron): cover owner-only tool availability 2026-03-09 10:58:03 -07:00
Peter Steinberger
0d607942d5 fix(cron): restore owner-only tools for isolated runs 2026-03-09 10:58:03 -07:00
Peter Steinberger
cc919d3856 fix(browser): enforce redirect-hop SSRF checks 2026-03-09 10:58:03 -07:00
Peter Steinberger
8948ed8e33 fix: add changelog for restart timeout recovery (#40380) (thanks @dsantoreis) 2026-03-09 10:58:03 -07:00
DevMac
648213653d test(secrets): skip ACL-dependent runtime snapshot tests on windows 2026-03-09 10:58:03 -07:00
Daniel dos Santos Reis
cec7006abb fix(gateway): exit non-zero on restart shutdown timeout
When a config-change restart hits the force-exit timeout, exit with
code 1 instead of 0 so launchd/systemd treats it as a failure and
triggers a clean process restart. Stop-timeout stays at exit(0)
since graceful stops should not cause supervisor recovery.

Closes #36822
2026-03-09 10:58:03 -07:00
scoootscooob
2d8c8e7f26 fix(daemon): also enable LaunchAgent in repairLaunchAgentBootstrap
The repair/recovery path had the same missing `enable` guard as
`restartLaunchAgent`.  If launchd persists a "disabled" state after a
previous `bootout`, the `bootstrap` call in `repairLaunchAgentBootstrap`
fails silently, leaving the gateway unloaded in the recovery flow.

Add the same `enable` guard before `bootstrap` that was already applied
to `installLaunchAgent` and (in this PR) `restartLaunchAgent`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 10:58:03 -07:00
scoootscooob
e2c8c27c8c fix(daemon): enable LaunchAgent before bootstrap on restart
restartLaunchAgent was missing the launchctl enable call that
installLaunchAgent already performs. launchd can persist a "disabled"
state after bootout, causing bootstrap to silently fail and leaving the
gateway unloaded until a manual reinstall.

Fixes #39211

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 10:58:03 -07:00
Peter Steinberger
dad107630e test: fix windows secrets runtime ci 2026-03-09 10:58:03 -07:00
GazeKingNuWu
0d597ab800 fix: clear plugin discovery cache after plugin installation (openclaw#39752)
Verified:
- pnpm build
- pnpm check
- pnpm test:macmini

Co-authored-by: GazeKingNuWu <264914544+GazeKingNuWu@users.noreply.github.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-03-09 10:58:03 -07:00
Ayaan Zaidi
03bc43a503 Fix cron text announce delivery for Telegram targets (#40575)
Merged via squash.

Prepared head SHA: 54b1513c78
Co-authored-by: obviyus <22031114+obviyus@users.noreply.github.com>
Co-authored-by: obviyus <22031114+obviyus@users.noreply.github.com>
Reviewed-by: @obviyus
2026-03-09 10:58:03 -07:00
Bronko
8e506634b1 fix(matrix): restore robust DM routing without the memberCount heuristic (#19736)
* fix(matrix): remove memberCount heuristic from DM detection

The memberCount === 2 check in isDirectMessage() misclassifies 2-person
group rooms (admin channels, monitoring rooms) as DMs, routing them to
the main session instead of their room-specific session.

Matrix already distinguishes DMs from groups at the protocol level via
m.direct account data and is_direct member state flags. Both are already
checked by client.dms.isDm() and hasDirectFlag(). The memberCount
heuristic only adds false positives for 2-person groups.

Move resolveMemberCount() below the protocol-level checks so it is only
reached for rooms not matched by m.direct or is_direct. This narrows its
role to diagnostic logging for confirmed group rooms.

Refs: #19739

* fix(matrix): add conservative fallback for broken DM flags

Some homeservers (notably Continuwuity) have broken m.direct account
data or never set is_direct on invite events. With the memberCount
heuristic removed, these DMs are no longer detected.

Add a conservative fallback that requires two signals before classifying
as DM: memberCount === 2 AND no explicit m.room.name. Group rooms almost
always have explicit names; DMs almost never do.

Error handling distinguishes M_NOT_FOUND (missing state event, expected
for unnamed rooms) from network/auth errors. Non-404 errors fall through
to group classification rather than guessing.

This is independently revertable — removing this commit restores pure
protocol-based detection without any heuristic fallback.

* fix(matrix): add parentPeer for DM room binding support

Add parentPeer to DM routes so conversations are bindable by room ID
while preserving DM trust semantics (secure 1:1, no group restrictions).

Suggested by @KirillShchetinin.

* fix(matrix): override DM detection for explicitly configured rooms

Builds on @robertcorreiro's config-driven approach from #9106.

Move resolveMatrixRoomConfig() before the DM check. If a room matches
a non-wildcard config entry (matchSource === "direct") and was
classified as DM, override the classification to group. This gives users
a deterministic escape hatch for misclassified rooms.

Wildcards are excluded from the override to avoid breaking DM routing
when a "*" catch-all exists. roomConfig is gated behind isRoom so DMs
never inherit group settings (skills, systemPrompt, autoReply).

This commit is independently droppable if the scope is too broad.

* test(matrix): add DM detection and config override tests

- 15 unit tests for direct.ts: all detection paths, priority order,
  M_NOT_FOUND vs network error handling, edge cases (whitespace names,
  API failures)
- 8 unit tests for rooms.ts: matchSource classification, wildcard
  safety for DM override, direct match priority over wildcard

* Changelog: note matrix DM routing follow-up

* fix(matrix): preserve DM fallback and room bindings

---------

Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-03-09 10:58:03 -07:00
Ayaan Zaidi
27d2ac8460 fix: dedupe inbound Telegram DM replies per agent (#40519)
Merged via squash.

Prepared head SHA: 6e235e7d1f
Co-authored-by: obviyus <22031114+obviyus@users.noreply.github.com>
Co-authored-by: obviyus <22031114+obviyus@users.noreply.github.com>
Reviewed-by: @obviyus
2026-03-09 10:58:03 -07:00
Peter Steinberger
fbc8c19e52 build(protocol): sync generated swift models 2026-03-09 10:58:03 -07:00
Peter Steinberger
a7e5c7c18b fix(media): accept reader read result type 2026-03-09 10:58:03 -07:00
Peter Steinberger
def4b221d9 fix(agents): re-expose configured tools under restrictive profiles 2026-03-09 10:58:03 -07:00
Tak Hoffman
40542aba96 chore(acpx): move runtime test fixtures to test-utils (openclaw#40548)
Verified:
- pnpm install --frozen-lockfile
- pnpm build
- pnpm check
- pnpm test:macmini
2026-03-09 10:58:03 -07:00
Ayaan Zaidi
f34b0e6c42 test: fix android talk config contract fixture 2026-03-09 10:58:03 -07:00
Kyle
84064d08d3 fix(plugin-sdk): remove remaining bundled plugin src imports (openclaw#39638)
Verified:
- pnpm build
- pnpm check
- pnpm test:macmini

Co-authored-by: Kyle <3477429+kyledh@users.noreply.github.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-03-09 10:58:03 -07:00
Kesku
022923be15 alphabetize web search providers (#40259)
Merged via squash.

Prepared head SHA: be6350e5ae
Co-authored-by: kesku <62210496+kesku@users.noreply.github.com>
Co-authored-by: obviyus <22031114+obviyus@users.noreply.github.com>
Reviewed-by: @obviyus
2026-03-09 10:58:03 -07:00