Bug 1 (high): replace fixed sleep 1 with caller-PID polling in both
kickstart and start-after-exit handoff modes. The helper now waits until
kill -0 $caller_pid fails before issuing launchctl kickstart -k.
Bug 2 (medium): gate enable+bootstrap fallback on isLaunchctlNotLoaded().
Only attempt re-registration when kickstart -k fails because the job is
absent; all other kickstart failures now re-throw the original error.
Follows up on 3c0fd3dffe.
Fixes#43311, #43406, #43035, #43049
On macOS, launchctl bootout permanently unloads the LaunchAgent plist.
Even with KeepAlive: true, launchd cannot respawn a service whose plist
has been removed from its registry. This left users with a dead gateway
requiring manual 'openclaw gateway install' to recover.
Affected trigger paths:
- openclaw gateway restart from an agent session (#43311)
- SIGTERM on config reload (#43406)
- Gateway self-restart via SIGTERM (#43035)
- Hot reload on channel config change (#43049)
Switch restartLaunchAgent() to launchctl kickstart -k, which force-kills
and restarts the service without unloading the plist. When the restart
originates from inside the launchd-managed process tree, delegate to a
new detached handoff helper (launchd-restart-handoff.ts) to avoid the
caller being killed mid-command. Self-restart paths in process-respawn.ts
now schedule the detached start-after-exit handoff before exiting instead
of relying on exit/KeepAlive timing.
Fixes#43311, #43406, #43035, #43049
The repair/recovery path had the same missing `enable` guard as
`restartLaunchAgent`. If launchd persists a "disabled" state after a
previous `bootout`, the `bootstrap` call in `repairLaunchAgentBootstrap`
fails silently, leaving the gateway unloaded in the recovery flow.
Add the same `enable` guard before `bootstrap` that was already applied
to `installLaunchAgent` and (in this PR) `restartLaunchAgent`.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
restartLaunchAgent was missing the launchctl enable call that
installLaunchAgent already performs. launchd can persist a "disabled"
state after bootout, causing bootstrap to silently fail and leaving the
gateway unloaded until a manual reinstall.
Fixes#39211
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* daemon(systemd): fall back to machine user scope when user bus is missing
* test(systemd): cover machine scope fallback for user-bus errors
* test(systemd): reset execFile mock state across cases
* test(systemd): make machine-user fallback assertion portable
* fix(daemon): keep root sudo path on direct user scope
* test(systemd): cover sudo root user-scope behavior
* ci: use resolvable bun version in setup-node-env
* daemon(systemd): target sudo caller user scope
* test(systemd): cover sudo user scope commands
* infra(ports): fall back to ss when lsof missing
* test(ports): verify ss fallback listener detection
* cli(gateway): use probe fallback for restart health
* test(gateway): cover restart-health probe fallback
Address review feedback: versioned Homebrew formulas (node@22, node@20)
use keg-only paths where the stable symlink is at <prefix>/opt/<formula>/bin/node,
not <prefix>/bin/node. Updated resolveStableNodePath to:
1. Try <prefix>/opt/<formula>/bin/node first (works for both default + versioned)
2. Fall back to <prefix>/bin/node for the default "node" formula
3. Return the original Cellar path if neither stable path exists
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When `openclaw gateway install` runs under Homebrew Node, `process.execPath`
resolves to the versioned Cellar path (e.g. /opt/homebrew/Cellar/node/25.7.0/bin/node).
This path breaks when Homebrew upgrades Node, silently killing the gateway daemon.
Resolve Cellar paths to the stable Homebrew symlink (/opt/homebrew/bin/node)
which Homebrew updates automatically during upgrades.
Closes#32182
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
LaunchAgent plist hardcodes ThrottleInterval to 60 in src/daemon/launchd-plist.ts
That means every restart/install path that terminates the launchd-managed gateway gets delayed by launchd’s one-minute relaunch throttle. The CLI restart path in src/daemon/launchd.ts is doing the expected supervisor actions, but the plist policy makes those actions look hung.
In src/daemon/launchd-plist.ts:
- added LAUNCH_AGENT_THROTTLE_INTERVAL_SECONDS
- reduced the LaunchAgent ThrottleInterval from 60 to 1