openclaw/docs/gateway/gateway-lock.md
Peter Steinberger c6ee68b751 Reapply "refactor: move runtime state to SQLite"
This reverts commit 694ca50e97.
2026-05-28 00:46:31 +01:00


summary: Gateway singleton guard using SQLite leases and the WebSocket listener bind
read_when:
  • Running or debugging the gateway process
  • Investigating single-instance enforcement
title: Gateway lock

Why

  • Ensure only one gateway instance runs per base port on the same host; additional gateways must use isolated profiles and unique ports.
  • Survive crashes/SIGKILL without leaving stale singleton state.
  • Fail fast with a clear error when the control port is already occupied.

Mechanism

  • The gateway first acquires a per-config SQLite lease in state_leases under scope gateway_locks and probes the configured port for an existing listener.
  • If the recorded lease owner is gone, the port is free, or the lease is stale, startup reclaims the lease and continues.
  • The gateway then binds the HTTP/WebSocket listener (default ws://127.0.0.1:18789) using an exclusive TCP listener.
  • If the bind fails with EADDRINUSE, startup throws GatewayLockError("another gateway instance is already listening on ws://127.0.0.1:<port>").
  • On shutdown the gateway closes the HTTP/WebSocket server and releases the SQLite lease.
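The lease steps above can be sketched roughly as follows. This is an illustrative Python sketch, not the gateway's code: only the table name state_leases and the scope gateway_locks come from the text. The column layout, the TTL-based definition of "stale", and the helper names (open_state, pid_alive, port_has_listener, try_acquire, release) are assumptions made for the example.

```python
import os
import socket
import sqlite3
import time

SCOPE = "gateway_locks"
LEASE_TTL_SECONDS = 30.0  # assumption: the lease duration is not given in the text


def open_state(path=":memory:"):
    """Open a database with a hypothetical state_leases layout."""
    # isolation_level=None leaves transaction control to the explicit BEGIN/COMMIT below.
    db = sqlite3.connect(path, isolation_level=None)
    db.execute(
        "CREATE TABLE IF NOT EXISTS state_leases ("
        "scope TEXT NOT NULL, key TEXT NOT NULL, "
        "owner_pid INTEGER NOT NULL, expires_at REAL NOT NULL, "
        "PRIMARY KEY (scope, key))"
    )
    return db


def pid_alive(pid):
    """Best-effort liveness check. POSIX only: signal 0 probes without signalling."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def port_has_listener(port, host="127.0.0.1", timeout=0.25):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def try_acquire(db, config_key, pid, port_in_use, now=None, is_alive=pid_alive):
    """Take the lease unless a live owner with a fresh lease still holds the port."""
    now = time.time() if now is None else now
    db.execute("BEGIN IMMEDIATE")  # take the write lock before reading the row
    try:
        row = db.execute(
            "SELECT owner_pid, expires_at FROM state_leases WHERE scope = ? AND key = ?",
            (SCOPE, config_key),
        ).fetchone()
        # Reclaim when the owner is gone, the port is free, or the lease is stale.
        held = row is not None and is_alive(row[0]) and port_in_use and row[1] > now
        if not held:
            db.execute(
                "INSERT OR REPLACE INTO state_leases (scope, key, owner_pid, expires_at) "
                "VALUES (?, ?, ?, ?)",
                (SCOPE, config_key, pid, now + LEASE_TTL_SECONDS),
            )
        db.execute("COMMIT")
    except Exception:
        db.execute("ROLLBACK")
        raise
    return not held


def release(db, config_key, pid):
    """Drop the lease on shutdown, but only if this process still owns it."""
    db.execute(
        "DELETE FROM state_leases WHERE scope = ? AND key = ? AND owner_pid = ?",
        (SCOPE, config_key, pid),
    )
```

A caller would pass something like try_acquire(db, config_key, os.getpid(), port_has_listener(18789)) and call release on shutdown. Because the lease alone can be reclaimed after a crash, the listener bind remains the final arbiter of which process owns the port.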

Error surface

  • If another process holds the port, startup throws GatewayLockError("another gateway instance is already listening on ws://127.0.0.1:<port>").
  • Other bind failures surface as GatewayLockError("failed to bind gateway socket on ws://127.0.0.1:<port>: …").
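As a rough illustration of the two error shapes above, the following Python sketch maps a failed bind onto the quoted messages. GatewayLockError here is a stand-in class and bind_gateway_socket is a hypothetical helper; only the message texts are taken from this page.

```python
import errno
import socket


class GatewayLockError(RuntimeError):
    """Stand-in for the gateway's error type; only the messages come from the text."""


def bind_gateway_socket(port, host="127.0.0.1"):
    """Bind a listening TCP socket, translating bind failures into GatewayLockError."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # SO_REUSEADDR is deliberately not set, so an active listener on the
        # same address makes bind() fail instead of sharing the port.
        sock.bind((host, port))
        sock.listen()
    except OSError as exc:
        sock.close()
        url = f"ws://{host}:{port}"
        if exc.errno == errno.EADDRINUSE:
            raise GatewayLockError(
                f"another gateway instance is already listening on {url}"
            ) from exc
        raise GatewayLockError(f"failed to bind gateway socket on {url}: {exc}") from exc
    return sock
```

Keeping the original OSError as the cause (raise ... from exc) preserves the underlying errno for debugging while giving callers a single error type to catch.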

Operational notes

  • If a non-gateway process occupies the port, startup reports the same error; free the port or choose another with openclaw gateway --port <port>.
  • Under a service supervisor, a new gateway process that sees an existing healthy /healthz responder leaves that process in control. On systemd, the duplicate starter exits with code 78 so the default RestartPreventExitStatus=78 stops Restart=always from looping on a lock or EADDRINUSE conflict. If the existing process never becomes healthy, retries are bounded and startup fails with a clear lock error instead of looping forever.
  • The macOS app still maintains its own lightweight PID guard before spawning the gateway; the runtime singleton guard is enforced by the SQLite lease plus HTTP/WebSocket bind.
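The supervisor behaviour described above (defer to a healthy /healthz responder, otherwise retry a bounded number of times and then fail) could look roughly like this. It is a sketch under assumptions: treating HTTP 200 as "healthy", the attempt count, the delay, and the function names healthz_ok and resolve_conflict are all illustrative. Only the /healthz path and exit code 78 come from the text.

```python
import http.client
import time
import urllib.request

EXIT_DUPLICATE = 78  # exit code named above; pairs with RestartPreventExitStatus=78


def healthz_ok(port, host="127.0.0.1", timeout=1.0):
    """Return True if GET /healthz answers with HTTP 200 (assumed meaning of healthy)."""
    try:
        url = f"http://{host}:{port}/healthz"
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        return False


def resolve_conflict(is_healthy, max_attempts=5, delay_seconds=0.5, sleep=time.sleep):
    """Return "defer" once an existing gateway is healthy, else "fail" after bounded retries."""
    for attempt in range(max_attempts):
        if is_healthy():
            return "defer"
        if attempt + 1 < max_attempts:
            sleep(delay_seconds)  # wait before the next probe, but not after the last
    return "fail"
```

A duplicate starter might then call resolve_conflict(lambda: healthz_ok(18789)), exit with EXIT_DUPLICATE on "defer", and raise the lock error on "fail". Injecting is_healthy and sleep keeps the retry loop testable without a running gateway.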