mirror of
https://github.com/openclaw/openclaw.git
synced 2026-05-06 11:10:45 +00:00
feat: add gateway stall diagnostics
@@ -117,12 +117,19 @@ diagnostics are enabled. It is for operational facts, not content.
 The same diagnostic heartbeat records liveness samples when the Gateway keeps
 running but the Node.js event loop or CPU looks saturated. These
 `diagnostic.liveness.warning` events include event-loop delay, event-loop
-utilization, CPU-core ratio, and active/waiting/queued session counts. Idle
-samples stay in telemetry at `info` level. Liveness samples become Gateway
-warnings only when work is waiting or queued, or when active work overlaps with
-sustained event-loop delay. Transient max-delay spikes during otherwise healthy
-background work stay in debug logs. They do not restart the Gateway by
-themselves.
+utilization, CPU-core ratio, active/waiting/queued session counts, the current
+startup/runtime phase when known, recent phase spans, and bounded active/queued
+work labels. Idle samples stay in telemetry at `info` level. Liveness samples
+become Gateway warnings only when work is waiting or queued, or when active work
+overlaps with sustained event-loop delay. Transient max-delay spikes during
+otherwise healthy background work stay in debug logs. They do not restart the
+Gateway by themselves.

+Startup phases also emit `diagnostic.phase.completed` events with wall-clock and
+CPU timing. Stalled embedded-run diagnostics mark `terminalProgressStale=true`
+when the last bridge progress looked terminal, such as a raw response item or
+response completion event, but the Gateway still considers the embedded run
+active.
+
 Inspect the live recorder:

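The level rules described in the liveness paragraph of the hunk above can be read as a small decision function. The TypeScript below is a minimal sketch of that reading, not the Gateway's implementation: the `LivenessSample` shape, the `sustainedEventLoopDelay` flag, and the `classifyLivenessSample` name are hypothetical and exist only for illustration.

```typescript
// Hypothetical sample shape; field names are assumptions, not the Gateway's schema.
interface LivenessSample {
  activeSessions: number;
  waitingSessions: number;
  queuedSessions: number;
  // Assumed flag: true when event-loop delay stayed high across the sample
  // window, as opposed to a single transient max-delay spike.
  sustainedEventLoopDelay: boolean;
}

type LivenessLevel = "info" | "debug" | "warning";

function classifyLivenessSample(sample: LivenessSample): LivenessLevel {
  const hasBacklog = sample.waitingSessions > 0 || sample.queuedSessions > 0;
  const hasActiveWork = sample.activeSessions > 0;

  // Work is waiting or queued: surface the sample as a Gateway warning.
  if (hasBacklog) return "warning";

  // Active work that overlaps with sustained event-loop delay is also a warning.
  if (hasActiveWork && sample.sustainedEventLoopDelay) return "warning";

  // Nothing active, waiting, or queued: an idle sample, kept at info level.
  if (!hasActiveWork) return "info";

  // Active work with only a transient spike stays in debug logs.
  return "debug";
}
```

In this sketch no branch triggers a restart, matching the statement that liveness samples do not restart the Gateway by themselves.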
@@ -89,6 +89,13 @@ OPENCLAW_RUN_NODE_CPU_PROF_DIR=.artifacts/cli-cpu pnpm openclaw status
 The source runner adds Node CPU profile flags and writes a `.cpuprofile` for the
 command. Use this before adding temporary instrumentation to command code.

+For startup stalls that look like synchronous filesystem or module-loader work,
+add Node's sync I/O trace flag through the source runner:
+
+```bash
+OPENCLAW_TRACE_SYNC_IO=1 pnpm openclaw gateway --force
+```
+
 ## Gateway watch mode

 For fast iteration, run the gateway under the file watcher:
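One way a source runner could translate these environment variables into Node.js flags is sketched below in TypeScript. `--trace-sync-io`, `--cpu-prof`, and `--cpu-prof-dir` are real Node.js CLI flags. The `buildNodeFlags` helper, and the assumption that the value `1` enables tracing, are hypothetical and are not taken from the repository's runner code.

```typescript
// Hypothetical helper: maps runner environment variables to Node.js CLI flags.
// The variable names come from the docs in this diff; the parsing rules are assumed.
function buildNodeFlags(env: Record<string, string | undefined>): string[] {
  const flags: string[] = [];

  // Assumption: the value "1" turns on Node's synchronous I/O tracing.
  if (env.OPENCLAW_TRACE_SYNC_IO === "1") {
    flags.push("--trace-sync-io");
  }

  // Assumption: a non-empty directory enables CPU profiling into that directory.
  const cpuProfDir = env.OPENCLAW_RUN_NODE_CPU_PROF_DIR;
  if (cpuProfDir) {
    flags.push("--cpu-prof", `--cpu-prof-dir=${cpuProfDir}`);
  }

  return flags;
}
```

A runner built this way would call `buildNodeFlags(process.env)` and pass the result ahead of the script path when spawning Node.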