mirror of
https://github.com/openclaw/openclaw.git
synced 2026-05-06 10:30:44 +00:00
Improve gateway diagnostics export for support reports (#70324)
Merged via squash.
Prepared head SHA: 3d6ee85993
Co-authored-by: gumadeiras <5599352+gumadeiras@users.noreply.github.com>
Co-authored-by: gumadeiras <5599352+gumadeiras@users.noreply.github.com>
Reviewed-by: @gumadeiras
Committed by: GitHub
Parent: 6b41ef311f
Commit: 28818f9140
@@ -1,4 +1,4 @@
81a8a7de5d4bf02cf3e697a641fe89844f98ed58d47890f12800181fde5a97b1 config-baseline.json
dab963eda8866b8bffd5c9032f92f0f6b08ed54dda837f1f5c513fca5d2c78e9 config-baseline.core.json
b05357fa162ba1f1d4ed192671b758d3905602678ff61148568840c6544d6222 config-baseline.json
a4e167f169db58d71c385a31fa2b980772f9fee963e70dd9553f63536cae5aed config-baseline.core.json
35d132fe176bd2bf9f0e46b29de91baba63ec4db3317cc5b294a982b46d16ba9 config-baseline.channel.json
71b5ff17041bc48a62300ad9f44fa8bb14d9dcd7f4c3549c0576d3059ce6ff36 config-baseline.plugin.json
3703c5345288adb9eee8cda3b592147cf4fed25a7782bed21ca83c88c3ca1cc0 config-baseline.plugin.json

@@ -111,6 +111,59 @@ Options:

- `--days <days>`: number of days to include (default `30`).

### `gateway stability`

Fetch the recent diagnostic stability recorder snapshot from a running Gateway.

```bash
openclaw gateway stability
openclaw gateway stability --type payload.large
openclaw gateway stability --bundle latest
openclaw gateway stability --bundle latest --export
openclaw gateway stability --json
```

Options:

- `--limit <limit>`: maximum number of recent events to include (default `25`, max `1000`).
- `--type <type>`: filter by diagnostic event type, such as `payload.large` or `diagnostic.memory.pressure`.
- `--since-seq <seq>`: include only events after a diagnostic sequence number.
- `--bundle [path]`: read a persisted stability bundle instead of calling the running Gateway. Use `--bundle latest` (or just `--bundle`) for the newest bundle under the state directory, or pass a bundle JSON path directly.
- `--export`: write a shareable support diagnostics zip instead of printing stability details.
- `--output <path>`: output path for `--export`.

Notes:

- The recorder is active by default. Set `diagnostics.enabled: false` only when you need to disable Gateway diagnostic heartbeat collection.
- Records keep operational metadata: event names, counts, byte sizes, memory readings, queue/session state, channel/plugin names, and redacted session summaries. They do not keep chat text, webhook bodies, tool outputs, raw request or response bodies, tokens, cookies, secret values, hostnames, or raw session ids.
- On fatal Gateway exits, shutdown timeouts, and restart startup failures, OpenClaw writes the same diagnostic snapshot to `~/.openclaw/logs/stability/openclaw-stability-*.json` when the recorder has events. Inspect the newest bundle with `openclaw gateway stability --bundle latest`; `--limit`, `--type`, and `--since-seq` also apply to bundle output.
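
How `--limit`, `--type`, and `--since-seq` might combine can be sketched in TypeScript. This is an illustrative sketch, not OpenClaw's implementation: the `StabilityEvent` shape, the `selectEvents` helper, and the choice to filter before applying the limit are all assumptions. Only the option names, the default of `25`, and the maximum of `1000` come from the option list above.

```typescript
// Hypothetical event shape; the real recorder's fields may differ.
interface StabilityEvent {
  seq: number; // diagnostic sequence number
  type: string; // e.g. "payload.large"
}

// Mirrors the documented CLI options (names are illustrative).
interface SelectOptions {
  limit?: number;
  type?: string;
  sinceSeq?: number;
}

const DEFAULT_LIMIT = 25; // documented default
const MAX_LIMIT = 1000; // documented maximum

function selectEvents(
  events: StabilityEvent[],
  opts: SelectOptions = {},
): StabilityEvent[] {
  // Clamp the requested limit into [1, MAX_LIMIT].
  const requested = opts.limit === undefined ? DEFAULT_LIMIT : opts.limit;
  const limit = Math.min(Math.max(Math.floor(requested), 1), MAX_LIMIT);

  // Assumption: filters are applied first, then the newest `limit` events are kept.
  const filtered = events.filter(
    (e) =>
      (opts.type === undefined || e.type === opts.type) &&
      (opts.sinceSeq === undefined || e.seq > opts.sinceSeq),
  );
  return filtered.slice(-limit);
}
```

Under these assumptions, `selectEvents(events, { type: "payload.large", sinceSeq: 10 })` would return at most 25 `payload.large` events with a sequence number greater than 10.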

### `gateway diagnostics export`

Write a local diagnostics zip designed to be attached to bug reports.

```bash
openclaw gateway diagnostics export
openclaw gateway diagnostics export --output openclaw-diagnostics.zip
openclaw gateway diagnostics export --json
```

Options:

- `--output <path>`: output zip path. Defaults to a support export under the state directory.
- `--log-lines <count>`: maximum sanitized log lines to include (default `5000`).
- `--log-bytes <bytes>`: maximum log bytes to inspect (default `1000000`).
- `--url <url>`: Gateway WebSocket URL for the health snapshot.
- `--token <token>`: Gateway token for the health snapshot.
- `--password <password>`: Gateway password for the health snapshot.
- `--timeout <ms>`: status/health snapshot timeout (default `3000`).
- `--no-stability-bundle`: skip persisted stability bundle lookup.
- `--json`: print the written path, size, and manifest as JSON.

The export contains a manifest, a Markdown summary, config shape, sanitized config details, sanitized log summaries, sanitized Gateway status/health snapshots, and the newest stability bundle when one exists.

It is meant to be shared. It keeps operational details that help debugging, such as safe OpenClaw log fields, subsystem names, status codes, durations, configured modes, ports, plugin ids, provider ids, non-secret feature settings, and redacted operational log messages. It omits or redacts chat text, webhook bodies, tool outputs, credentials, cookies, account/message identifiers, prompt/instruction text, hostnames, and secret values. When a LogTape-style message looks like user/chat/tool payload text, the export keeps only that a message was omitted plus its byte count.
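
The "omitted plus byte count" behavior could look roughly like the TypeScript sketch below. It is a sketch under stated assumptions, not the exporter's code: `SanitizedMessage`, `sanitizeMessage`, `utf8ByteLength`, and the `looksLikePayload` callback are hypothetical names, and counting bytes as UTF-8 is an assumption.

```typescript
// Hypothetical result shape for one sanitized log message.
type SanitizedMessage =
  | { kind: "kept"; text: string }
  | { kind: "omitted"; bytes: number };

// UTF-8 byte length computed without platform APIs (assumes UTF-8 counting).
function utf8ByteLength(text: string): number {
  let bytes = 0;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code < 0x80) {
      bytes += 1;
    } else if (code < 0x800) {
      bytes += 2;
    } else if (code >= 0xd800 && code <= 0xdbff && i + 1 < text.length) {
      const next = text.charCodeAt(i + 1);
      if (next >= 0xdc00 && next <= 0xdfff) {
        bytes += 4; // valid surrogate pair encodes one 4-byte code point
        i++;
      } else {
        bytes += 3; // lone surrogate, counted like a 3-byte replacement
      }
    } else {
      bytes += 3;
    }
  }
  return bytes;
}

// `looksLikePayload` stands in for whatever heuristic the exporter applies.
function sanitizeMessage(
  text: string,
  looksLikePayload: (t: string) => boolean,
): SanitizedMessage {
  return looksLikePayload(text)
    ? { kind: "omitted", bytes: utf8ByteLength(text) } // text itself is dropped
    : { kind: "kept", text };
}
```

The point of the design is that the omitted branch carries no part of the original text, only its size, so the record stays useful for spotting oversized messages without exposing their content.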

### `gateway status`

`gateway status` shows the Gateway service (launchd/systemd/schtasks) plus an optional probe of connectivity/auth capability.

@@ -26,6 +26,8 @@ Short guide to verify channel connectivity without guessing.
- Creds on disk: `ls -l ~/.openclaw/credentials/whatsapp/<accountId>/creds.json` (mtime should be recent).
- Session store: `ls -l ~/.openclaw/agents/<agentId>/sessions/sessions.json` (path can be overridden in config). Count and recent recipients are surfaced via `status`.
- Relink flow: `openclaw channels logout && openclaw channels login --verbose` when status codes 409–515 or `loggedOut` appear in logs. (Note: the QR login flow auto-restarts once for status 515 after pairing.)
- Diagnostics are enabled by default. The gateway records operational facts unless `diagnostics.enabled: false` is set. Memory events record RSS/heap byte counts, threshold pressure, and growth pressure. Oversized-payload events record what was rejected, truncated, or chunked, plus sizes and limits when available. They do not record the message text, attachment contents, webhook body, raw request or response body, tokens, cookies, or secret values. The same heartbeat starts the bounded stability recorder, which is available through `openclaw gateway stability` or the `diagnostics.stability` Gateway RPC. Fatal Gateway exits, shutdown timeouts, and restart startup failures persist the latest recorder snapshot under `~/.openclaw/logs/stability/` when events exist; inspect the newest saved bundle with `openclaw gateway stability --bundle latest`.
- For bug reports, run `openclaw gateway diagnostics export` and attach the generated zip. The export combines a Markdown summary, the newest stability bundle, sanitized log metadata, sanitized Gateway status/health snapshots, and config shape. It is meant to be shared: chat text, webhook bodies, tool outputs, credentials, cookies, account/message identifiers, and secret values are omitted or redacted.
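
One plausible reading of "threshold pressure" and "growth pressure" in the diagnostics bullet above is sketched here in TypeScript. Everything in it is hypothetical: the `MemorySample` and `MemoryPressure` shapes, the `classifyMemory` helper, and both limits are illustrative assumptions, and the gateway's actual rules may differ.

```typescript
// Hypothetical sample of process memory, in bytes.
interface MemorySample {
  rssBytes: number;
  heapUsedBytes: number;
}

// Hypothetical classification result.
interface MemoryPressure {
  threshold: boolean; // absolute RSS at or above a fixed limit
  growth: boolean; // RSS grew by at least a limit since the previous sample
}

function classifyMemory(
  previous: MemorySample | null,
  current: MemorySample,
  rssThresholdBytes: number, // assumed absolute limit
  maxGrowthBytes: number, // assumed per-interval growth limit
): MemoryPressure {
  return {
    threshold: current.rssBytes >= rssThresholdBytes,
    growth:
      previous !== null &&
      current.rssBytes - previous.rssBytes >= maxGrowthBytes,
  };
}
```

Note that such an event carries only byte counts and flags, which matches the statement that diagnostics record sizes and limits rather than message content.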

## Health monitor config

@@ -18,6 +18,13 @@ handshake time.

- WebSocket, text frames with JSON payloads.
- First frame **must** be a `connect` request.
- Pre-connect frames are capped at 64 KiB. After a successful handshake, clients
  should follow the `hello-ok.policy.maxPayload` and
  `hello-ok.policy.maxBufferedBytes` limits. With diagnostics enabled,
  oversized inbound frames and slow outbound buffers emit `payload.large` events
  before the gateway closes or drops the affected frame. These events keep
  sizes, limits, surfaces, and safe reason codes. They do not keep the message
  body, attachment contents, raw frame body, tokens, cookies, or secret values.
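
The size checks described above can be sketched in TypeScript as follows. This is a minimal sketch, not the gateway's code: the `Policy` and `PayloadLargeEvent` shapes, the helper names, the surface labels, and the reason codes are hypothetical. Only the 64 KiB pre-connect cap, the `maxPayload` and `maxBufferedBytes` policy fields, and the `payload.large` event type come from the text.

```typescript
const PRE_CONNECT_MAX_BYTES = 64 * 1024; // 64 KiB cap before the handshake

// Hypothetical subset of `hello-ok.policy`.
interface Policy {
  maxPayload: number;
  maxBufferedBytes: number;
}

// Hypothetical event shape: sizes, limits, surface, reason. Never the frame body.
interface PayloadLargeEvent {
  type: "payload.large";
  surface: "ws.inbound" | "ws.outbound";
  bytes: number;
  limit: number;
  reason: string;
}

// `policy` is null until the connect handshake has succeeded.
function checkInboundFrame(
  frameBytes: number,
  policy: Policy | null,
): PayloadLargeEvent | null {
  const limit = policy === null ? PRE_CONNECT_MAX_BYTES : policy.maxPayload;
  if (frameBytes <= limit) return null;
  return {
    type: "payload.large",
    surface: "ws.inbound",
    bytes: frameBytes,
    limit,
    reason: policy === null ? "preconnect_too_large" : "max_payload_exceeded",
  };
}

function checkOutboundBuffer(
  bufferedBytes: number,
  policy: Policy,
): PayloadLargeEvent | null {
  if (bufferedBytes <= policy.maxBufferedBytes) return null;
  return {
    type: "payload.large",
    surface: "ws.outbound",
    bytes: bufferedBytes,
    limit: policy.maxBufferedBytes,
    reason: "max_buffered_exceeded",
  };
}
```

A caller would record the returned event first and only then close the connection or drop the frame, matching the ordering the text describes.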

## Handshake (connect)

@@ -265,6 +272,12 @@ implemented in `src/gateway/server-methods/*.ts`.

### System and identity

- `health` returns the cached or freshly probed gateway health snapshot.
- `diagnostics.stability` returns the recent bounded diagnostic stability
  recorder snapshot. It keeps operational metadata such as event names, counts,
  byte sizes, memory readings, queue/session state, channel/plugin names, and
  session ids. It does not keep chat text, webhook bodies, tool outputs, raw
  request or response bodies, tokens, cookies, or secret values. Operator read
  scope is required.
- `status` returns the `/status`-style gateway summary; sensitive fields are
  included only for admin-scoped operator clients.
- `gateway.identity.get` returns the gateway device identity used by relay and

@@ -329,6 +329,20 @@ Think of the suites as “increasing realism” (and increasing flakiness/cost):

- `pnpm test:perf:profile:main` writes a main-thread CPU profile for Vitest/Vite startup and transform overhead.
- `pnpm test:perf:profile:runner` writes runner CPU+heap profiles for the unit suite with file parallelism disabled.

### Stability (gateway)

- Command: `pnpm test:stability:gateway`
- Config: `vitest.gateway.config.ts`, forced to one worker
- Scope:
  - Starts a real loopback Gateway with diagnostics enabled by default
  - Drives synthetic gateway message, memory, and large-payload churn through the diagnostic event path
  - Queries `diagnostics.stability` over the Gateway WS RPC
  - Covers diagnostic stability bundle persistence helpers
  - Asserts the recorder remains bounded, synthetic RSS samples stay under the pressure budget, and per-session queue depths drain back to zero
- Expectations:
  - CI-safe and keyless
  - Narrow lane for stability-regression follow-up, not a substitute for the full Gateway suite
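
The "recorder remains bounded" property in the scope list can be illustrated with a small TypeScript sketch. It is not the Gateway's recorder or its Vitest suite: `BoundedRecorder`, its capacity argument, and the eviction policy (drop the oldest event) are assumptions made for illustration.

```typescript
// Hypothetical fixed-capacity recorder that evicts the oldest event when full.
class BoundedRecorder<T> {
  private readonly events: T[] = [];
  private nextSeq = 1;
  private readonly capacity: number;

  constructor(capacity: number) {
    if (Math.floor(capacity) !== capacity || capacity < 1) {
      throw new RangeError("capacity must be a positive integer");
    }
    this.capacity = capacity;
  }

  // Stores the event and returns its sequence number.
  record(event: T): number {
    if (this.events.length === this.capacity) {
      this.events.shift(); // evict the oldest entry to stay within capacity
    }
    this.events.push(event);
    return this.nextSeq++;
  }

  get size(): number {
    return this.events.length;
  }

  // Returns a copy so callers cannot mutate recorder state.
  snapshot(): T[] {
    return this.events.slice();
  }
}

// Synthetic churn: record far more events than the capacity allows.
const recorder = new BoundedRecorder<string>(100);
for (let i = 0; i < 10000; i++) {
  recorder.record("payload.large");
}
if (recorder.size > 100) {
  throw new Error("recorder grew past its bound");
}
```

A stability test in this style pushes much more synthetic traffic than the buffer can hold and then checks that the retained count never exceeds the configured bound.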

### E2E (gateway smoke)

- Command: `pnpm test:e2e`