mirror of
https://github.com/openclaw/openclaw.git
synced 2026-05-06 09:50:42 +00:00
fix: clean up startup readiness PR docs
This commit is contained in:
@@ -68,7 +68,7 @@ Docs: https://docs.openclaw.ai
|
||||
- CLI/update: make package-update follow-up processes write completion results and exit explicitly, so Windows packaged upgrades do not hang after the new package finishes post-core plugin work. Thanks @vincentkoc.
|
||||
- Release validation: skip Slack live QA unless Slack credentials are explicitly configured, so release gates can keep proving non-Slack surfaces while Slack is still local and credential-gated. Thanks @vincentkoc.
|
||||
- Plugins/update: treat OpenClaw CalVer correction versions like `2026.5.3-1` as satisfying base plugin API ranges, so correction builds can install plugins that require the base runtime API. Fixes #77293. (#77450) Thanks @p3nchan.
|
||||
- Discord/reply-safety: preserve component-only channel payloads (for example `channelData.discord.components` and presentation/interactive-only replies) when final text scrubbing removes all text, so Discord still delivers non-text outbound payloads instead of dropping them. (#77478) Thanks @NikolaFC.
|
||||
- Discord/Gateway startup: retry Discord READY waits with backoff, defer startup `sessions.list` and native approval readiness failures until sidecars recover, and preserve component-only Discord payloads when final reply scrubbing removes all text. (#77478) Thanks @NikolaFC.
|
||||
- fix(gateway): clamp unbound websocket auth scopes [AI]. (#77413) Thanks @pgondhi987.
|
||||
- Gate zalouser startup name matching [AI]. (#77411) Thanks @pgondhi987.
|
||||
- Active Memory: send a bounded latest-message search query to the recall worker so channel/runtime metadata does not become the memory search string. Fixes #65309. Thanks @joeykrug, @westley3601, @pimenov, and @tasi333.
|
||||
|
||||
@@ -1,93 +0,0 @@
|
||||
# OpenClaw Startup Readiness And Leak Fix - 2026-05-04
|
||||
|
||||
## Current Truth
|
||||
|
||||
- Incident inputs confirmed Discord front-channel leakage of internal execution/commentary-like traces and Gateway startup instability in the same window.
|
||||
- Observed bad startup keywords from local operator evidence:
|
||||
- `gateway event loop readiness timeout`
|
||||
- `discord: gateway was not ready after 15000ms; restarting gateway`
|
||||
- `sessions.list` requests around 40 seconds
|
||||
- `exit 78` with systemd `RestartPreventExitStatus=78`
|
||||
- This source fix addresses the startup terminal-fail path and Discord final outbound leakage guard. It does not restart any running Gateway by itself.
|
||||
|
||||
## Code Changes
|
||||
|
||||
- Startup control-plane load shedding:
|
||||
- Added `sessions.list` to `STARTUP_UNAVAILABLE_GATEWAY_METHODS`.
|
||||
- During sidecar startup, Gateway now returns retryable startup `UNAVAILABLE` for `sessions.list` instead of dispatching the costly session scan path.
|
||||
- Native approval bootstrap readiness handling:
|
||||
- Changed approval-client readiness failure text away from the production incident keyword.
|
||||
- Changed exec-approval runtime readiness failure text away from the production incident keyword.
|
||||
- Classified gateway readiness/startup close errors as retryable bootstrap deferrals.
|
||||
- Normalized legacy readiness-timeout errors before logging retry deferrals, so old incident keywords do not reappear in native-approval retry logs.
|
||||
- Native approval handler startup now warns and retries instead of emitting the old terminal-looking `failed to start native approval handler` path for readiness-only failures.
|
||||
- Discord gateway READY wait:
|
||||
- Replaced the one-restart-then-throw startup behavior with reconnect plus 2 second backoff until READY, stop, or abort.
|
||||
- Removed the old log string `gateway was not ready after 15000ms; restarting gateway` from the nonfatal retry path.
|
||||
- Discord final outbound safety filter:
|
||||
- Added `extensions/discord/src/monitor/reply-safety.ts`.
|
||||
- `deliverDiscordReply` sanitizes payload text at the final Discord send boundary.
|
||||
- The filter uses the existing assistant-visible-text sanitizer, strips standalone internal trace/channel lines outside code fences, drops pure-internal text-only payloads, and preserves media-only payloads.
|
||||
|
||||
## Why This Should Work
|
||||
|
||||
- The startup window no longer allows Control UI `sessions.list` polling to compete with sidecar/channel readiness through the expensive session listing path.
|
||||
- Discord READY timeout no longer escalates a transient event-loop stall into a thrown startup failure after a single reconnect attempt.
|
||||
- Approval handler readiness failures are treated as recoverable gateway-readiness deferrals, matching the actual failure mode from the incident.
|
||||
- Leakage protection is placed at the last Discord send boundary, so upstream mistakes in agent output assembly, commentary routing, or tool-call formatting get one final scrub before front-channel delivery.
|
||||
|
||||
## Modified Files
|
||||
|
||||
- `src/gateway/server-startup-unavailable-methods.ts`
|
||||
- `src/gateway/operator-approvals-client.ts`
|
||||
- `src/infra/approval-handler-bootstrap.ts`
|
||||
- `src/infra/approval-handler-bootstrap.test.ts`
|
||||
- `src/infra/exec-approval-channel-runtime.ts`
|
||||
- `src/infra/exec-approval-channel-runtime.test.ts`
|
||||
- `extensions/discord/src/monitor/provider.lifecycle.ts`
|
||||
- `extensions/discord/src/monitor/provider.lifecycle.test.ts`
|
||||
- `extensions/discord/src/monitor/reply-delivery.ts`
|
||||
- `extensions/discord/src/monitor/reply-delivery.test.ts`
|
||||
- `extensions/discord/src/monitor/reply-safety.ts`
|
||||
- `docs/status/openclaw-startup-readiness-and-leak-fix-20260504.md`
|
||||
|
||||
## Validation
|
||||
|
||||
- `node scripts/run-vitest.mjs run --config test/vitest/vitest.extension-discord.config.ts extensions/discord/src/monitor/provider.lifecycle.test.ts extensions/discord/src/monitor/reply-delivery.test.ts`
|
||||
- Passed: 2 files, 28 tests.
|
||||
- `OPENCLAW_GATEWAY_PROJECT_SHARDS=1 node scripts/run-vitest.mjs run --config test/vitest/vitest.gateway.config.ts src/gateway/server-methods.control-plane-rate-limit.test.ts`
|
||||
- Passed: 1 file, 12 tests.
|
||||
- `node scripts/run-vitest.mjs run --config test/vitest/vitest.infra.config.ts src/infra/approval-handler-bootstrap.test.ts src/infra/exec-approval-channel-runtime.test.ts`
|
||||
- Passed: 2 files, 30 tests.
|
||||
- `git diff --check`
|
||||
- Passed.
|
||||
|
||||
## Acceptance Log Keywords
|
||||
|
||||
- Must stay absent during the 30-60 minute post-deploy startup soak:
|
||||
- `gateway event loop readiness timeout`
|
||||
- `discord: gateway was not ready after 15000ms; restarting gateway`
|
||||
- `discord gateway did not reach READY within 15000ms after restart`
|
||||
- `sessions.list` with 40 second scale durations
|
||||
- `exit 78`
|
||||
- Expected nonterminal readiness retry keyword if Discord is slow to become READY:
|
||||
- `discord: gateway READY wait timed out after 15000ms; reconnecting with backoff`
|
||||
- Expected approval bootstrap deferral keyword if Gateway is still starting:
|
||||
- `native approval handler deferred until gateway readiness recovers`
|
||||
|
||||
## Risks
|
||||
|
||||
- `sessions.list` is temporarily unavailable during startup until sidecars clear startup gating. Control UI must retry retryable `UNAVAILABLE` responses.
|
||||
- The Discord READY wait can keep reconnecting until stop/abort. If credentials or network are truly broken, operator-visible status remains `startup-not-ready` instead of crashing the Gateway.
|
||||
- The final outbound scrub intentionally removes standalone internal trace lines. A user-visible reply that literally begins with `analysis:`, `commentary:`, or tool execution labels outside a code fence will be stripped from Discord text. Code-fenced examples are preserved.
|
||||
|
||||
## Rollback
|
||||
|
||||
- Source rollback: `git revert <commit-hash>` from this repo.
|
||||
- If already deployed, rebuild/reinstall the reverted source using the normal OpenClaw packaging path, then restart the Gateway using the operator's configured service manager.
|
||||
|
||||
## Next Action
|
||||
|
||||
- Deploy this source build to an isolated or production-managed OpenClaw path.
|
||||
- Run a 30-60 minute startup soak with Control UI open and Discord connected.
|
||||
- During the soak, watch `/tmp/openclaw/openclaw-2026-05-04.log` or the active daily log for the acceptance keywords above.
|
||||
Reference in New Issue
Block a user