fix: bypass update restart cooldown

Peter Steinberger
2026-05-01 09:54:57 +01:00
parent 9e01d19db3
commit 88da533714
8 changed files with 46 additions and 12 deletions

@@ -55,7 +55,7 @@ Docs: https://docs.openclaw.ai
- Plugins/runtime-deps: recover interrupted bundled runtime-dependency installs whose package sentinels exist but generated materialization is incomplete, forcing npm/pnpm repair in Gateway startup, doctor, and lazy plugin loads instead of leaving channels crash-looping on missing packages. Fixes #75309; refs #75310, #75296, and #75304. Thanks @scottgl9.
- Plugins/runtime-deps: treat no-main and export-map package sentinels without reachable entry files as incomplete, so Gateway startup, doctor, and lazy plugin loads repair interrupted bundled dependency installs instead of accepting package.json-only partial installs. Fixes #75309; refs #75183. Thanks @shakkernerd.
- Plugins/runtime-deps: keep runtime inspection and channel maintenance commands from downloading bundled plugin dependencies, route explicit repairs through `openclaw plugins deps --repair`, and still allow Gateway/DO paths to repair missing deps before import. Refs #75069. Thanks @xiaohuaxi.
- Updates: force non-deferred update restarts after package-manager updates requested through the live Gateway control plane and fail release validation on post-swap stale chunk import crashes, so Telegram/Discord imports do not stay pointed at removed dist files. Fixes #75206. Thanks @xonaman and @faux123.
- Updates: force non-deferred, no-cooldown update restarts after package-manager updates requested through the live Gateway control plane and fail release validation on post-swap stale chunk import crashes, so Telegram/Discord imports do not stay pointed at removed dist files. Fixes #75206. Thanks @xonaman and @faux123.
- Agents/tool-result guard: use the resolved runtime context token budget for non-context-engine tool-result overflow checks, so long tool-heavy sessions no longer compact early when `contextTokens` is larger than native `contextWindow`. Fixes #74917. Thanks @kAIborg24.
- Gateway/systemd: exit with sysexits 78 for supervised lock and `EADDRINUSE` conflicts so `RestartPreventExitStatus=78` stops `Restart=always` restart loops instead of repeatedly reloading plugins against an occupied port. Fixes #75115. Thanks @yhyatt.
- Agents/runtime: skip blank visible user prompts at the embedded-runner boundary before provider submission while still allowing internal runtime-only turns and media-only prompts, so Telegram/group sessions no longer leak raw empty-input provider errors when replay history exists. Fixes #74137. Thanks @yelog, @Gracker, and @nhaener.

@@ -84,9 +84,9 @@ install method aligned:
The Gateway core auto-updater (when enabled via config) launches the CLI update path
outside the live Gateway request handler. Control-plane `update.run` package-manager
updates force a non-deferred update restart after the package swap, because the old
Gateway process may still have in-memory chunks that point at files removed by the
new package.
updates force a non-deferred, no-cooldown update restart after the package swap,
because the old Gateway process may still have in-memory chunks that point at
files removed by the new package.
For package-manager installs, `openclaw update` resolves the target package
version before invoking the package manager. npm global installs use a staged
@@ -155,7 +155,7 @@ If an exact pinned npm plugin update resolves to an artifact whose integrity dif
<Note>
Post-update plugin sync failures fail the update result and stop restart follow-up work. Fix the plugin install or update error, then rerun `openclaw update`.
When the updated Gateway starts, enabled bundled plugin runtime dependencies are staged before plugin activation. Package-manager `update.run` restarts bypass the normal idle deferral after the package tree has been swapped, so the old process cannot keep lazy-loading removed chunks. Service-manager restarts still drain runtime-dependency staging before closing the Gateway.
When the updated Gateway starts, enabled bundled plugin runtime dependencies are staged before plugin activation. Package-manager `update.run` restarts bypass the normal idle deferral and restart cooldown after the package tree has been swapped, so the old process cannot keep lazy-loading removed chunks. Service-manager restarts still drain runtime-dependency staging before closing the Gateway.
If pnpm bootstrap still fails, the updater stops early with a package-manager-specific error instead of trying `npm run build` inside the checkout.
</Note>

@@ -378,7 +378,7 @@ enumeration of `src/gateway/server-methods/*.ts`.
- `config.apply` validates + replaces the full config payload.
- `config.schema` returns the live config schema payload used by Control UI and CLI tooling: schema, `uiHints`, version, and generation metadata, including plugin + channel schema metadata when the runtime can load it. The schema includes field `title` / `description` metadata derived from the same labels and help text used by the UI, including nested object, wildcard, array-item, and `anyOf` / `oneOf` / `allOf` composition branches when matching field documentation exists.
- `config.schema.lookup` returns a path-scoped lookup payload for one config path: normalized path, a shallow schema node, matched hint + `hintPath`, and immediate child summaries for UI/CLI drill-down. Lookup schema nodes keep the user-facing docs and common validation fields (`title`, `description`, `type`, `enum`, `const`, `format`, `pattern`, numeric/string/array/object bounds, and flags like `additionalProperties`, `deprecated`, `readOnly`, `writeOnly`). Child summaries expose `key`, normalized `path`, `type`, `required`, `hasChildren`, plus the matched `hint` / `hintPath`.
- `update.run` runs the gateway update flow and schedules a restart only when the update itself succeeded. Package-manager updates force a non-deferred update restart after the package swap so the old Gateway process does not keep lazy-loading from a replaced `dist` tree.
- `update.run` runs the gateway update flow and schedules a restart only when the update itself succeeded. Package-manager updates force a non-deferred, no-cooldown update restart after the package swap so the old Gateway process does not keep lazy-loading from a replaced `dist` tree.
- `update.status` returns the latest cached update restart sentinel, including the post-restart running version when available.
- `wizard.start`, `wizard.next`, `wizard.status`, and `wizard.cancel` expose the onboarding wizard over WS RPC.
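
As a rough illustration of the `update.run` behavior described above, the sketch below shows how restart options might be derived from whether the update replaced the package tree. The field names (`delayMs`, `reason`, `skipDeferral`, `skipCooldown`) and the `updateWasPackageSwap` flag appear in this commit's handler hunk; the `RestartOptions` interface and the `buildUpdateRestartOptions` helper are hypothetical names introduced only for this example.

```typescript
// Hypothetical shape for illustration; only the field names come from the diff.
interface RestartOptions {
  delayMs: number;
  reason: string;
  skipDeferral: boolean;
  skipCooldown: boolean;
}

// Hypothetical helper, not part of the codebase: a package swap forces an
// immediate restart that skips both idle deferral and the restart cooldown.
function buildUpdateRestartOptions(
  updateWasPackageSwap: boolean,
  restartDelayMs: number,
): RestartOptions {
  return {
    delayMs: updateWasPackageSwap ? 0 : restartDelayMs,
    reason: "update.run",
    skipDeferral: updateWasPackageSwap,
    skipCooldown: updateWasPackageSwap,
  };
}
```

Under this sketch, an update that did not swap the package tree keeps the configured delay and stays subject to deferral and cooldown.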

@@ -169,11 +169,11 @@ The gateway also logs an update hint on startup (disable with `update.checkOnSta
For downgrade or incident recovery, set `OPENCLAW_NO_AUTO_UPDATE=1` in the gateway environment to block automatic applies even when `update.auto.enabled` is configured. Startup update hints can still run unless `update.checkOnStart` is also disabled.
Package-manager updates requested through the live Gateway control-plane handler
force a non-deferred update restart after the package swap. That avoids leaving
an old in-memory process around long enough to lazy-load chunks from a package
tree that has already been replaced. Shell `openclaw update` remains the
preferred path for supervised installs because it can stop and restart the
service around the update.
force a non-deferred, no-cooldown update restart after the package swap. That
avoids leaving an old in-memory process around long enough to lazy-load chunks
from a package tree that has already been replaced. Shell `openclaw update`
remains the preferred path for supervised installs because it can stop and
restart the service around the update.
## After updating

@@ -297,6 +297,7 @@ describe("update.run restart scheduling", () => {
expect.objectContaining({
delayMs: 0,
reason: "update.run",
skipCooldown: true,
skipDeferral: true,
}),
);

@@ -147,6 +147,7 @@ export const updateHandlers: GatewayRequestHandlers = {
delayMs: updateWasPackageSwap ? 0 : restartDelayMs,
reason: "update.run",
skipDeferral: updateWasPackageSwap,
skipCooldown: updateWasPackageSwap,
audit: {
actor: actor.actor,
deviceId: actor.deviceId,

@@ -425,6 +425,34 @@ describe("infra runtime", () => {
process.removeListener("SIGUSR1", handler);
}
});
it("bypasses restart cooldown when requested", async () => {
const emitSpy = vi.spyOn(process, "emit");
const handler = () => {};
process.on("SIGUSR1", handler);
try {
scheduleGatewaySigusr1Restart({ delayMs: 0, reason: "first" });
await vi.advanceTimersByTimeAsync(0);
expect(consumeGatewaySigusr1RestartAuthorization()).toBe(true);
markGatewaySigusr1RestartHandled();
const forced = scheduleGatewaySigusr1Restart({
delayMs: 0,
reason: "update.run",
skipCooldown: true,
});
expect(forced.coalesced).toBe(false);
expect(forced.delayMs).toBe(0);
expect(forced.cooldownMsApplied).toBe(0);
await vi.advanceTimersByTimeAsync(0);
expect(emitSpy.mock.calls.filter((args) => args[0] === "SIGUSR1").length).toBe(2);
expect(peekGatewaySigusr1RestartReason()).toBe("update.run");
} finally {
process.removeListener("SIGUSR1", handler);
}
});
});
describe("pre-restart deferral check", () => {

@@ -661,6 +661,7 @@ export function scheduleGatewaySigusr1Restart(opts?: {
audit?: RestartAuditInfo;
emitHooks?: RestartEmitHooks;
skipDeferral?: boolean;
skipCooldown?: boolean;
}): ScheduledRestart {
const delayMsRaw =
typeof opts?.delayMs === "number" && Number.isFinite(opts.delayMs)
@@ -674,7 +675,10 @@ export function scheduleGatewaySigusr1Restart(opts?: {
const hasSigusr1Listener = process.listenerCount("SIGUSR1") > 0;
const mode = hasSigusr1Listener ? "emit" : process.platform === "win32" ? "supervisor" : "signal";
const nowMs = Date.now();
const cooldownMsApplied = Math.max(0, lastRestartEmittedAt + RESTART_COOLDOWN_MS - nowMs);
const skipCooldown = opts?.skipCooldown === true;
const cooldownMsApplied = skipCooldown
? 0
: Math.max(0, lastRestartEmittedAt + RESTART_COOLDOWN_MS - nowMs);
const requestedDueAt = nowMs + delayMs + cooldownMsApplied;
const skipDeferral = opts?.skipDeferral === true;
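
The cooldown arithmetic in the hunk above can be isolated into a small standalone sketch. The formula follows the diff; the `RESTART_COOLDOWN_MS` value and the two function names below are assumptions chosen for illustration, since the real constant is not shown in this commit.

```typescript
// Assumed value for illustration only; the actual constant is not visible in the diff.
const RESTART_COOLDOWN_MS = 30000;

// Hypothetical helper: remaining cooldown in ms, or 0 when the caller bypasses it.
function computeCooldownMs(
  lastRestartEmittedAt: number,
  nowMs: number,
  skipCooldown: boolean,
): number {
  if (skipCooldown) {
    return 0;
  }
  // Never negative: once the cooldown window has passed, nothing is added.
  return Math.max(0, lastRestartEmittedAt + RESTART_COOLDOWN_MS - nowMs);
}

// Hypothetical helper: earliest time the restart is due, mirroring
// `nowMs + delayMs + cooldownMsApplied` from the diff.
function computeRequestedDueAt(
  lastRestartEmittedAt: number,
  nowMs: number,
  delayMs: number,
  skipCooldown: boolean,
): number {
  return nowMs + delayMs + computeCooldownMs(lastRestartEmittedAt, nowMs, skipCooldown);
}
```

With `skipCooldown` set, a restart requested immediately after a previous one is due at `nowMs + delayMs`, which matches the new test's expectation that `cooldownMsApplied` is `0` for the forced `update.run` restart.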