fix(gateway): capture config hash after plugin auto-enable to prevent restart loop (#67557)

Merged via squash.

Prepared head SHA: 07250958a7
Co-authored-by: openperf <80630709+openperf@users.noreply.github.com>
Reviewed-by: @openperf

Author: Chunyue Wang
Date: 2026-04-16 21:18:24 +08:00
Committer: GitHub
Parent: c3c7a9953f
Commit: 8c11210fe5
3 changed files with 37 additions and 1 deletions

@@ -41,6 +41,7 @@ Docs: https://docs.openclaw.ai
- TUI/streaming: add a client-side streaming watchdog to `tui-event-handlers` so the `streaming · Xm Ys` activity indicator resets to `idle` after 30s of delta silence on the active run. Guards against lost or late `state: "final"` chat events (WS reconnects, gateway restarts, etc.) leaving the TUI stuck on `streaming` indefinitely; a new system log line surfaces the reset so users know to send a new message to resync. The window is configurable via the new `streamingWatchdogMs` context option (set to `0` to disable), and the handler now exposes a `dispose()` that clears the pending timer on shutdown. (#67401) Thanks @xantorres.
- Extensions/lmstudio: add exponential backoff to the inference-preload wrapper so an LM Studio model-load failure (for example the built-in memory guardrail rejecting a load because the swap is saturated) no longer produces a WARN line every ~2s for every chat request. The wrapper now records consecutive preload failures per `(baseUrl, modelKey, contextLength)` tuple with a 5s → 10s → 20s → … → 5min cooldown and skips the preload step entirely while a cooldown is active, letting chat requests proceed directly to the stream (the model is often already loaded via the LM Studio UI). The combined `preload failed` log line now reports consecutive-failure count and remaining cooldown so operators can act on the real issue instead of drowning in repeated warnings. (#67401) Thanks @xantorres.
- Agents/replay: re-run tool/result pairing after strict replay tool-call ID sanitization on outbound requests so Anthropic-compatible providers like MiniMax no longer receive malformed orphan tool-result IDs such as `...toolresult1` during compaction and retry flows. (#67620) Thanks @stainlu.
- Gateway/startup: fix a spurious SIGUSR1 restart loop on Linux/systemd when plugin auto-enable is the only startup config write. The config hash used by the reload guard was not captured for that write path, so chokidar treated each boot write as an external change and triggered a reload → restart cycle that corrupts manifest.db after repeated cycles. Fixes #67436. (#67557) Thanks @openperf.
## 2026.4.15-beta.1

@@ -620,6 +620,34 @@ describe("startGatewayConfigReloader", () => {
await harness.reloader.stop();
});
it("does not dedupe when initialInternalWriteHash is null (#67436)", async () => {
const readSnapshot = vi
.fn<() => Promise<ConfigFileSnapshot>>()
.mockResolvedValueOnce(
makeSnapshot({
config: {
gateway: { reload: { debounceMs: 0 }, auth: { mode: "token", token: "startup" } },
},
hash: "startup-internal-1",
}),
);
const harness = createReloaderHarness(readSnapshot, {
initialInternalWriteHash: null,
});
harness.watcher.emit("change");
await vi.runOnlyPendingTimersAsync();
expect(readSnapshot).toHaveBeenCalledTimes(1);
// With a null hash the guard is a no-op, so the reload proceeds and
// detects a config diff → restart. This is the pre-fix regression
// scenario from #67436 where plugin auto-enable was the only startup
// writer and the hash was never captured.
expect(harness.onRestart).toHaveBeenCalledTimes(1);
await harness.reloader.stop();
});
});
describe("shouldInvalidateSkillsSnapshotForPaths", () => {

@@ -284,7 +284,14 @@ export async function startGatewayServer(
log,
});
cfgAtStart = controlUiSeed.config;
- if (authBootstrap.persistedGeneratedToken || controlUiSeed.persistedAllowedOriginsSeed) {
+ // Always capture the final config hash after all startup writes (plugin
+ // auto-enable, auth token generation, control-UI origin seeding) so the
+ // config reloader can recognize its own startup writes and suppress the
+ // spurious hot-reload that would otherwise trigger a SIGUSR1 restart loop.
+ // Previously the hash was only captured when auth or control-UI persisted
+ // changes, missing the plugin auto-enable write performed earlier inside
+ // loadGatewayStartupConfigSnapshot(). See #67436.
+ {
const startupSnapshot = await readConfigFileSnapshot();
startupInternalWriteHash = startupSnapshot.hash ?? null;
}
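
The guard that this change feeds can be sketched roughly as follows. This is a minimal illustration, not the reloader's actual implementation: `InternalWriteGuard`, `ReloadDecision`, and `onChange` are hypothetical names, and the one-shot behavior (the hash is consumed by the first watcher event) is an assumption. Only the `initialInternalWriteHash` option name and the `"startup-internal-1"` hash value come from the test above.

```typescript
// Hypothetical decision type: either the change is recognized as the
// gateway's own write and skipped, or it proceeds to a normal reload.
type ReloadDecision = "skip-internal-write" | "reload";

// Hypothetical guard that remembers the hash of the gateway's last
// internal config write (assumption: it is consumed on first use).
class InternalWriteGuard {
  private pendingHash: string | null;

  constructor(initialInternalWriteHash: string | null) {
    this.pendingHash = initialInternalWriteHash;
  }

  // Called with the hash of the config file after a watcher "change" event.
  onChange(currentHash: string | null): ReloadDecision {
    const expected = this.pendingHash;
    this.pendingHash = null; // one-shot: later changes are always external
    if (expected !== null && currentHash === expected) {
      return "skip-internal-write";
    }
    // With a null hash the guard is a no-op and the reload proceeds.
    return "reload";
  }
}

// Post-fix startup: the hash is captured after all startup writes.
const guarded = new InternalWriteGuard("startup-internal-1");
console.log(guarded.onChange("startup-internal-1")); // "skip-internal-write"

// Pre-fix startup (#67436): plugin auto-enable wrote, but no hash was captured.
const unguarded = new InternalWriteGuard(null);
console.log(unguarded.onChange("startup-internal-1")); // "reload"
```

Under these assumptions, capturing the hash unconditionally costs one extra snapshot read at startup, while a missing hash lets every boot write look like an external edit.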