fix: keep config recovery in doctor

Peter Steinberger
2026-05-03 18:04:20 +01:00
parent dc32acd0d8
commit 0ee52e9405
23 changed files with 269 additions and 1092 deletions

@@ -12,6 +12,7 @@ Docs: https://docs.openclaw.ai
- Agents/tools: skip optional media and PDF tool factories when the effective tool denylist already blocks them, avoiding unnecessary hot-path setup for tools that will be filtered out before model use. (#76773) Thanks @dorukardahan.
- Discord/status: let explicit reaction tool calls opt into tracking subsequent tool progress on the reacted message with `trackToolCalls: true`, and use the shared tool display emoji table for status reactions.
- Gateway/config: stop Gateway startup and hot reload from auto-restoring invalid config; invalid config now fails closed and `openclaw doctor --fix` owns last-known-good repair.
- Gateway/performance: lazy-load early runtime discovery and shutdown-hook helpers, defer maintenance timers until after readiness, and trim duplicate plugin auto-enable work during Gateway startup.
- QA/Mantis: add a `pnpm openclaw qa mantis discord-smoke` runner and manual GitHub workflow that verify the Mantis Discord bot can see the configured guild/channel, post a smoke message, add a reaction, and upload artifacts.
- Gateway/performance: lazy-load the heavy cron runtime after the rest of Gateway startup, defer restart-sentinel refresh after readiness, and let the Gateway startup benchmark write per-run V8 CPU profiles with `--cpu-prof-dir`.

@@ -439,9 +439,9 @@ ls -lt "$CONFIG".rejected.* 2>/dev/null | head
openclaw config validate
```
Direct editor writes are still allowed, but the running Gateway treats them as untrusted until they validate. Invalid direct edits can be restored from the last-known-good backup during startup or hot reload. See [Gateway troubleshooting](/gateway/troubleshooting#gateway-restored-last-known-good-config).
Direct editor writes are still allowed, but the running Gateway treats them as untrusted until they validate. Invalid direct edits fail startup or are skipped by hot reload; Gateway does not rewrite `openclaw.json`. Run `openclaw doctor --fix` to repair prefixed/clobbered config or restore the last-known-good copy. See [Gateway troubleshooting](/gateway/troubleshooting#gateway-rejected-invalid-config).
Whole-file recovery is reserved for globally broken config, such as parse errors, root-level schema failures, legacy migration failures, or mixed plugin and root failures. If validation fails only under `plugins.entries.<id>...`, OpenClaw keeps the active `openclaw.json` in place and reports the plugin-local issue instead of restoring `.last-good`. This prevents plugin schema changes or `minHostVersion` skew from rolling back unrelated user settings such as models, providers, auth profiles, channels, gateway exposure, tools, memory, browser, or cron config.
Whole-file recovery is reserved for doctor repair. Plugin schema changes or `minHostVersion` skew stay loud instead of rolling back unrelated user settings such as models, providers, auth profiles, channels, gateway exposure, tools, memory, browser, or cron config.
## Subcommands

@@ -104,10 +104,10 @@ is available, then fall back to `latest`.
</Note>
<AccordionGroup>
<Accordion title="Config includes and invalid-config recovery">
<Accordion title="Config includes and invalid-config repair">
If your `plugins` section is backed by a single-file `$include`, `plugins install/update/enable/disable/uninstall` write through to that included file and leave `openclaw.json` untouched. Root includes, include arrays, and includes with sibling overrides fail closed instead of flattening. See [Config includes](/gateway/configuration) for the supported shapes.
If config is invalid during install, `plugins install` normally fails closed and tells you to run `openclaw doctor --fix` first. During Gateway startup, invalid config for one plugin is isolated to that plugin so other channels and plugins can keep running; `openclaw doctor --fix` can quarantine the invalid plugin entry. The only documented install-time exception is a narrow bundled-plugin recovery path for plugins that explicitly opt into `openclaw.install.allowInvalidConfigRecovery`.
If config is invalid during install, `plugins install` normally fails closed and tells you to run `openclaw doctor --fix` first. During Gateway startup and hot reload, invalid plugin config fails closed like any other invalid config; `openclaw doctor --fix` can quarantine the invalid plugin entry. The only documented install-time exception is a narrow bundled-plugin recovery path for plugins that explicitly opt into `openclaw.install.allowInvalidConfigRecovery`.
</Accordion>
<Accordion title="--force and reinstall vs update">

@@ -89,17 +89,13 @@ When validation fails:
- Run `openclaw doctor` to see exact issues
- Run `openclaw doctor --fix` (or `--yes`) to apply repairs
The Gateway keeps a trusted last-known-good copy after each successful startup.
If `openclaw.json` later fails validation (or drops `gateway.mode`, shrinks
sharply, or has a stray log line prepended), OpenClaw preserves the broken file
as `.clobbered.*`, restores the last-known-good copy, and logs the recovery
reason. The next agent turn also receives a system-event warning so the main
agent does not blindly rewrite the restored config. Promotion to last-known-good
is skipped when a candidate contains redacted secret placeholders such as `***`.
When every validation issue is scoped to `plugins.entries.<id>...`, OpenClaw
does not perform whole-file recovery. It keeps the current config active and
surfaces the plugin-local failure so a plugin schema or host-version mismatch
cannot roll back unrelated user settings.
The Gateway keeps a trusted last-known-good copy after each successful startup,
but startup and hot reload do not restore it automatically. If `openclaw.json`
fails validation (including plugin-local validation), Gateway startup fails or
the reload is skipped and the current runtime keeps the last accepted config.
Run `openclaw doctor --fix` (or `--yes`) to repair prefixed/clobbered config or
restore the last-known-good copy. Promotion to last-known-good is skipped when a
candidate contains redacted secret placeholders such as `***`.
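
The fail-closed rule described above can be sketched as a small decision function. This is an illustration only, not the Gateway's code: the `Snapshot`, `Phase`, and `Decision` types and the `decideOnConfig` name are hypothetical, and the real snapshot carries more fields.

```typescript
// Hypothetical, simplified shapes for illustration only.
type ConfigIssue = { path: string; message: string };
type Snapshot = { valid: boolean; issues: ConfigIssue[] };
type Phase = "startup" | "hot-reload";
type Decision =
  | { action: "accept" }
  | { action: "fail-startup"; hint: string }
  | { action: "skip-reload"; hint: string };

const REPAIR_HINT = "Run `openclaw doctor --fix` to repair or restore last-known-good.";

// Invalid config is never rewritten here: startup fails closed, and hot reload
// keeps the last accepted runtime config. Plugin-local issues get no special
// case; any validation issue makes the snapshot invalid.
function decideOnConfig(phase: Phase, snapshot: Snapshot): Decision {
  if (snapshot.valid) {
    return { action: "accept" };
  }
  return phase === "startup"
    ? { action: "fail-startup", hint: REPAIR_HINT }
    : { action: "skip-reload", hint: REPAIR_HINT };
}
```

Neither branch touches `openclaw.json` or the last-known-good copy; under this model, repair happens only when the operator runs doctor.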
## Common tasks
@@ -539,20 +535,15 @@ The Gateway watches `~/.openclaw/openclaw.json` and applies changes automaticall
Direct file edits are treated as untrusted until they validate. The watcher waits
for editor temp-write/rename churn to settle, reads the final file, and rejects
invalid external edits by restoring the last-known-good config. OpenClaw-owned
config writes use the same schema gate before writing; destructive clobbers such
as dropping `gateway.mode` or shrinking the file by more than half are rejected
and saved as `.rejected.*` for inspection.
invalid external edits without rewriting `openclaw.json`. OpenClaw-owned config
writes use the same schema gate before writing; destructive clobbers such as
dropping `gateway.mode` or shrinking the file by more than half are rejected and
saved as `.rejected.*` for inspection.
Plugin-local validation failures are the exception: if all issues are under
`plugins.entries.<id>...`, reload keeps the current config and reports the plugin
issue instead of restoring `.last-good`.
If you see `Config auto-restored from last-known-good` or
`config reload restored last-known-good config` in logs, inspect the matching
`.clobbered.*` file next to `openclaw.json`, fix the rejected payload, then run
`openclaw config validate`. See [Gateway troubleshooting](/gateway/troubleshooting#gateway-restored-last-known-good-config)
for the recovery checklist.
If you see `config reload skipped (invalid config)` or startup reports `Invalid
config`, inspect the config, run `openclaw config validate`, then run `openclaw
doctor --fix` for repair. See [Gateway troubleshooting](/gateway/troubleshooting#gateway-rejected-invalid-config)
for the checklist.
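
As a rough sketch of the clobber checks this section mentions (dropping `gateway.mode`, or shrinking the file by more than half), the comparison could look like the following. The `WriteCandidate` type, the `findClobberReasons` name, and the reason labels are assumptions made for this example; the real checks and log signatures may differ, and this sketch compares string length rather than byte size.

```typescript
// Hypothetical shapes; only `gateway.mode` and the raw text matter here.
type ParsedConfig = { gateway?: { mode?: string }; [key: string]: unknown };
interface WriteCandidate {
  raw: string;
  parsed: ParsedConfig;
}

// Returns the reasons a proposed write looks destructive compared with the
// previously accepted config. An empty array means no clobber was detected.
function findClobberReasons(previous: WriteCandidate, next: WriteCandidate): string[] {
  const reasons: string[] = [];
  const hadMode = previous.parsed.gateway?.mode !== undefined;
  const hasMode = next.parsed.gateway?.mode !== undefined;
  if (hadMode && !hasMode) {
    reasons.push("gateway-mode-dropped"); // illustrative label
  }
  // "Shrinking by more than half": the new text is under 50% of the old length.
  if (next.raw.length * 2 < previous.raw.length) {
    reasons.push("size-drop"); // illustrative label
  }
  return reasons;
}
```

A write with a non-empty reason list would be rejected before commit and its payload kept for inspection, which matches the `.rejected.*` behaviour described above.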
### Reload modes

@@ -300,9 +300,10 @@ Related:
- [Configuration](/gateway/configuration)
- [Doctor](/gateway/doctor)
## Gateway restored last-known-good config
## Gateway rejected invalid config
Use this when the Gateway starts, but logs say it restored `openclaw.json`.
Use this when Gateway startup fails with `Invalid config` or hot reload logs say
it skipped an invalid edit.
```bash
openclaw logs --follow
@@ -313,19 +314,19 @@ openclaw doctor
Look for:
- `Config auto-restored from last-known-good`
- `gateway: invalid config was restored from last-known-good backup`
- `config reload restored last-known-good config after invalid-config`
- A timestamped `openclaw.json.clobbered.*` file beside the active config
- A main-agent system event that starts with `Config recovery warning`
- `Invalid config at ...`
- `config reload skipped (invalid config): ...`
- `Config write rejected: ...`
- A timestamped `openclaw.json.rejected.*` file beside the active config
- A timestamped `openclaw.json.clobbered.*` file if `doctor --fix` repaired a broken direct edit
<AccordionGroup>
<Accordion title="What happened">
- The rejected config did not validate during startup or hot reload.
- OpenClaw preserved the rejected payload as `.clobbered.*`.
- The active config was restored from the last validated last-known-good copy.
- The next main-agent turn is warned not to blindly rewrite the rejected config.
- If all validation issues were under `plugins.entries.<id>...`, OpenClaw would not restore the whole file. Plugin-local failures stay loud while unrelated user settings remain in the active config.
- The config did not validate during startup, hot reload, or an OpenClaw-owned write.
- Gateway startup fails closed instead of rewriting `openclaw.json`.
- Hot reload skips invalid external edits and keeps the current runtime config active.
- OpenClaw-owned writes reject invalid/destructive payloads before commit and save `.rejected.*`.
- `openclaw doctor --fix` owns repair. It can remove non-JSON prefixes or restore the last-known-good copy while preserving the rejected payload as `.clobbered.*`.
</Accordion>
<Accordion title="Inspect and repair">
@@ -338,16 +339,17 @@ Look for:
```
</Accordion>
<Accordion title="Common signatures">
- `.clobbered.*` exists → an external direct edit or startup read was restored.
- `.clobbered.*` exists → doctor preserved a broken external edit while repairing the active config.
- `.rejected.*` exists → an OpenClaw-owned config write failed schema or clobber checks before commit.
- `Config write rejected:` → the write tried to drop required shape, shrink the file sharply, or persist invalid config.
- `Rejected validation details:` → the recovery log or main-agent notice includes the schema path that caused the restore, such as `agents.defaults.execution` or `gateway.auth.password.source`.
- `missing-meta-vs-last-good`, `gateway-mode-missing-vs-last-good`, or `size-drop-vs-last-good:*` → startup treated the current file as clobbered because it lost fields or size compared with the last-known-good backup.
- `config reload skipped (invalid config):` → a direct edit failed validation and was ignored by the running Gateway.
- `Invalid config at ...` → startup failed before Gateway services booted.
- `missing-meta-vs-last-good`, `gateway-mode-missing-vs-last-good`, or `size-drop-vs-last-good:*` → an OpenClaw-owned write was rejected because it lost fields or size compared with the last-known-good backup.
- `Config last-known-good promotion skipped` → the candidate contained redacted secret placeholders such as `***`.
</Accordion>
<Accordion title="Fix options">
1. Keep the restored active config if it is correct.
1. Run `openclaw doctor --fix` to let doctor repair prefixed/clobbered config or restore last-known-good.
2. Copy only the intended keys from `.clobbered.*` or `.rejected.*`, then apply them with `openclaw config set` or `config.patch`.
3. Run `openclaw config validate` before restarting.
4. If you edit by hand, keep the full JSON5 config, not just the partial object you wanted to change.

@@ -804,15 +804,15 @@ lives on the [First-run FAQ](/help/faq-first-run).
- OpenClaw-owned config writes validate the full post-change config before writing.
- Invalid or destructive OpenClaw-owned writes are rejected and saved as `openclaw.json.rejected.*`.
- If a direct edit breaks startup or hot reload, the Gateway restores the last-known-good config and saves the rejected file as `openclaw.json.clobbered.*`.
- The main agent receives a boot warning after recovery so it does not blindly write the bad config again.
- If a direct edit breaks startup or hot reload, Gateway fails closed or skips the reload; it does not rewrite `openclaw.json`.
- `openclaw doctor --fix` owns repair and can restore last-known-good while saving the rejected file as `openclaw.json.clobbered.*`.
Recover:
- Check `openclaw logs --follow` for `Config auto-restored from last-known-good`, `Config write rejected:`, or `config reload restored last-known-good config`.
- Check `openclaw logs --follow` for `Invalid config at`, `Config write rejected:`, or `config reload skipped (invalid config)`.
- Inspect the newest `openclaw.json.clobbered.*` or `openclaw.json.rejected.*` beside the active config.
- Keep the active restored config if it works, then copy only the intended keys back with `openclaw config set` or `config.patch`.
- Run `openclaw config validate` and `openclaw doctor`.
- Run `openclaw config validate` and `openclaw doctor --fix`.
- Copy only the intended keys back with `openclaw config set` or `config.patch`.
- If you have no last-known-good or rejected payload, restore from backup, or re-run `openclaw doctor` and reconfigure channels/models.
- If this was unexpected, file a bug and include your last known config or any backup.
- A local coding agent can often reconstruct a working config from logs or history.
@@ -825,7 +825,7 @@ lives on the [First-run FAQ](/help/faq-first-run).
- Use `config.patch` for partial RPC edits; keep `config.apply` for full-config replacement only.
- If you are using the owner-only `gateway` tool from an agent run, it will still reject writes to `tools.exec.ask` / `tools.exec.security` (including legacy `tools.bash.*` aliases that normalize to the same protected exec paths).
Docs: [Config](/cli/config), [Configure](/cli/configure), [Gateway troubleshooting](/gateway/troubleshooting#gateway-restored-last-known-good-config), [Doctor](/gateway/doctor).
Docs: [Config](/cli/config), [Configure](/cli/configure), [Gateway troubleshooting](/gateway/troubleshooting#gateway-rejected-invalid-config), [Doctor](/gateway/doctor).
</Accordion>

@@ -99,11 +99,10 @@ If config is invalid, install normally fails closed and points you at
`openclaw doctor --fix`. The only recovery exception is a narrow bundled-plugin
reinstall path for plugins that opt into
`openclaw.install.allowInvalidConfigRecovery`.
During Gateway startup, invalid config for one plugin is isolated to that plugin:
startup logs the `plugins.entries.<id>.config` issue, skips that plugin during
load, and keeps other plugins and channels online. Run `openclaw doctor --fix`
to quarantine the bad plugin config by disabling that plugin entry and removing
its invalid config payload; the normal config backup keeps the previous values.
During Gateway startup, invalid plugin config fails closed like any other invalid
config. Run `openclaw doctor --fix` to quarantine the bad plugin config by
disabling that plugin entry and removing its invalid config payload; the normal
config backup keeps the previous values.
When a channel config references a plugin that is no longer discoverable but the
same stale plugin id remains in plugin config or install records, Gateway startup
logs warnings and skips that channel instead of blocking every other channel.

@@ -31,15 +31,6 @@ const readBestEffortConfig = vi.fn(async () => configState.cfg);
const readConfigFileSnapshotWithPluginMetadata = vi.fn(async () => ({
snapshot: configState.snapshot,
}));
const recoverConfigFromLastKnownGood = vi.fn<(params?: unknown) => Promise<boolean>>(
async (_params?: unknown) => false,
);
const recoverConfigFromJsonRootSuffix = vi.fn<(snapshot?: unknown) => Promise<boolean>>(
async (_snapshot?: unknown) => false,
);
const writeRestartSentinel = vi.fn<(payload?: unknown) => Promise<string>>(
async (_payload?: unknown) => "/tmp/restart-sentinel.json",
);
const writeDiagnosticStabilityBundleForFailureSync = vi.fn((_reason: string, _error: unknown) => ({
status: "written" as const,
message: "wrote stability bundle: /tmp/openclaw-stability.json",
@@ -59,8 +50,6 @@ vi.mock("../../config/config.js", () => ({
readBestEffortConfig: () => readBestEffortConfig(),
readConfigFileSnapshot: async () => configState.snapshot,
readConfigFileSnapshotWithPluginMetadata: () => readConfigFileSnapshotWithPluginMetadata(),
recoverConfigFromLastKnownGood: (params: unknown) => recoverConfigFromLastKnownGood(params),
recoverConfigFromJsonRootSuffix: (snapshot: unknown) => recoverConfigFromJsonRootSuffix(snapshot),
}));
vi.mock("../../config/paths.js", () => ({
@@ -120,10 +109,6 @@ vi.mock("../../infra/ports.js", () => ({
inspectPortUsage: async () => ({ status: "free" }),
}));
vi.mock("../../infra/restart-sentinel.js", () => ({
writeRestartSentinel: (payload: unknown) => writeRestartSentinel(payload),
}));
vi.mock("../../infra/supervisor-markers.js", async (importOriginal) => {
const actual = await importOriginal<typeof import("../../infra/supervisor-markers.js")>();
return {
@@ -195,12 +180,6 @@ describe("gateway run option collisions", () => {
readConfigFileSnapshotWithPluginMetadata.mockClear();
controlUiState.root = "/tmp/openclaw-control-ui";
gatewayLogMessages.length = 0;
recoverConfigFromLastKnownGood.mockReset();
recoverConfigFromLastKnownGood.mockResolvedValue(false);
recoverConfigFromJsonRootSuffix.mockReset();
recoverConfigFromJsonRootSuffix.mockResolvedValue(false);
writeRestartSentinel.mockReset();
writeRestartSentinel.mockResolvedValue("/tmp/restart-sentinel.json");
writeDiagnosticStabilityBundleForFailureSync.mockClear();
startGatewayServer.mockClear();
setGatewayWsLogStyle.mockClear();
@@ -380,7 +359,31 @@ describe("gateway run option collisions", () => {
expect(readBestEffortConfig).not.toHaveBeenCalled();
});
it("restores last-known-good config before startup when the effective config is invalid", async () => {
it("blocks invalid startup config without automatic recovery", async () => {
configState.cfg = {};
configState.snapshot = {
exists: true,
valid: false,
path: "/tmp/openclaw-test-missing-config.json",
config: {},
parsed: null,
issues: [{ path: "<root>", message: "JSON5 parse failed" }],
legacyIssues: [],
};
await expect(runGatewayCli(["gateway", "run"])).rejects.toThrow("__exit__:78");
expect(runtimeErrors).toContain(
"Gateway start blocked: existing config is missing gateway.mode. Treat this as suspicious or clobbered config. Re-run `openclaw onboard --mode local` or `openclaw setup`, set gateway.mode=local manually, or pass --allow-unconfigured.",
);
expect(runtimeErrors).toContain(
`Config write audit: ${path.join("/tmp", "logs", "config-audit.jsonl")}`,
);
expect(readConfigFileSnapshotWithPluginMetadata).toHaveBeenCalledOnce();
expect(startGatewayServer).not.toHaveBeenCalled();
});
it("passes invalid startup snapshot through when explicitly allowed", async () => {
configState.cfg = {};
configState.snapshot = {
exists: true,
@@ -391,105 +394,17 @@ describe("gateway run option collisions", () => {
issues: [{ path: "<root>", message: "JSON5 parse failed" }],
legacyIssues: [],
};
recoverConfigFromLastKnownGood.mockImplementationOnce(async () => {
configState.snapshot = {
exists: true,
valid: true,
path: "/tmp/openclaw-test-missing-config.json",
config: {
gateway: {
mode: "local",
port: 19170,
auth: { mode: "none" },
},
},
parsed: {
gateway: {
mode: "local",
port: 19170,
auth: { mode: "none" },
},
},
issues: [],
legacyIssues: [],
};
return true;
});
await runGatewayCli(["gateway", "run", "--allow-unconfigured"]);
expect(recoverConfigFromLastKnownGood).toHaveBeenCalledWith({
snapshot: expect.objectContaining({
exists: true,
valid: false,
}),
reason: "gateway-run-invalid-config",
});
expect(writeRestartSentinel).toHaveBeenCalledWith({
kind: "config-auto-recovery",
status: "ok",
ts: expect.any(Number),
message:
"Gateway recovered automatically after a failed config change and restored the last known good configuration.",
stats: {
mode: "config-auto-recovery",
reason: "gateway-run-invalid-config",
after: { restoredFrom: "last-known-good" },
},
});
expect(gatewayLogMessages).toContain(
"gateway: restored invalid effective config from last-known-good backup: /tmp/openclaw-test-missing-config.json; Rejected validation details: <root>: JSON5 parse failed.",
);
expect(startGatewayServer).toHaveBeenCalledWith(
19170,
expect.objectContaining({
bind: "loopback",
auth: undefined,
}),
);
});
it("keeps startup recovery non-fatal when writing the recovery notice fails", async () => {
configState.cfg = {};
configState.snapshot = {
exists: true,
valid: false,
path: "/tmp/openclaw-test-missing-config.json",
config: {},
parsed: null,
issues: [{ path: "<root>", message: "JSON5 parse failed" }],
legacyIssues: [],
};
recoverConfigFromLastKnownGood.mockImplementationOnce(async () => {
configState.snapshot = {
exists: true,
valid: true,
path: "/tmp/openclaw-test-missing-config.json",
config: {
gateway: {
mode: "local",
},
},
parsed: {
gateway: {
mode: "local",
},
},
issues: [],
legacyIssues: [],
};
return true;
});
writeRestartSentinel.mockRejectedValueOnce(new Error("disk full"));
await runGatewayCli(["gateway", "run"]);
expect(startGatewayServer).toHaveBeenCalledWith(
18789,
expect.objectContaining({ bind: "loopback" }),
);
expect(gatewayLogMessages).toContain(
"gateway: failed to persist config auto-recovery notice: disk full",
expect.objectContaining({
bind: "loopback",
startupConfigSnapshotRead: expect.objectContaining({
snapshot: expect.objectContaining({ valid: false }),
}),
}),
);
});

@@ -9,7 +9,6 @@ import type {
GatewayTailscaleMode,
ReadConfigFileSnapshotWithPluginMetadataResult,
} from "../../config/config.js";
import { formatConfigIssueSummary } from "../../config/issue-format.js";
import { CONFIG_PATH, resolveGatewayPort, resolveStateDir } from "../../config/paths.js";
import type { OpenClawConfig } from "../../config/types.openclaw.js";
import { hasConfiguredSecretInput } from "../../config/types.secrets.js";
@@ -102,8 +101,6 @@ type GatewayRunLogger = Pick<ReturnType<typeof createSubsystemLogger>, "info" |
* restart storm that can render low-resource hosts unresponsive.
*/
const EXIT_CONFIG_ERROR = 78;
const CONFIG_AUTO_RECOVERY_MESSAGE =
"Gateway recovered automatically after a failed config change and restored the last known good configuration.";
const GATEWAY_AUTH_MODES: readonly GatewayAuthMode[] = [
"none",
@@ -277,69 +274,12 @@ async function readGatewayStartupConfig(params: {
snapshot: ConfigFileSnapshot | null;
startupConfigSnapshotRead?: ReadConfigFileSnapshotWithPluginMetadataResult;
}> {
const {
readConfigFileSnapshotWithPluginMetadata,
recoverConfigFromLastKnownGood,
recoverConfigFromJsonRootSuffix,
} = await import("../../config/config.js");
let snapshotRead: ReadConfigFileSnapshotWithPluginMetadataResult | null =
const { readConfigFileSnapshotWithPluginMetadata } = await import("../../config/config.js");
const snapshotRead: ReadConfigFileSnapshotWithPluginMetadataResult | null =
await params.startupTrace.measure("cli.config-snapshot", () =>
readConfigFileSnapshotWithPluginMetadata().catch(() => null),
);
let snapshot: ConfigFileSnapshot | null = snapshotRead?.snapshot ?? null;
if (snapshot?.exists && !snapshot.valid) {
const invalidSnapshot = snapshot;
const recovered = await params.startupTrace.measure("cli.config-recovery", () =>
recoverConfigFromLastKnownGood({
snapshot: invalidSnapshot,
reason: "gateway-run-invalid-config",
}),
);
if (recovered) {
const issueSummary = formatConfigIssueSummary([
...invalidSnapshot.issues,
...invalidSnapshot.legacyIssues,
]);
gatewayLog.warn(
`gateway: restored invalid effective config from last-known-good backup: ${invalidSnapshot.path}${issueSummary ? `; Rejected validation details: ${issueSummary}.` : ""}`,
);
try {
const { writeRestartSentinel } = await import("../../infra/restart-sentinel.js");
await writeRestartSentinel({
kind: "config-auto-recovery",
status: "ok",
ts: Date.now(),
message: CONFIG_AUTO_RECOVERY_MESSAGE,
stats: {
mode: "config-auto-recovery",
reason: "gateway-run-invalid-config",
after: { restoredFrom: "last-known-good" },
},
});
} catch (err) {
gatewayLog.warn(
`gateway: failed to persist config auto-recovery notice: ${formatErrorMessage(err)}`,
);
}
snapshotRead = await params.startupTrace.measure("cli.config-snapshot-reload", () =>
readConfigFileSnapshotWithPluginMetadata().catch(() => null),
);
snapshot = snapshotRead?.snapshot ?? null;
} else {
const repaired = await params.startupTrace.measure("cli.config-prefix-recovery", () =>
recoverConfigFromJsonRootSuffix(invalidSnapshot),
);
if (repaired) {
gatewayLog.warn(
`gateway: repaired invalid effective config by stripping a non-JSON prefix: ${invalidSnapshot.path}`,
);
snapshotRead = await params.startupTrace.measure("cli.config-snapshot-reload", () =>
readConfigFileSnapshotWithPluginMetadata().catch(() => null),
);
snapshot = snapshotRead?.snapshot ?? null;
}
}
}
const snapshot: ConfigFileSnapshot | null = snapshotRead?.snapshot ?? null;
const cfg = snapshot?.config ?? {};
return {
cfg,

@@ -1,4 +1,6 @@
import fs from "node:fs/promises";
import { describe, expect, it } from "vitest";
import { promoteConfigSnapshotToLastKnownGood, readConfigFileSnapshot } from "../config/config.js";
import { withTempHome, writeOpenClawConfig } from "../config/test-helpers.js";
import { runDoctorConfigPreflight } from "./doctor-config-preflight.js";
@@ -28,4 +30,33 @@ describe("runDoctorConfigPreflight", () => {
});
});
});
it("restores invalid config from last-known-good only during repair preflight", async () => {
await withTempHome(async (home) => {
const configPath = await writeOpenClawConfig(home, {
gateway: { mode: "local", port: 19091 },
});
await promoteConfigSnapshotToLastKnownGood(await readConfigFileSnapshot());
const lastGoodRaw = await fs.readFile(configPath, "utf-8");
await fs.writeFile(configPath, "{ invalid json", "utf-8");
const inspectOnly = await runDoctorConfigPreflight({
migrateState: false,
migrateLegacyConfig: false,
invalidConfigNote: false,
});
expect(inspectOnly.snapshot.valid).toBe(false);
const repaired = await runDoctorConfigPreflight({
migrateState: false,
migrateLegacyConfig: false,
repairPrefixedConfig: true,
invalidConfigNote: false,
});
expect(repaired.snapshot.valid).toBe(true);
expect(repaired.snapshot.config.gateway?.mode).toBe("local");
expect(await fs.readFile(configPath, "utf-8")).toBe(lastGoodRaw);
});
});
});

@@ -1,6 +1,10 @@
import fs from "node:fs/promises";
import path from "node:path";
import { readConfigFileSnapshot, recoverConfigFromJsonRootSuffix } from "../config/io.js";
import {
readConfigFileSnapshot,
recoverConfigFromJsonRootSuffix,
recoverConfigFromLastKnownGood,
} from "../config/io.js";
import { formatConfigIssueLines } from "../config/issue-format.js";
import type { LegacyConfigIssue } from "../config/types.js";
import type { OpenClawConfig } from "../config/types.openclaw.js";
@@ -105,14 +109,19 @@ export async function runDoctorConfigPreflight(
}
let snapshot = addDoctorLegacyIssues(await readConfigFileSnapshot());
if (
options.repairPrefixedConfig === true &&
snapshot.exists &&
!snapshot.valid &&
(await recoverConfigFromJsonRootSuffix(snapshot))
) {
note("Removed non-JSON prefix from openclaw.json; original saved as .clobbered.*.", "Config");
snapshot = addDoctorLegacyIssues(await readConfigFileSnapshot());
if (options.repairPrefixedConfig === true && snapshot.exists && !snapshot.valid) {
if (await recoverConfigFromJsonRootSuffix(snapshot)) {
note("Removed non-JSON prefix from openclaw.json; original saved as .clobbered.*.", "Config");
snapshot = addDoctorLegacyIssues(await readConfigFileSnapshot());
} else if (
await recoverConfigFromLastKnownGood({ snapshot, reason: "doctor-invalid-config" })
) {
note(
"Restored openclaw.json from last-known-good; original saved as .clobbered.*.",
"Config",
);
snapshot = addDoctorLegacyIssues(await readConfigFileSnapshot());
}
}
const invalidConfigNote =
options.invalidConfigNote ?? "Config invalid; doctor will run with best-effort config.";
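
The repair order in the hunk above (try the prefix strip first, then fall back to last-known-good, and only when repair was requested) can be isolated in a dependency-injected sketch. The `RepairDeps` and `RepairOutcome` types and the `repairInvalidConfig` name are hypothetical stand-ins for this example, not the module's API.

```typescript
// Hypothetical stand-ins for the two recovery helpers used by doctor.
type Snapshot = { exists: boolean; valid: boolean };
type RepairDeps = {
  stripJsonPrefix: (snapshot: Snapshot) => Promise<boolean>;
  restoreLastKnownGood: (snapshot: Snapshot) => Promise<boolean>;
};
type RepairOutcome = "skipped" | "prefix-stripped" | "restored-last-known-good" | "unrepaired";

// Mirrors the ordering in the diff: the cheaper, less destructive prefix strip
// runs first; last-known-good restore is attempted only if that did nothing.
async function repairInvalidConfig(
  snapshot: Snapshot,
  repairRequested: boolean,
  deps: RepairDeps,
): Promise<RepairOutcome> {
  if (!repairRequested || !snapshot.exists || snapshot.valid) {
    return "skipped";
  }
  if (await deps.stripJsonPrefix(snapshot)) {
    return "prefix-stripped";
  }
  if (await deps.restoreLastKnownGood(snapshot)) {
    return "restored-last-known-good";
  }
  return "unrepaired";
}
```

With both helpers injected, the ordering can be tested without touching the filesystem.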

@@ -1,3 +1,4 @@
import fs from "node:fs/promises";
import { describe, expect, it } from "vitest";
import {
readBestEffortConfig,
@@ -7,6 +8,26 @@ import {
import { withTempHome, writeOpenClawConfig } from "./test-helpers.js";
describe("readBestEffortConfig", () => {
it("does not restore suspicious direct edits from .bak during ordinary reads", async () => {
await withTempHome(async (home) => {
const configPath = await writeOpenClawConfig(home, {
meta: { lastTouchedAt: "2026-04-22T00:00:00.000Z" },
update: { channel: "beta" },
gateway: { mode: "local" },
});
await fs.copyFile(configPath, `${configPath}.bak`);
const directEditRaw = `${JSON.stringify({ update: { channel: "beta" } }, null, 2)}\n`;
await fs.writeFile(configPath, directEditRaw, "utf-8");
const snapshot = await readConfigFileSnapshot();
expect(snapshot.sourceConfig).toEqual({ update: { channel: "beta" } });
expect(await fs.readFile(configPath, "utf-8")).toBe(directEditRaw);
const entries = await fs.readdir(`${home}/.openclaw`);
expect(entries.some((entry) => entry.startsWith("openclaw.json.clobbered."))).toBe(false);
});
});
it("reuses valid snapshots while preserving load-time defaults", async () => {
await withTempHome(async (home) => {
await writeOpenClawConfig(home, {

@@ -50,14 +50,9 @@ import {
snapshotConfigAuditProcessInfo,
type ConfigWriteAuditResult,
} from "./io.audit.js";
import {
persistBoundedClobberedConfigSnapshot,
persistBoundedClobberedConfigSnapshotSync,
} from "./io.clobber-snapshot.js";
import { persistBoundedClobberedConfigSnapshot } from "./io.clobber-snapshot.js";
import { throwInvalidConfig } from "./io.invalid-config.js";
import {
maybeRecoverSuspiciousConfigRead,
maybeRecoverSuspiciousConfigReadSync,
promoteConfigSnapshotToLastKnownGood as promoteConfigSnapshotToLastKnownGoodWithDeps,
recoverConfigFromLastKnownGood as recoverConfigFromLastKnownGoodWithDeps,
} from "./io.observe-recovery.js";
@@ -667,13 +662,6 @@ async function observeConfigSnapshot(
const backup =
(backupBaseline?.hash ? backupBaseline : null) ??
(await readConfigFingerprintForPath(deps, `${snapshot.path}.bak`));
const clobberedPath = await persistBoundedClobberedConfigSnapshot({
deps,
configPath: snapshot.path,
raw: snapshot.raw,
observedAt: now,
});
deps.logger.warn(`Config observe anomaly: ${snapshot.path} (${suspicious.join(", ")})`);
await appendConfigAuditRecord({
fs: deps.fs,
@@ -723,7 +711,7 @@ async function observeConfigSnapshot(
backupUid: backup?.uid ?? null,
backupGid: backup?.gid ?? null,
backupGatewayMode: backup?.gatewayMode ?? null,
clobberedPath,
clobberedPath: null,
restoredFromBackup: false,
restoredBackupPath: null,
restoreErrorCode: null,
@@ -799,13 +787,6 @@ function observeConfigSnapshotSync(
const backup =
(backupBaseline?.hash ? backupBaseline : null) ??
readConfigFingerprintForPathSync(deps, `${snapshot.path}.bak`);
const clobberedPath = persistBoundedClobberedConfigSnapshotSync({
deps,
configPath: snapshot.path,
raw: snapshot.raw,
observedAt: now,
});
deps.logger.warn(`Config observe anomaly: ${snapshot.path} (${suspicious.join(", ")})`);
appendConfigAuditRecordSync({
fs: deps.fs,
@@ -855,7 +836,7 @@ function observeConfigSnapshotSync(
backupUid: backup?.uid ?? null,
backupGid: backup?.gid ?? null,
backupGatewayMode: backup?.gatewayMode ?? null,
clobberedPath,
clobberedPath: null,
restoredFromBackup: false,
restoredBackupPath: null,
restoreErrorCode: null,
@@ -1520,25 +1501,17 @@ export function createConfigIO(
}
const raw = deps.fs.readFileSync(configPath, "utf-8");
const parsed = deps.json5.parse(raw);
const recovered = maybeRecoverSuspiciousConfigReadSync({
deps,
configPath,
raw,
parsed,
});
const effectiveRaw = recovered.raw;
const effectiveParsed = recovered.parsed;
const readResolution = resolveConfigForRead(
resolveConfigIncludesForRead(effectiveParsed, configPath, deps),
resolveConfigIncludesForRead(parsed, configPath, deps),
deps.env,
);
const resolvedConfig = readResolution.resolvedConfigRaw;
const installMigration = migrateAndStripShippedPluginInstallConfigRecords(resolvedConfig, {
rootConfigRaw: effectiveParsed,
rootConfigRaw: parsed,
});
const effectiveConfigRaw = installMigration.config;
const snapshotRaw = installMigration.persistedRootRaw ?? effectiveRaw;
const snapshotParsed = installMigration.persistedRootParsed ?? effectiveParsed;
const snapshotRaw = installMigration.persistedRootRaw ?? raw;
const snapshotParsed = installMigration.persistedRootParsed ?? parsed;
const hash = hashConfigRaw(snapshotRaw);
for (const w of readResolution.envWarnings) {
deps.logger.warn(
@@ -1588,7 +1561,7 @@ export function createConfigIO(
env: deps.env,
pluginValidation: overrides.pluginValidation,
loadPluginMetadataSnapshot: loadValidationPluginMetadataSnapshot,
sourceRaw: effectiveParsed,
sourceRaw: snapshotParsed,
});
if (!validated.ok) {
observeLoadConfigSnapshot({
@@ -1721,18 +1694,9 @@ export function createConfigIO(
fallbackSourceConfig = coerceConfig(parsedRes.parsed);
// Resolve $include directives
const recovered = await deps.measure("config.snapshot.read.recovery-check", () =>
maybeRecoverSuspiciousConfigRead({
deps,
configPath,
raw,
parsed: parsedRes.parsed,
}),
);
const effectiveRaw = recovered.raw;
const effectiveParsed = recovered.parsed;
const hash = hashConfigRaw(effectiveRaw);
fallbackRaw = effectiveRaw;
const effectiveParsed = parsedRes.parsed;
const hash = rawHash;
fallbackRaw = raw;
fallbackParsed = effectiveParsed;
fallbackSourceConfig = coerceConfig(effectiveParsed);
fallbackHash = hash;
@@ -1751,9 +1715,8 @@ export function createConfigIO(
snapshot: createConfigFileSnapshot({
path: configPath,
exists: true,
raw: effectiveRaw,
raw,
parsed: effectiveParsed,
// Keep the recovered root file payload here when read healing kicked in.
sourceConfig: coerceConfig(effectiveParsed),
valid: false,
runtimeConfig: coerceConfig(effectiveParsed),
@@ -1787,7 +1750,7 @@ export function createConfigIO(
}),
);
const effectiveConfigRaw = installMigration.config;
const snapshotRaw = installMigration.persistedRootRaw ?? effectiveRaw;
const snapshotRaw = installMigration.persistedRootRaw ?? raw;
const snapshotParsed = installMigration.persistedRootParsed ?? effectiveParsed;
const snapshotHash = installMigration.persistedRootRaw
? hashConfigRaw(installMigration.persistedRootRaw)
@@ -1984,18 +1947,11 @@ export function createConfigIO(
return {};
}
const recovered = await maybeRecoverSuspiciousConfigRead({
deps,
configPath,
raw,
parsed: parsedRes.parsed,
});
let resolved: unknown;
try {
resolved = resolveConfigIncludesForRead(recovered.parsed, configPath, deps);
resolved = resolveConfigIncludesForRead(parsedRes.parsed, configPath, deps);
} catch {
return coerceConfig(recovered.parsed);
return coerceConfig(parsedRes.parsed);
}
const readResolution = resolveConfigForRead(resolved, deps.env);


@@ -1,63 +0,0 @@
import { afterEach, describe, expect, it } from "vitest";
import {
drainSystemEvents,
peekSystemEvents,
resetSystemEventsForTest,
} from "../infra/system-events.js";
import {
enqueueConfigRecoveryNotice,
formatConfigRecoveryNotice,
} from "./config-recovery-notice.js";
describe("config recovery notice", () => {
afterEach(() => {
resetSystemEventsForTest();
});
it("formats a prompt-facing warning for recovered configs", () => {
expect(
formatConfigRecoveryNotice({
phase: "startup",
reason: "startup-invalid-config",
configPath: "/home/test/.openclaw/openclaw.json",
}),
).toBe(
"Config recovery warning: OpenClaw restored openclaw.json from the last-known-good backup during startup (startup-invalid-config). The rejected config was invalid and was preserved as a timestamped .clobbered.* file. Do not write openclaw.json again unless you validate the full config first.",
);
});
it("includes rejected validation details when available", () => {
expect(
formatConfigRecoveryNotice({
phase: "startup",
reason: "startup-invalid-config",
configPath: "/home/test/.openclaw/openclaw.json",
issues: [
{ path: "agents.defaults.execution", message: "Unrecognized key: execution" },
{ path: "gateway.auth.password.source", message: "Required" },
],
}),
).toContain(
"Rejected validation details: agents.defaults.execution: Unrecognized key: execution; gateway.auth.password.source: Required.",
);
});
it("queues the notice for the main agent session", () => {
expect(
enqueueConfigRecoveryNotice({
cfg: {},
phase: "reload",
reason: "reload-invalid-config",
configPath: "/home/test/.openclaw/openclaw.json",
issues: [{ path: "gateway.mode", message: "Expected string" }],
}),
).toBe(true);
expect(peekSystemEvents("agent:main:main")).toHaveLength(1);
const notice = drainSystemEvents("agent:main:main")[0];
expect(notice).toContain("gateway.mode: Expected string");
expect(notice).toContain(
"Do not write openclaw.json again unless you validate the full config first.",
);
});
});


@@ -1,44 +0,0 @@
import path from "node:path";
import { formatConfigIssueSummary, type ConfigIssueLineInput } from "../config/issue-format.js";
import { resolveMainSessionKey } from "../config/sessions/main-session.js";
import type { OpenClawConfig } from "../config/types.openclaw.js";
import { enqueueSystemEvent } from "../infra/system-events.js";
export type ConfigRecoveryNoticePhase = "startup" | "reload";
export function formatConfigRecoveryIssueSentence(
issues: ReadonlyArray<ConfigIssueLineInput> | undefined,
): string | null {
const summary = formatConfigIssueSummary(issues ?? []);
return summary ? `Rejected validation details: ${summary}.` : null;
}
export function formatConfigRecoveryNotice(params: {
phase: ConfigRecoveryNoticePhase;
reason: string;
configPath: string;
issues?: ReadonlyArray<ConfigIssueLineInput>;
}): string {
const configName = path.basename(params.configPath) || "openclaw.json";
return [
`Config recovery warning: OpenClaw restored ${configName} from the last-known-good backup during ${params.phase} (${params.reason}).`,
"The rejected config was invalid and was preserved as a timestamped .clobbered.* file.",
formatConfigRecoveryIssueSentence(params.issues),
`Do not write ${configName} again unless you validate the full config first.`,
]
.filter((line): line is string => Boolean(line))
.join(" ");
}
export function enqueueConfigRecoveryNotice(params: {
cfg: OpenClawConfig;
phase: ConfigRecoveryNoticePhase;
reason: string;
configPath: string;
issues?: ReadonlyArray<ConfigIssueLineInput>;
}): boolean {
return enqueueSystemEvent(formatConfigRecoveryNotice(params), {
sessionKey: resolveMainSessionKey(params.cfg),
contextKey: `config-recovery:${params.phase}:${params.reason}`,
});
}


@@ -578,15 +578,9 @@ function createReloaderHarness(
options: {
initialCompareConfig?: OpenClawConfig;
initialInternalWriteHash?: string | null;
recoverSnapshot?: (snapshot: ConfigFileSnapshot, reason: string) => Promise<boolean>;
promoteSnapshot?: (snapshot: ConfigFileSnapshot, reason: string) => Promise<boolean>;
initialPluginInstallRecords?: Record<string, PluginInstallRecord>;
readPluginInstallRecords?: () => Promise<Record<string, PluginInstallRecord>>;
onRecovered?: (params: {
reason: string;
snapshot: ConfigFileSnapshot;
recoveredSnapshot: ConfigFileSnapshot;
}) => void | Promise<void>;
} = {},
) {
const watcher = createWatcherMock();
@@ -612,11 +606,9 @@ function createReloaderHarness(
initialCompareConfig: options.initialCompareConfig,
initialInternalWriteHash: options.initialInternalWriteHash,
readSnapshot,
recoverSnapshot: options.recoverSnapshot,
promoteSnapshot: options.promoteSnapshot,
initialPluginInstallRecords: options.initialPluginInstallRecords ?? {},
readPluginInstallRecords: options.readPluginInstallRecords ?? (async () => ({})),
onRecovered: options.onRecovered,
subscribeToWrites,
onHotReload,
onRestart,
@@ -740,64 +732,35 @@ describe("startGatewayConfigReloader", () => {
}
});
it("restores last-known-good on invalid external config edits and reloads recovered snapshot", async () => {
const readSnapshot = vi
.fn<() => Promise<ConfigFileSnapshot>>()
.mockResolvedValueOnce(
makeSnapshot({
valid: false,
raw: "{ gateway: { mode: 123 } }",
issues: [{ path: "gateway.mode", message: "Expected string" }],
hash: "bad-1",
}),
)
.mockResolvedValueOnce(
makeSnapshot({
config: {
gateway: { reload: { debounceMs: 0 } },
hooks: { enabled: true },
},
hash: "last-good-1",
}),
);
const recoverSnapshot = vi.fn(async () => true);
it("skips invalid external config edits without recovery", async () => {
const readSnapshot = vi.fn<() => Promise<ConfigFileSnapshot>>().mockResolvedValueOnce(
makeSnapshot({
valid: false,
raw: "{ gateway: { mode: 123 } }",
issues: [{ path: "gateway.mode", message: "Expected string" }],
hash: "bad-1",
}),
);
const promoteSnapshot = vi.fn(async () => true);
const onRecovered = vi.fn();
const { watcher, onHotReload, onRestart, log, reloader } = createReloaderHarness(readSnapshot, {
recoverSnapshot,
promoteSnapshot,
onRecovered,
});
watcher.emit("change");
await vi.runAllTimersAsync();
expect(recoverSnapshot).toHaveBeenCalledWith(
expect.objectContaining({ valid: false }),
"invalid-config",
);
expect(readSnapshot).toHaveBeenCalledTimes(2);
expect(onRecovered).toHaveBeenCalledWith(
expect.objectContaining({
reason: "invalid-config",
snapshot: expect.objectContaining({ valid: false }),
recoveredSnapshot: expect.objectContaining({ hash: "last-good-1" }),
}),
);
expect(onHotReload).toHaveBeenCalledTimes(1);
expect(readSnapshot).toHaveBeenCalledTimes(1);
expect(onHotReload).not.toHaveBeenCalled();
expect(onRestart).not.toHaveBeenCalled();
expect(promoteSnapshot).toHaveBeenCalledWith(
expect.objectContaining({ hash: "last-good-1" }),
"valid-config",
);
expect(promoteSnapshot).not.toHaveBeenCalled();
expect(log.warn).toHaveBeenCalledWith(
"config reload restored last-known-good config after invalid-config; Rejected validation details: gateway.mode: Expected string.",
"config reload skipped (invalid config): gateway.mode: Expected string",
);
await reloader.stop();
});
it("queues restart in degraded mode for plugin-local invalid reloads", async () => {
it("skips plugin-local invalid reloads without degraded mode", async () => {
const activeConfig: OpenClawConfig = {
gateway: { reload: { debounceMs: 0 } },
agents: { defaults: { model: "gpt-5.4" } },
@@ -828,7 +791,6 @@ describe("startGatewayConfigReloader", () => {
const readSnapshot = vi
.fn<() => Promise<ConfigFileSnapshot>>()
.mockResolvedValueOnce(invalidSnapshot);
const recoverSnapshot = vi.fn(async () => true);
const promoteSnapshot = vi.fn(async () => true);
const previousConfig: OpenClawConfig = {
...activeConfig,
@@ -843,44 +805,19 @@ describe("startGatewayConfigReloader", () => {
};
const { watcher, onHotReload, onRestart, log, reloader } = createReloaderHarness(readSnapshot, {
initialCompareConfig: previousConfig,
recoverSnapshot,
promoteSnapshot,
});
watcher.emit("change");
await vi.runAllTimersAsync();
expect(recoverSnapshot).not.toHaveBeenCalled();
expect(readSnapshot).toHaveBeenCalledTimes(1);
expect(onRestart).not.toHaveBeenCalled();
expect(onHotReload).toHaveBeenCalledTimes(1);
expect(onHotReload).toHaveBeenCalledWith(
expect.objectContaining({
changedPaths: ["plugins.entries.lossless-claw.config.cacheAwareCompaction"],
restartGateway: false,
reloadPlugins: true,
hotReasons: ["plugins.entries.lossless-claw.config.cacheAwareCompaction"],
}),
expect.objectContaining({
plugins: expect.objectContaining({
entries: expect.objectContaining({
"lossless-claw": expect.objectContaining({
enabled: true,
config: expect.objectContaining({
cacheAwareCompaction: true,
}),
}),
}),
}),
}),
);
expect(onHotReload).not.toHaveBeenCalled();
expect(promoteSnapshot).not.toHaveBeenCalled();
expect(log.warn).toHaveBeenCalledWith(
"config reload recovery skipped after invalid-config: invalidity is scoped to plugin entries",
);
expect(log.warn).toHaveBeenCalledWith(
expect.stringContaining(
"config reload skipped plugin config validation issue at plugins.entries.lossless-claw.config.cacheAwareCompaction:",
"config reload skipped (invalid config): plugins.entries.lossless-claw.config.cacheAwareCompaction:",
),
);


@@ -1,16 +1,10 @@
import chokidar from "chokidar";
import { bumpSkillsSnapshotVersion } from "../agents/skills/refresh-state.js";
import type { ConfigWriteNotification } from "../config/io.js";
import { formatConfigIssueLines, formatConfigIssueSummary } from "../config/issue-format.js";
import { materializeRuntimeConfig } from "../config/materialize.js";
import {
isPluginLocalInvalidConfigSnapshot,
shouldAttemptLastKnownGoodRecovery,
} from "../config/recovery-policy.js";
import { formatConfigIssueLines } from "../config/issue-format.js";
import { resolveConfigWriteFollowUp } from "../config/runtime-snapshot.js";
import type { ConfigFileSnapshot, OpenClawConfig } from "../config/types.openclaw.js";
import type { PluginInstallRecord } from "../config/types.plugins.js";
import { validateConfigObjectWithPlugins } from "../config/validation.js";
import {
loadInstalledPluginIndexInstallRecords,
loadInstalledPluginIndexInstallRecordsSync,
@@ -73,39 +67,6 @@ function isNoopReloadPlan(plan: GatewayReloadPlan): boolean {
);
}
function resolvePluginLocalInvalidReloadSnapshot(params: {
snapshot: ConfigFileSnapshot;
log: {
warn: (msg: string) => void;
};
}): ConfigFileSnapshot | null {
if (!isPluginLocalInvalidConfigSnapshot(params.snapshot)) {
return null;
}
const validated = validateConfigObjectWithPlugins(params.snapshot.sourceConfig, {
pluginValidation: "skip",
});
if (!validated.ok) {
return null;
}
const runtimeConfig = materializeRuntimeConfig(validated.config, "load");
for (const issue of params.snapshot.issues) {
params.log.warn(
`config reload skipped plugin config validation issue at ${issue.path}: ${issue.message}. Run "openclaw doctor --fix" to quarantine the plugin config.`,
);
}
return {
...params.snapshot,
sourceConfig: params.snapshot.sourceConfig,
resolved: params.snapshot.resolved,
valid: true,
runtimeConfig,
config: runtimeConfig,
issues: [],
warnings: [...params.snapshot.warnings, ...params.snapshot.issues, ...validated.warnings],
};
}
type GatewayConfigReloader = {
stop: () => Promise<void>;
};
@@ -127,15 +88,9 @@ export function startGatewayConfigReloader(opts: {
readSnapshot: () => Promise<ConfigFileSnapshot>;
onHotReload: (plan: GatewayReloadPlan, nextConfig: OpenClawConfig) => Promise<void>;
onRestart: (plan: GatewayReloadPlan, nextConfig: OpenClawConfig) => void | Promise<void>;
recoverSnapshot?: (snapshot: ConfigFileSnapshot, reason: string) => Promise<boolean>;
promoteSnapshot?: (snapshot: ConfigFileSnapshot, reason: string) => Promise<boolean>;
initialPluginInstallRecords?: PluginInstallRecords;
readPluginInstallRecords?: () => Promise<PluginInstallRecords>;
onRecovered?: (params: {
reason: string;
snapshot: ConfigFileSnapshot;
recoveredSnapshot: ConfigFileSnapshot;
}) => void | Promise<void>;
subscribeToWrites?: (listener: (event: ConfigWriteNotification) => void) => () => void;
log: {
info: (msg: string) => void;
@@ -222,41 +177,6 @@ export function startGatewayConfigReloader(opts: {
return true;
};
const recoverAndReadSnapshot = async (
snapshot: ConfigFileSnapshot,
reason: string,
): Promise<ConfigFileSnapshot | null> => {
if (!opts.recoverSnapshot) {
return null;
}
if (!shouldAttemptLastKnownGoodRecovery(snapshot)) {
opts.log.warn(
`config reload recovery skipped after ${reason}: invalidity is scoped to plugin entries`,
);
return null;
}
const recovered = await opts.recoverSnapshot(snapshot, reason);
if (!recovered) {
return null;
}
const issueSummary = formatConfigIssueSummary([...snapshot.issues, ...snapshot.legacyIssues]);
opts.log.warn(
`config reload restored last-known-good config after ${reason}${issueSummary ? `; Rejected validation details: ${issueSummary}.` : ""}`,
);
const nextSnapshot = await opts.readSnapshot();
if (!nextSnapshot.valid) {
const issues = formatConfigIssueLines(nextSnapshot.issues, "").join(", ");
opts.log.warn(`config reload recovery snapshot is invalid: ${issues}`);
return null;
}
try {
await opts.onRecovered?.({ reason, snapshot, recoveredSnapshot: nextSnapshot });
} catch (err) {
opts.log.warn(`config reload recovery notice failed: ${String(err)}`);
}
return nextSnapshot;
};
const applySnapshot = async (
nextConfig: OpenClawConfig,
nextCompareConfig: OpenClawConfig,
@@ -428,28 +348,12 @@ export function startGatewayConfigReloader(opts: {
if (handleMissingSnapshot(snapshot)) {
return;
}
let degradedPluginSnapshot = false;
if (!snapshot.valid) {
const recoveredSnapshot = await recoverAndReadSnapshot(snapshot, "invalid-config");
if (!recoveredSnapshot) {
const pluginLocalSnapshot = resolvePluginLocalInvalidReloadSnapshot({
snapshot,
log: opts.log,
});
if (!pluginLocalSnapshot) {
handleInvalidSnapshot(snapshot);
return;
}
snapshot = pluginLocalSnapshot;
degradedPluginSnapshot = true;
} else {
snapshot = recoveredSnapshot;
}
handleInvalidSnapshot(snapshot);
return;
}
await applySnapshot(snapshot.config, snapshot.sourceConfig);
if (!degradedPluginSnapshot) {
await promoteAcceptedSnapshot(snapshot, "valid-config");
}
await promoteAcceptedSnapshot(snapshot, "valid-config");
} catch (err) {
opts.log.error(`config reload failed: ${String(err)}`);
} finally {
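
The reload path in this hunk now fails closed: an invalid snapshot is logged and skipped, and only a valid snapshot is applied and promoted. A minimal sketch of that control flow follows, under stated assumptions. `SketchSnapshot`, `SketchDeps`, and `reloadFailClosed` are hypothetical names invented for illustration, the function is synchronous for brevity (the reloader in the diff is async), and the warning text mirrors the string asserted in the reloader tests rather than the project's actual formatter.

```typescript
// Hypothetical, simplified shapes; the real ConfigFileSnapshot carries more fields.
type SketchIssue = { path: string; message: string };
type SketchSnapshot = {
  valid: boolean;
  hash: string;
  config: Record<string, unknown>;
  issues: SketchIssue[];
};

// Hypothetical dependency bag standing in for the reloader's callbacks.
type SketchDeps = {
  applyConfig: (config: Record<string, unknown>) => void;
  promoteLastKnownGood: (snapshot: SketchSnapshot) => void;
  warn: (msg: string) => void;
};

// Fail closed: an invalid snapshot is reported and skipped. Nothing is applied,
// nothing is promoted, and no recovery or rewrite of the config file is attempted.
function reloadFailClosed(
  snapshot: SketchSnapshot,
  deps: SketchDeps,
): "applied" | "skipped" {
  if (!snapshot.valid) {
    const details = snapshot.issues
      .map((issue) => `${issue.path}: ${issue.message}`)
      .join(", ");
    deps.warn(`config reload skipped (invalid config): ${details}`);
    return "skipped";
  }
  // Only a valid snapshot reaches the apply and promote steps.
  deps.applyConfig(snapshot.config);
  deps.promoteLastKnownGood(snapshot);
  return "applied";
}
```

In this sketch the previously running config simply stays active after a skip; repair is left to an explicit operator action such as `openclaw doctor --fix`, as the commit message and changelog entry describe.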


@@ -1,13 +1,7 @@
import { afterEach, describe, expect, it, vi } from "vitest";
import {
__testing as embeddedRunTesting,
clearActiveEmbeddedRun,
setActiveEmbeddedRun,
type EmbeddedPiQueueHandle,
} from "../agents/pi-embedded-runner/runs.js";
import type { ChannelKind } from "./config-reload-plan.js";
import type { GatewayPluginReloadResult } from "./server-reload-handlers.js";
import { __testing, createGatewayReloadHandlers } from "./server-reload-handlers.js";
import { createGatewayReloadHandlers } from "./server-reload-handlers.js";
const hoisted = vi.hoisted(() => ({
activeTaskCount: { value: 0 },
@@ -90,49 +84,6 @@ afterEach(() => {
hoisted.activeTaskBlockers.length = 0;
});
describe("gateway reload recovery handlers", () => {
afterEach(() => {
embeddedRunTesting.resetActiveEmbeddedRuns();
});
it("aborts active agent runs after last-known-good config recovery", () => {
const sessionId = "config-recovery-session";
const sessionKey = "agent:main:telegram:direct:123";
let handle!: EmbeddedPiQueueHandle;
handle = {
abort: vi.fn(() => {
clearActiveEmbeddedRun(sessionId, handle, sessionKey);
}),
isCompacting: () => false,
isStreaming: () => false,
queueMessage: async () => {},
};
const logReload = { info: vi.fn(), warn: vi.fn() };
setActiveEmbeddedRun(sessionId, handle, sessionKey);
__testing.abortActiveAgentRunsAfterConfigRecovery({
reason: "invalid-config",
logReload,
});
expect(handle.abort).toHaveBeenCalledOnce();
expect(logReload.warn).toHaveBeenCalledWith(
"config recovery aborted active agent run(s) after reload-invalid-config",
);
});
it("does not warn when config recovery has no active agent runs to abort", () => {
const logReload = { info: vi.fn(), warn: vi.fn() };
__testing.abortActiveAgentRunsAfterConfigRecovery({
reason: "invalid-config",
logReload,
});
expect(logReload.warn).not.toHaveBeenCalled();
});
});
describe("gateway restart deferral preflight", () => {
it("logs active task run ids before waiting and when forcing after timeout", async () => {
const restartTesting = (await import("../infra/restart.js")).__testing;


@@ -1,7 +1,6 @@
import { resetModelCatalogCache } from "../agents/model-catalog.js";
import { disposeAllSessionMcpRuntimes } from "../agents/pi-bundle-mcp-tools.js";
import { getActiveEmbeddedRunCount } from "../agents/pi-embedded-runner/run-state.js";
import { abortEmbeddedPiRun } from "../agents/pi-embedded-runner/runs.js";
import { getTotalPendingReplies } from "../auto-reply/reply/dispatcher-registry.js";
import type { CliDeps } from "../cli/deps.types.js";
import { isRestartEnabled } from "../config/commands.flags.js";
@@ -27,7 +26,6 @@ import {
type ActiveTaskRestartBlocker,
} from "../tasks/task-registry.maintenance.js";
import type { ChannelHealthMonitor } from "./channel-health-monitor.js";
import { enqueueConfigRecoveryNotice } from "./config-recovery-notice.js";
import type { ChannelKind } from "./config-reload-plan.js";
import { startGatewayConfigReloader, type GatewayReloadPlan } from "./config-reload.js";
import { resolveHooksConfig } from "./hooks.js";
@@ -71,23 +69,6 @@ const MCP_RUNTIME_RELOAD_DISPOSE_TIMEOUT_MS = 5_000;
const CHANNEL_RELOAD_DEFERRAL_POLL_MS = 500;
const CHANNEL_RELOAD_STILL_PENDING_WARN_MS = 30_000;
function abortActiveAgentRunsAfterConfigRecovery(params: {
reason: string;
logReload: GatewayReloadLog;
}) {
const aborted = abortEmbeddedPiRun(undefined, { mode: "all" });
if (!aborted) {
return;
}
params.logReload.warn(
`config recovery aborted active agent run(s) after reload-${params.reason}`,
);
}
export const __testing = {
abortActiveAgentRunsAfterConfigRecovery,
};
async function disposeMcpRuntimesWithTimeout(params: {
dispose: () => Promise<void>;
timeoutMs: number;
@@ -144,7 +125,6 @@ type ManagedGatewayConfigReloaderParams = Omit<
initialInternalWriteHash: string | null;
watchPath: string;
readSnapshot: typeof import("../config/config.js").readConfigFileSnapshot;
recoverSnapshot: typeof import("../config/config.js").recoverConfigFromLastKnownGood;
promoteSnapshot: typeof import("../config/config.js").promoteConfigSnapshotToLastKnownGood;
subscribeToWrites: typeof import("../config/config.js").registerConfigWriteListener;
logReload: GatewayReloadLog & {
@@ -521,19 +501,7 @@ export function startManagedGatewayConfigReloader(params: ManagedGatewayConfigRe
initialCompareConfig: params.initialCompareConfig,
initialInternalWriteHash: params.initialInternalWriteHash,
readSnapshot: params.readSnapshot,
recoverSnapshot: async (snapshot, reason) =>
await params.recoverSnapshot({ snapshot, reason: `reload-${reason}` }),
promoteSnapshot: async (snapshot, _reason) => await params.promoteSnapshot(snapshot),
onRecovered: ({ reason, snapshot, recoveredSnapshot }) => {
abortActiveAgentRunsAfterConfigRecovery({ reason, logReload: params.logReload });
enqueueConfigRecoveryNotice({
cfg: recoveredSnapshot.config,
phase: "reload",
reason: `reload-${reason}`,
configPath: snapshot.path,
issues: [...snapshot.issues, ...snapshot.legacyIssues],
});
},
subscribeToWrites: params.subscribeToWrites,
onHotReload: async (plan, nextConfig) => {
const previousSharedGatewaySessionGeneration =


@@ -57,8 +57,6 @@ const pluginMetadataSnapshot = vi.hoisted(
vi.mock("../config/io.js", () => ({
readConfigFileSnapshot: vi.fn(),
readConfigFileSnapshotWithPluginMetadata: vi.fn(),
recoverConfigFromLastKnownGood: vi.fn(),
recoverConfigFromJsonRootSuffix: vi.fn(),
writeConfigFile: vi.fn(),
}));
@@ -73,49 +71,17 @@ vi.mock("../config/runtime-overrides.js", () => ({
applyConfigOverrides: vi.fn((config: OpenClawConfig) => config),
}));
vi.mock("../config/recovery-policy.js", () => ({
isPluginLocalInvalidConfigSnapshot: vi.fn((snapshot: ConfigFileSnapshot) => {
if (snapshot.valid || snapshot.legacyIssues.length > 0 || snapshot.issues.length === 0) {
return false;
}
return snapshot.issues.every((issue) => issue.path.startsWith("plugins.entries."));
}),
shouldAttemptLastKnownGoodRecovery: vi.fn((snapshot: ConfigFileSnapshot) => {
if (snapshot.valid) {
return false;
}
return !(
snapshot.legacyIssues.length === 0 &&
snapshot.issues.length > 0 &&
snapshot.issues.every((issue) => issue.path.startsWith("plugins.entries."))
);
}),
}));
vi.mock("../config/mutate.js", () => ({
replaceConfigFile: vi.fn(),
}));
vi.mock("../config/validation.js", () => ({
validateConfigObjectWithPlugins: vi.fn((config: OpenClawConfig) => ({
ok: true,
config,
warnings: [],
})),
}));
vi.mock("../config/plugin-auto-enable.js", () => ({
applyPluginAutoEnable: (params: { config: OpenClawConfig }) => applyPluginAutoEnable(params),
}));
vi.mock("./config-recovery-notice.js", () => ({
enqueueConfigRecoveryNotice: vi.fn(),
}));
let loadGatewayStartupConfigSnapshot: typeof import("./server-startup-config.js").loadGatewayStartupConfigSnapshot;
let configIo: typeof import("../config/io.js");
let configMutate: typeof import("../config/mutate.js");
let recoveryNotice: typeof import("./config-recovery-notice.js");
const configPath = "/tmp/openclaw-startup-recovery.json";
const validConfig = {
@@ -171,14 +137,10 @@ function installConfigIoMockDefaults() {
const readSnapshotWithPluginMetadata = vi.mocked(
configIo.readConfigFileSnapshotWithPluginMetadata,
);
const recoverLastKnownGood = vi.mocked(configIo.recoverConfigFromLastKnownGood);
const recoverJsonRootSuffix = vi.mocked(configIo.recoverConfigFromJsonRootSuffix);
const writeConfig = vi.mocked(configIo.writeConfigFile);
readSnapshot.mockReset();
readSnapshotWithPluginMetadata.mockReset();
recoverLastKnownGood.mockReset();
recoverJsonRootSuffix.mockReset();
writeConfig.mockReset();
const defaultSnapshot = buildDefaultSnapshot();
@@ -193,17 +155,14 @@ function installConfigIoMockDefaults() {
}
return snapshot.valid ? { snapshot, pluginMetadataSnapshot } : { snapshot };
});
recoverLastKnownGood.mockResolvedValue(false);
recoverJsonRootSuffix.mockResolvedValue(false);
writeConfig.mockResolvedValue(undefined);
}
describe("gateway startup config recovery", () => {
describe("gateway startup config validation", () => {
beforeAll(async () => {
({ loadGatewayStartupConfigSnapshot } = await import("./server-startup-config.js"));
configIo = await import("../config/io.js");
configMutate = await import("../config/mutate.js");
recoveryNotice = await import("./config-recovery-notice.js");
});
beforeEach(() => {
@@ -435,51 +394,9 @@ describe("gateway startup config recovery", () => {
});
});
it("restores last-known-good config before startup validation", async () => {
const invalidSnapshot = buildSnapshot({ valid: false, raw: "{ invalid json" });
const recoveredSnapshot = buildSnapshot({
valid: true,
raw: `${JSON.stringify(validConfig)}\n`,
config: validConfig,
});
vi.mocked(configIo.readConfigFileSnapshot)
.mockResolvedValueOnce(invalidSnapshot)
.mockResolvedValueOnce(recoveredSnapshot);
vi.mocked(configIo.recoverConfigFromLastKnownGood).mockResolvedValueOnce(true);
const log = { info: vi.fn(), warn: vi.fn() };
await expect(
loadGatewayStartupConfigSnapshot({
minimalTestGateway: true,
log,
}),
).resolves.toEqual({
snapshot: recoveredSnapshot,
wroteConfig: true,
pluginMetadataSnapshot,
});
expect(configIo.recoverConfigFromLastKnownGood).toHaveBeenCalledWith({
snapshot: invalidSnapshot,
reason: "startup-invalid-config",
});
expect(log.warn).toHaveBeenCalledWith(
`gateway: invalid config was restored from last-known-good backup: ${configPath}; Rejected validation details: gateway.mode: Expected 'local' or 'remote'.`,
);
expect(recoveryNotice.enqueueConfigRecoveryNotice).toHaveBeenCalledWith({
cfg: recoveredSnapshot.config,
phase: "startup",
reason: "startup-invalid-config",
configPath,
issues: [{ path: "gateway.mode", message: "Expected 'local' or 'remote'" }],
});
});
it("keeps startup validation loud when last-known-good recovery is unavailable", async () => {
it("rejects invalid config before startup without automatic recovery", async () => {
const invalidSnapshot = buildSnapshot({ valid: false, raw: "{ invalid json" });
vi.mocked(configIo.readConfigFileSnapshot).mockResolvedValueOnce(invalidSnapshot);
vi.mocked(configIo.recoverConfigFromLastKnownGood).mockResolvedValueOnce(false);
vi.mocked(configIo.recoverConfigFromJsonRootSuffix).mockResolvedValueOnce(false);
await expect(
loadGatewayStartupConfigSnapshot({
@@ -489,11 +406,9 @@ describe("gateway startup config recovery", () => {
).rejects.toThrow(
`Invalid config at ${configPath}.\ngateway.mode: Expected 'local' or 'remote'\nRun "openclaw doctor --fix" to repair, then retry.`,
);
expect(recoveryNotice.enqueueConfigRecoveryNotice).not.toHaveBeenCalled();
});
it("rejects legacy config entries in Nix mode before recovery", async () => {
it("rejects legacy config entries in Nix mode", async () => {
const legacySnapshot = buildTestConfigSnapshot({
path: configPath,
exists: true,
@@ -534,13 +449,9 @@ describe("gateway startup config recovery", () => {
).rejects.toThrow(
"Legacy config entries detected while running in Nix mode. Update your Nix config to the latest schema and restart.",
);
expect(configIo.recoverConfigFromLastKnownGood).not.toHaveBeenCalled();
expect(configIo.recoverConfigFromJsonRootSuffix).not.toHaveBeenCalled();
expect(recoveryNotice.enqueueConfigRecoveryNotice).not.toHaveBeenCalled();
});
it("continues startup in degraded mode for plugin-local startup invalidity", async () => {
it("rejects plugin-local startup invalidity without degraded startup", async () => {
const invalidSnapshot = buildTestConfigSnapshot({
path: configPath,
exists: true,
@@ -579,29 +490,12 @@ describe("gateway startup config recovery", () => {
legacyIssues: [],
});
vi.mocked(configIo.readConfigFileSnapshot).mockResolvedValueOnce(invalidSnapshot);
const log = { info: vi.fn(), warn: vi.fn() };
await expect(
loadGatewayStartupConfigSnapshot({
minimalTestGateway: true,
log,
log: { info: vi.fn(), warn: vi.fn() },
}),
).resolves.toEqual({
snapshot: expect.objectContaining({
valid: true,
issues: [],
warnings: invalidSnapshot.issues,
}),
wroteConfig: false,
degradedPluginConfig: true,
});
expect(configIo.recoverConfigFromLastKnownGood).not.toHaveBeenCalled();
expect(configIo.recoverConfigFromJsonRootSuffix).not.toHaveBeenCalled();
expect(log.warn).toHaveBeenCalledWith(
`gateway: skipped plugin config validation issue at plugins.entries.feishu: plugin feishu: plugin requires OpenClaw >=2026.4.23, but this host is 2026.4.22; skipping load. Run "openclaw doctor --fix" to quarantine the plugin config.`,
);
expect(recoveryNotice.enqueueConfigRecoveryNotice).not.toHaveBeenCalled();
).rejects.toThrow(`Invalid config at ${configPath}.`);
});
it("keeps mixed plugin and core startup invalidity fatal", async () => {
@@ -646,8 +540,6 @@ describe("gateway startup config recovery", () => {
legacyIssues: [],
});
vi.mocked(configIo.readConfigFileSnapshot).mockResolvedValueOnce(invalidSnapshot);
vi.mocked(configIo.recoverConfigFromLastKnownGood).mockResolvedValueOnce(false);
vi.mocked(configIo.recoverConfigFromJsonRootSuffix).mockResolvedValueOnce(false);
await expect(
loadGatewayStartupConfigSnapshot({
@@ -655,14 +547,9 @@ describe("gateway startup config recovery", () => {
log: { info: vi.fn(), warn: vi.fn() },
}),
).rejects.toThrow(`Invalid config at ${configPath}.`);
expect(configIo.recoverConfigFromLastKnownGood).toHaveBeenCalledWith({
snapshot: invalidSnapshot,
reason: "startup-invalid-config",
});
});
it("skips providers with stale model api enum values during startup", async () => {
it("rejects stale model provider api enum values during startup", async () => {
const config = {
gateway: { mode: "local" },
models: {
@@ -713,58 +600,28 @@ describe("gateway startup config recovery", () => {
legacyIssues: [],
});
vi.mocked(configIo.readConfigFileSnapshot).mockResolvedValueOnce(invalidSnapshot);
const log = { info: vi.fn(), warn: vi.fn() };
await expect(
loadGatewayStartupConfigSnapshot({
minimalTestGateway: false,
log: { info: vi.fn(), warn: vi.fn() },
}),
).rejects.toThrow(`Invalid config at ${configPath}.`);
const result = await loadGatewayStartupConfigSnapshot({
minimalTestGateway: false,
log,
});
expect(result.wroteConfig).toBe(false);
expect(result.degradedProviderApi).toBe(true);
expect(result.snapshot.valid).toBe(true);
expect(result.snapshot.sourceConfig.models?.providers?.openrouter).toBeUndefined();
expect(result.snapshot.sourceConfig.models?.providers?.anthropic).toEqual(
config.models?.providers?.anthropic,
);
expect(configIo.recoverConfigFromLastKnownGood).not.toHaveBeenCalled();
expect(configMutate.replaceConfigFile).not.toHaveBeenCalled();
expect(log.warn).toHaveBeenCalledWith(
'gateway: skipped model provider openrouter; configured provider api is invalid. Run "openclaw doctor --fix" to repair the config.',
);
});
it("strips a valid JSON suffix when last-known-good recovery is unavailable", async () => {
it("rejects prefixed JSON without startup suffix repair", async () => {
const invalidSnapshot = buildSnapshot({
valid: false,
raw: `Found and updated: False\n${JSON.stringify(validConfig)}\n`,
});
const repairedSnapshot = buildSnapshot({
valid: true,
raw: `${JSON.stringify(validConfig)}\n`,
config: validConfig,
});
vi.mocked(configIo.readConfigFileSnapshot)
.mockResolvedValueOnce(invalidSnapshot)
.mockResolvedValueOnce(repairedSnapshot);
vi.mocked(configIo.recoverConfigFromLastKnownGood).mockResolvedValueOnce(false);
vi.mocked(configIo.recoverConfigFromJsonRootSuffix).mockResolvedValueOnce(true);
const log = { info: vi.fn(), warn: vi.fn() };
vi.mocked(configIo.readConfigFileSnapshot).mockResolvedValueOnce(invalidSnapshot);
await expect(
loadGatewayStartupConfigSnapshot({
minimalTestGateway: true,
log,
log: { info: vi.fn(), warn: vi.fn() },
}),
).resolves.toEqual({
snapshot: repairedSnapshot,
wroteConfig: true,
pluginMetadataSnapshot,
});
expect(configIo.recoverConfigFromJsonRootSuffix).toHaveBeenCalledWith(invalidSnapshot);
expect(log.warn).toHaveBeenCalledWith(
`gateway: invalid config was repaired by stripping a non-JSON prefix: ${configPath}`,
);
).rejects.toThrow(`Invalid config at ${configPath}.`);
});
});
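
The tests above now expect startup to reject any invalid snapshot instead of attempting last-known-good or prefix-stripping repair. A minimal sketch of that fail-closed shape follows. The `SnapshotSketch` and `LoadResultSketch` types, the `loadStartupSnapshotSketch` function, and the exact error wording after the path are hypothetical simplifications for illustration, not the project's actual `loadGatewayStartupConfigSnapshot` implementation.

```typescript
// Hypothetical, trimmed-down stand-in for ConfigFileSnapshot (assumption).
type SnapshotSketch = {
  path: string;
  exists: boolean;
  valid: boolean;
  issues: { path: string; message: string }[];
};

// Hypothetical result shape; only the fields needed for the illustration.
type LoadResultSketch = { snapshot: SnapshotSketch; wroteConfig: boolean };

// Fail closed: an invalid snapshot is never repaired or rewritten here.
// The caller gets an error that points at the doctor command instead.
function loadStartupSnapshotSketch(snapshot: SnapshotSketch): LoadResultSketch {
  if (snapshot.exists && !snapshot.valid) {
    const details = snapshot.issues
      .map((issue) => `${issue.path}: ${issue.message}`)
      .join("; ");
    // Message format after the path is an assumption made for this sketch.
    throw new Error(
      `Invalid config at ${snapshot.path}.` +
        (details ? ` ${details}.` : "") +
        ' Run "openclaw doctor --fix" to repair the config.',
    );
  }
  // A valid (or absent) config passes through untouched; nothing is written.
  return { snapshot, wroteConfig: false };
}
```

In this sketch the only place a config file could be modified is outside the loader, which mirrors the intent of moving recovery into `openclaw doctor --fix`.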


@@ -1,36 +1,21 @@
import { loadAuthProfileStoreWithoutExternalProfiles } from "../agents/auth-profiles.js";
import { formatCliCommand } from "../cli/command-format.js";
import {
type ReadConfigFileSnapshotWithPluginMetadataResult,
readConfigFileSnapshotWithPluginMetadata,
recoverConfigFromLastKnownGood,
recoverConfigFromJsonRootSuffix,
} from "../config/io.js";
import { formatConfigIssueLines, formatConfigIssueSummary } from "../config/issue-format.js";
import { asResolvedSourceConfig, materializeRuntimeConfig } from "../config/materialize.js";
import { replaceConfigFile } from "../config/mutate.js";
import { formatConfigIssueLines } from "../config/issue-format.js";
import { isNixMode } from "../config/paths.js";
import { applyPluginAutoEnable } from "../config/plugin-auto-enable.js";
import {
isPluginLocalInvalidConfigSnapshot,
shouldAttemptLastKnownGoodRecovery,
} from "../config/recovery-policy.js";
import { applyConfigOverrides } from "../config/runtime-overrides.js";
import type { GatewayAuthConfig, GatewayTailscaleConfig } from "../config/types.gateway.js";
import type { ConfigFileSnapshot, OpenClawConfig } from "../config/types.openclaw.js";
import { validateConfigObjectWithPlugins } from "../config/validation.js";
import { isTruthyEnvValue } from "../infra/env.js";
import type { PluginMetadataSnapshot } from "../plugins/plugin-metadata-snapshot.js";
import {
GATEWAY_AUTH_SURFACE_PATHS,
evaluateGatewayAuthSurfaceStates,
} from "../secrets/runtime-gateway-auth-surfaces.js";
import {
activateSecretsRuntimeSnapshot,
prepareSecretsRuntimeSnapshot,
} from "../secrets/runtime.js";
import { resolveGatewayAuth } from "./auth.js";
import { enqueueConfigRecoveryNotice } from "./config-recovery-notice.js";
import { assertGatewayAuthNotKnownWeak } from "./known-weak-gateway-secrets.js";
import {
ensureGatewayStartupAuth,
@@ -49,10 +34,14 @@ type GatewaySecretsStateEventCode = "SECRETS_RELOADER_DEGRADED" | "SECRETS_RELOA
export type ActivateRuntimeSecrets = (
config: OpenClawConfig,
params: { reason: "startup" | "reload" | "restart-check"; activate: boolean },
) => Promise<Awaited<ReturnType<typeof prepareSecretsRuntimeSnapshot>>>;
) => Promise<
Awaited<ReturnType<typeof import("../secrets/runtime.js").prepareSecretsRuntimeSnapshot>>
>;
type PrepareRuntimeSecretsSnapshot = typeof prepareSecretsRuntimeSnapshot;
type ActivateRuntimeSecretsSnapshot = typeof activateSecretsRuntimeSnapshot;
type PrepareRuntimeSecretsSnapshot =
typeof import("../secrets/runtime.js").prepareSecretsRuntimeSnapshot;
type ActivateRuntimeSecretsSnapshot =
typeof import("../secrets/runtime.js").activateSecretsRuntimeSnapshot;
type GatewayStartupConfigOverrides = {
auth?: GatewayAuthConfig;
@@ -65,140 +54,8 @@ export type GatewayStartupConfigSnapshotLoadResult = {
snapshot: ConfigFileSnapshot;
wroteConfig: boolean;
pluginMetadataSnapshot?: PluginMetadataSnapshot;
degradedProviderApi?: boolean;
degradedPluginConfig?: boolean;
};
const MODEL_PROVIDER_API_PATH_RE = /^models\.providers\.([^.]+)\.api$/;
const MODEL_PROVIDER_MODEL_API_PATH_RE = /^models\.providers\.([^.]+)\.models\.\d+\.api$/;
function resolveInvalidModelProviderApiIssueProviderId(issue: {
path: string;
message: string;
}): string | null {
if (!issue.message.startsWith("Invalid option:")) {
return null;
}
const providerMatch =
issue.path.match(MODEL_PROVIDER_API_PATH_RE) ??
issue.path.match(MODEL_PROVIDER_MODEL_API_PATH_RE);
return providerMatch?.[1] ?? null;
}
function cloneConfigWithoutModelProviders(
config: OpenClawConfig,
providerIds: ReadonlySet<string>,
): OpenClawConfig {
const providers = config.models?.providers;
if (!providers) {
return config;
}
let changed = false;
const nextProviders = { ...providers };
for (const providerId of providerIds) {
if (!Object.hasOwn(nextProviders, providerId)) {
continue;
}
delete nextProviders[providerId];
changed = true;
}
if (!changed) {
return config;
}
return {
...config,
models: {
...config.models,
providers: nextProviders,
},
};
}
function resolveGatewayStartupConfigWithoutInvalidModelProviders(params: {
snapshot: ConfigFileSnapshot;
log: GatewayStartupLog;
}): ConfigFileSnapshot | null {
if (params.snapshot.valid || params.snapshot.legacyIssues.length > 0) {
return null;
}
const providerIds = new Set<string>();
for (const issue of params.snapshot.issues) {
const providerId = resolveInvalidModelProviderApiIssueProviderId(issue);
if (!providerId) {
return null;
}
providerIds.add(providerId);
}
if (providerIds.size === 0) {
return null;
}
const prunedSourceConfig = cloneConfigWithoutModelProviders(
params.snapshot.sourceConfig,
providerIds,
);
const validated = validateConfigObjectWithPlugins(prunedSourceConfig);
if (!validated.ok) {
return null;
}
const runtimeConfig = materializeRuntimeConfig(validated.config, "load");
for (const providerId of providerIds) {
params.log.warn(
`gateway: skipped model provider ${providerId}; configured provider api is invalid. Run "openclaw doctor --fix" to repair the config.`,
);
}
return {
...params.snapshot,
sourceConfig: asResolvedSourceConfig(validated.config),
resolved: asResolvedSourceConfig(validated.config),
valid: true,
runtimeConfig,
config: runtimeConfig,
issues: [],
warnings: validated.warnings,
};
}
function collectConfigSnapshotIssueDetails(snapshot: ConfigFileSnapshot) {
return [...snapshot.issues, ...snapshot.legacyIssues];
}
function formatConfigRecoveryLogIssueSuffix(snapshot: ConfigFileSnapshot): string {
const summary = formatConfigIssueSummary(collectConfigSnapshotIssueDetails(snapshot));
return summary ? `; Rejected validation details: ${summary}.` : "";
}
function resolveGatewayStartupConfigWithoutInvalidPluginEntries(params: {
snapshot: ConfigFileSnapshot;
log: GatewayStartupLog;
}): ConfigFileSnapshot | null {
if (!isPluginLocalInvalidConfigSnapshot(params.snapshot)) {
return null;
}
const validated = validateConfigObjectWithPlugins(params.snapshot.sourceConfig, {
pluginValidation: "skip",
});
if (!validated.ok) {
return null;
}
const runtimeConfig = materializeRuntimeConfig(validated.config, "load");
for (const issue of params.snapshot.issues) {
params.log.warn(
`gateway: skipped plugin config validation issue at ${issue.path}: ${issue.message}. Run "openclaw doctor --fix" to quarantine the plugin config.`,
);
}
return {
...params.snapshot,
sourceConfig: asResolvedSourceConfig(validated.config),
resolved: asResolvedSourceConfig(validated.config),
valid: true,
runtimeConfig,
config: runtimeConfig,
issues: [],
warnings: [...params.snapshot.warnings, ...params.snapshot.issues],
};
}
export async function loadGatewayStartupConfigSnapshot(params: {
minimalTestGateway: boolean;
log: GatewayStartupLog;
@@ -214,107 +71,36 @@ export async function loadGatewayStartupConfigSnapshot(params: {
let configSnapshot = snapshotRead.snapshot;
let pluginMetadataSnapshot = snapshotRead.pluginMetadataSnapshot;
let wroteConfig = false;
let degradedStartupConfig = false;
let degradedPluginConfig = false;
if (configSnapshot.legacyIssues.length > 0 && isNixMode) {
throw new Error(
"Legacy config entries detected while running in Nix mode. Update your Nix config to the latest schema and restart.",
);
}
if (configSnapshot.exists) {
if (!configSnapshot.valid) {
const providerApiPrunedSnapshot = resolveGatewayStartupConfigWithoutInvalidModelProviders({
snapshot: configSnapshot,
log: params.log,
});
if (providerApiPrunedSnapshot) {
degradedStartupConfig = true;
configSnapshot = providerApiPrunedSnapshot;
}
}
if (!configSnapshot.valid) {
const pluginConfigDegradedSnapshot = resolveGatewayStartupConfigWithoutInvalidPluginEntries({
snapshot: configSnapshot,
log: params.log,
});
if (pluginConfigDegradedSnapshot) {
degradedPluginConfig = true;
configSnapshot = pluginConfigDegradedSnapshot;
}
}
if (!configSnapshot.valid) {
const rejectedSnapshot = configSnapshot;
const rejectedConfigIssues = collectConfigSnapshotIssueDetails(rejectedSnapshot);
const canRecoverFromLastKnownGood = shouldAttemptLastKnownGoodRecovery(configSnapshot);
const recovered = canRecoverFromLastKnownGood
? await recoverConfigFromLastKnownGood({
snapshot: configSnapshot,
reason: "startup-invalid-config",
})
: false;
if (!canRecoverFromLastKnownGood) {
params.log.warn(
`gateway: last-known-good recovery skipped for plugin-local config invalidity: ${configSnapshot.path}`,
);
}
if (recovered) {
wroteConfig = true;
params.log.warn(
`gateway: invalid config was restored from last-known-good backup: ${rejectedSnapshot.path}${formatConfigRecoveryLogIssueSuffix(rejectedSnapshot)}`,
);
snapshotRead = await measure("config.snapshot.recovery-read", () =>
readConfigFileSnapshotWithPluginMetadata({ measure }),
);
configSnapshot = snapshotRead.snapshot;
pluginMetadataSnapshot = snapshotRead.pluginMetadataSnapshot;
if (configSnapshot.valid) {
enqueueConfigRecoveryNotice({
cfg: configSnapshot.config,
phase: "startup",
reason: "startup-invalid-config",
configPath: configSnapshot.path,
issues: rejectedConfigIssues,
});
}
}
if (!recovered && (await recoverConfigFromJsonRootSuffix(configSnapshot))) {
wroteConfig = true;
params.log.warn(
`gateway: invalid config was repaired by stripping a non-JSON prefix: ${configSnapshot.path}`,
);
snapshotRead = await measure("config.snapshot.prefix-recovery-read", () =>
readConfigFileSnapshotWithPluginMetadata({ measure }),
);
configSnapshot = snapshotRead.snapshot;
pluginMetadataSnapshot = snapshotRead.pluginMetadataSnapshot;
}
}
assertValidGatewayStartupConfigSnapshot(configSnapshot, { includeDoctorHint: true });
}
const autoEnable =
params.minimalTestGateway || degradedStartupConfig || degradedPluginConfig
? { config: configSnapshot.config, changes: [] as string[] }
: await measure("config.snapshot.auto-enable", () =>
applyPluginAutoEnable({
config: configSnapshot.sourceConfig,
env: process.env,
...(pluginMetadataSnapshot?.manifestRegistry
? { manifestRegistry: pluginMetadataSnapshot.manifestRegistry }
: {}),
}),
);
const autoEnable = params.minimalTestGateway
? { config: configSnapshot.config, changes: [] as string[] }
: await measure("config.snapshot.auto-enable", () =>
applyPluginAutoEnable({
config: configSnapshot.sourceConfig,
env: process.env,
...(pluginMetadataSnapshot?.manifestRegistry
? { manifestRegistry: pluginMetadataSnapshot.manifestRegistry }
: {}),
}),
);
if (autoEnable.changes.length === 0) {
return {
snapshot: configSnapshot,
wroteConfig,
...(pluginMetadataSnapshot ? { pluginMetadataSnapshot } : {}),
...(degradedStartupConfig ? { degradedProviderApi: true } : {}),
...(degradedPluginConfig ? { degradedPluginConfig: true } : {}),
};
}
try {
const { replaceConfigFile } = await import("../config/mutate.js");
await replaceConfigFile({
nextConfig: autoEnable.config,
afterWrite: { mode: "auto" },
@@ -337,8 +123,6 @@ export async function loadGatewayStartupConfigSnapshot(params: {
snapshot: configSnapshot,
wroteConfig,
...(pluginMetadataSnapshot ? { pluginMetadataSnapshot } : {}),
...(degradedStartupConfig ? { degradedProviderApi: true } : {}),
...(degradedPluginConfig ? { degradedPluginConfig: true } : {}),
};
}
@@ -354,10 +138,16 @@ export function createRuntimeSecretsActivator(params: {
}): ActivateRuntimeSecrets {
let secretsDegraded = false;
let secretsActivationTail: Promise<void> = Promise.resolve();
const prepareRuntimeSecretsSnapshot =
params.prepareRuntimeSecretsSnapshot ?? prepareSecretsRuntimeSnapshot;
const activateRuntimeSecretsSnapshot =
params.activateRuntimeSecretsSnapshot ?? activateSecretsRuntimeSnapshot;
let secretsRuntimePromise: Promise<typeof import("../secrets/runtime.js")> | null = null;
let authProfilesPromise: Promise<typeof import("../agents/auth-profiles.js")> | null = null;
const loadSecretsRuntime = () => {
secretsRuntimePromise ??= import("../secrets/runtime.js");
return secretsRuntimePromise;
};
const loadAuthProfiles = () => {
authProfilesPromise ??= import("../agents/auth-profiles.js");
return authProfilesPromise;
};
const runWithSecretsActivationLock = async <T>(operation: () => Promise<T>): Promise<T> => {
const run = secretsActivationTail.then(operation, operation);
@@ -371,13 +161,22 @@ export function createRuntimeSecretsActivator(params: {
return async (config, activationParams) =>
await runWithSecretsActivationLock(async () => {
try {
const secretsRuntime =
params.prepareRuntimeSecretsSnapshot && params.activateRuntimeSecretsSnapshot
? null
: await loadSecretsRuntime();
const prepareRuntimeSecretsSnapshot =
params.prepareRuntimeSecretsSnapshot ?? secretsRuntime!.prepareSecretsRuntimeSnapshot;
const activateRuntimeSecretsSnapshot =
params.activateRuntimeSecretsSnapshot ?? secretsRuntime!.activateSecretsRuntimeSnapshot;
const startupPreflight =
activationParams.reason === "startup" || activationParams.reason === "restart-check";
const loadAuthStore = startupPreflight
? (await loadAuthProfiles()).loadAuthProfileStoreWithoutExternalProfiles
: undefined;
const prepared = await prepareRuntimeSecretsSnapshot({
config: pruneSkippedStartupSecretSurfaces(config),
...(startupPreflight
? { loadAuthStore: loadAuthProfileStoreWithoutExternalProfiles }
: {}),
...(loadAuthStore ? { loadAuthStore } : {}),
});
assertRuntimeGatewayAuthNotKnownWeak(prepared.config);
if (activationParams.activate) {
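
The `loadSecretsRuntime` and `loadAuthProfiles` helpers in this file defer a module import until first use and then reuse the same promise. A generic sketch of that memoized-loader pattern is shown below. The `memoizeLoader` helper and the fake loader are hypothetical names introduced for illustration. The real code inlines `??=` around `import(...)` rather than using a shared helper.

```typescript
// Wraps an async loader so the underlying work starts at most once.
// Every caller receives the same promise, including concurrent callers.
function memoizeLoader<T>(load: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | null = null;
  return () => {
    // Assign only when no load has been started yet.
    pending ??= load();
    return pending;
  };
}

// Stand-in for an expensive dynamic import (assumption for this sketch).
let fakeLoadCalls = 0;
const loadFakeModule = memoizeLoader(async () => {
  fakeLoadCalls += 1;
  return { greet: (name: string) => `hello ${name}` };
});

// Both calls share one promise, so the loader body runs once.
const firstLoad = loadFakeModule();
const secondLoad = loadFakeModule();
```

One consequence of caching the promise itself is that a rejected load is also cached, so later callers see the same rejection unless the cache is reset.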


@@ -1,6 +1,5 @@
import { resolveAgentWorkspaceDir, resolveDefaultAgentId } from "../agents/agent-scope.js";
import { initSubagentRegistry } from "../agents/subagent-registry.js";
import { runChannelPluginStartupMaintenance } from "../channels/plugins/lifecycle-startup.js";
import { applyPluginAutoEnable } from "../config/plugin-auto-enable.js";
import type { OpenClawConfig } from "../config/types.openclaw.js";
import { loadPluginLookUpTable } from "../plugins/plugin-lookup-table.js";
@@ -9,8 +8,6 @@ import { createEmptyPluginRegistry } from "../plugins/registry.js";
import { getActivePluginRegistry, setActivePluginRegistry } from "../plugins/runtime.js";
import { mergeActivationSectionsIntoRuntimeConfig } from "./plugin-activation-runtime-config.js";
import { listGatewayMethods } from "./server-methods-list.js";
import { loadGatewayStartupPlugins } from "./server-plugin-bootstrap.js";
import { runStartupSessionMigration } from "./server-startup-session-migration.js";
type GatewayPluginBootstrapLog = {
info: (message: string) => void;
@@ -54,6 +51,8 @@ export async function prepareGatewayPluginBootstrap(params: {
const shouldRunStartupMaintenance =
!params.minimalTestGateway || startupMaintenanceConfig.channels !== undefined;
if (shouldRunStartupMaintenance) {
const { runChannelPluginStartupMaintenance } =
await import("../channels/plugins/lifecycle-startup.js");
const startupTasks = [
runChannelPluginStartupMaintenance({
cfg: startupMaintenanceConfig,
@@ -62,6 +61,7 @@ export async function prepareGatewayPluginBootstrap(params: {
}),
];
if (!params.minimalTestGateway) {
const { runStartupSessionMigration } = await import("./server-startup-session-migration.js");
startupTasks.push(
runStartupSessionMigration({
cfg: params.cfgAtStart,
@@ -157,6 +157,7 @@ export async function loadGatewayStartupPluginRuntime(params: {
suppressPluginInfoLogs?: boolean;
startupTrace?: GatewayStartupTrace;
}) {
const { loadGatewayStartupPlugins } = await import("./server-plugin-bootstrap.js");
return loadGatewayStartupPlugins({
cfg: params.cfg,
activationSourceConfig: params.activationSourceConfig,


@@ -11,12 +11,10 @@ import {
getRuntimeConfig,
promoteConfigSnapshotToLastKnownGood,
readConfigFileSnapshot,
recoverConfigFromLastKnownGood,
registerConfigWriteListener,
setRuntimeConfigSnapshot,
type ReadConfigFileSnapshotWithPluginMetadataResult,
} from "../config/io.js";
import { replaceConfigFile } from "../config/mutate.js";
import { isNixMode } from "../config/paths.js";
import { applyPluginAutoEnable } from "../config/plugin-auto-enable.js";
import { applyConfigOverrides } from "../config/runtime-overrides.js";
@@ -75,15 +73,6 @@ import {
getRequiredSharedGatewaySessionGeneration,
type SharedGatewaySessionGenerationState,
} from "./server-shared-auth-generation.js";
import {
createRuntimeSecretsActivator,
loadGatewayStartupConfigSnapshot,
prepareGatewayStartupConfig,
} from "./server-startup-config.js";
import {
loadGatewayStartupPluginRuntime,
prepareGatewayPluginBootstrap,
} from "./server-startup-plugins.js";
import { STARTUP_UNAVAILABLE_GATEWAY_METHODS } from "./server-startup-unavailable-methods.js";
import {
startGatewayEarlyRuntime,
@@ -504,6 +493,14 @@ export async function startGatewayServer(
description: "raw stream log path override",
});
const startupTrace = createGatewayStartupTrace();
const startupConfigModulePromise = import("./server-startup-config.js");
let startupPluginsModulePromise: Promise<typeof import("./server-startup-plugins.js")> | null =
null;
const loadStartupPluginsModule = () => {
startupPluginsModulePromise ??= import("./server-startup-plugins.js");
return startupPluginsModulePromise;
};
const { loadGatewayStartupConfigSnapshot } = await startupConfigModulePromise;
const startupConfigLoad = await startupTrace.measure("config.snapshot", () =>
loadGatewayStartupConfigSnapshot({
@@ -528,6 +525,7 @@ export async function startGatewayServer(
trusted: false,
});
};
const { createRuntimeSecretsActivator } = await startupConfigModulePromise;
const activateRuntimeSecrets = createRuntimeSecretsActivator({
logSecrets,
emitStateEvent: emitSecretsStateEvent,
@@ -539,13 +537,14 @@ export async function startGatewayServer(
const startupActivationSourceConfig = configSnapshot.sourceConfig;
const startupRuntimeConfig = applyConfigOverrides(configSnapshot.config);
startupTrace.setConfig(startupRuntimeConfig);
const { prepareGatewayStartupConfig } = await startupConfigModulePromise;
const authBootstrap = await startupTrace.measure("config.auth", () =>
prepareGatewayStartupConfig({
configSnapshot,
authOverride: opts.auth,
tailscaleOverride: opts.tailscale,
activateRuntimeSecrets,
persistStartupAuth: startupConfigLoad.degradedProviderApi !== true,
persistStartupAuth: true,
}),
);
cfgAtStart = authBootstrap.cfg;
@@ -583,6 +582,7 @@ export async function startGatewayServer(
maybeSeedControlUiAllowedOriginsAtStartup({
config: cfgAtStart,
writeConfig: async (nextConfig) => {
const { replaceConfigFile } = await import("../config/mutate.js");
await replaceConfigFile({
nextConfig,
afterWrite: { mode: "auto" },
@@ -609,6 +609,7 @@ export async function startGatewayServer(
startupLastGoodSnapshot = startupSnapshot;
}
setRuntimeConfigSnapshot(cfgAtStart, startupLastGoodSnapshot.sourceConfig);
const { prepareGatewayPluginBootstrap } = await loadStartupPluginsModule();
const pluginBootstrap = await startupTrace.measure("plugins.bootstrap", () =>
prepareGatewayPluginBootstrap({
cfgAtStart,
@@ -1364,8 +1365,9 @@ export async function startGatewayServer(
unavailableGatewayMethods,
loadStartupPlugins: runtimePluginsLoaded
? undefined
: () =>
loadGatewayStartupPluginRuntime({
: async () => {
const { loadGatewayStartupPluginRuntime } = await loadStartupPluginsModule();
return loadGatewayStartupPluginRuntime({
cfg: gatewayPluginConfigAtStart,
activationSourceConfig: startupActivationSourceConfig,
workspaceDir: defaultWorkspaceDir,
@@ -1374,7 +1376,8 @@ export async function startGatewayServer(
startupPluginIds,
pluginLookUpTable,
startupTrace,
}),
});
},
onStartupPluginsLoading: () => {
startupPendingReason = "startup-sidecars";
},
@@ -1409,7 +1412,6 @@ export async function startGatewayServer(
initialInternalWriteHash: startupInternalWriteHash,
watchPath: configSnapshot.path,
readSnapshot: readConfigFileSnapshot,
recoverSnapshot: recoverConfigFromLastKnownGood,
promoteSnapshot: promoteConfigSnapshotToLastKnownGood,
subscribeToWrites: registerConfigWriteListener,
deps,