fix: recover invalid gateway configs

Peter Steinberger
2026-04-20 13:16:07 +01:00
parent dafc31502a
commit ffb1628727
19 changed files with 1023 additions and 21 deletions


@@ -374,5 +374,17 @@
{
"source": "Testing",
"target": "测试"
},
{
"source": "/gateway/configuration#strict-validation",
"target": "/gateway/configuration#strict-validation"
},
{
"source": "/gateway/configuration#config-hot-reload",
"target": "/gateway/configuration#config-hot-reload"
},
{
"source": "/cli/config",
"target": "/cli/config"
}
]


@@ -336,6 +336,34 @@ If dry-run fails:
- `Dry run note: skipped <n> exec SecretRef resolvability check(s)`: dry-run skipped exec refs; rerun with `--allow-exec` if you need exec resolvability validation.
- For batch mode, fix failing entries and rerun `--dry-run` before writing.
## Write safety
`openclaw config set` and other OpenClaw-owned config writers validate the full
post-change config before committing it to disk. If the new payload fails schema
validation or looks like a destructive clobber, the active config is left alone
and the rejected payload is saved beside it as `openclaw.json.rejected.*`.
Prefer CLI writes for small edits:
```bash
openclaw config set gateway.reload.mode hybrid --dry-run
openclaw config set gateway.reload.mode hybrid
openclaw config validate
```
If a write is rejected, inspect the saved payload and fix the full config shape:
```bash
CONFIG="$(openclaw config file)"
ls -lt "$CONFIG".rejected.* 2>/dev/null | head
openclaw config validate
```
Direct editor writes are still allowed, but the running Gateway treats them as
untrusted until they validate. Invalid direct edits can be restored from the
last-known-good backup during startup or hot reload. See
[Gateway troubleshooting](/gateway/troubleshooting#gateway-restored-last-known-good-config).
## Subcommands
- `config file`: Print the active config file path (resolved from `OPENCLAW_CONFIG_PATH` or default location).


@@ -96,6 +96,17 @@ When validation fails:
- Run `openclaw doctor` to see exact issues
- Run `openclaw doctor --fix` (or `--yes`) to apply repairs
The Gateway also keeps a trusted last-known-good copy after a successful startup. If
`openclaw.json` is later changed outside OpenClaw and no longer validates, startup
and hot reload preserve the broken file as a timestamped `.clobbered.*` snapshot,
restore the last-known-good copy, and log a loud warning with the recovery reason.
The next main-agent turn also receives a system-event warning telling it that the
config was restored and must not be blindly rewritten. The last-known-good copy
is promoted after a validated startup and after each accepted hot reload, including
OpenClaw-owned config writes whose persisted file hash still matches the accepted
write. Promotion is skipped when the candidate contains redacted secret
placeholders such as `***` or shortened token values.
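To check a file by hand for the kind of placeholder that blocks promotion, a minimal sketch could look like the following. The function name `has_redacted_placeholder` is hypothetical and this is not the Gateway's own check: it only looks for a literal `***` and does not attempt to detect shortened token values.

```bash
# Hypothetical helper, not part of OpenClaw.
# Returns 0 (true) when the file contains a literal `***`.
# -F treats the pattern as a fixed string, so the asterisks are not regex operators.
# Shortened token values are NOT detected by this sketch.
has_redacted_placeholder() {
  grep -qF '***' "$1"
}
```

For example, `has_redacted_placeholder "$(openclaw config file)" && echo "contains ***"` would flag the active config before you rely on it as a known-good copy.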
## Common tasks
<AccordionGroup>
@@ -494,6 +505,19 @@ When validation fails:
The Gateway watches `~/.openclaw/openclaw.json` and applies changes automatically — no manual restart needed for most settings.
Direct file edits are treated as untrusted until they validate. The watcher waits
for editor temp-write/rename churn to settle, reads the final file, and rejects
invalid external edits by restoring the last-known-good config. OpenClaw-owned
config writes use the same schema gate before writing; destructive clobbers such
as dropping `gateway.mode` or shrinking the file by more than half are rejected
and saved as `.rejected.*` for inspection.
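The "shrinking the file by more than half" rule can be approximated before you write a hand-edited file. The sketch below is an assumption about how such a size comparison might be expressed, not OpenClaw's implementation; the function name `shrinks_by_more_than_half` is hypothetical, and the real check may measure something other than raw byte counts.

```bash
# Hypothetical helper, not part of OpenClaw.
# Usage: shrinks_by_more_than_half CURRENT_FILE CANDIDATE_FILE
# Returns 0 (true) when the candidate is less than half the byte size of the
# current file, 1 otherwise, and 2 when either file cannot be read.
shrinks_by_more_than_half() {
  local current_size candidate_size
  current_size=$(wc -c < "$1") || return 2
  candidate_size=$(wc -c < "$2") || return 2
  # Integer arithmetic: candidate * 2 < current means "shrunk by more than half".
  [ $((candidate_size * 2)) -lt $((current_size)) ]
}
```

A candidate that is exactly half the size of the current file does not trip this check, which matches the "more than half" wording above.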
If you see `Config auto-restored from last-known-good` or
`config reload restored last-known-good config` in logs, inspect the matching
`.clobbered.*` file next to `openclaw.json`, fix the rejected payload, then run
`openclaw config validate`. See [Gateway troubleshooting](/gateway/troubleshooting#gateway-restored-last-known-good-config)
for the recovery checklist.
### Reload modes
| Mode | Behavior |


@@ -262,6 +262,63 @@ Related:
- [/gateway/configuration](/gateway/configuration)
- [/gateway/doctor](/gateway/doctor)
## Gateway restored last-known-good config
Use this when the Gateway starts, but logs say it restored `openclaw.json`.
```bash
openclaw logs --follow
openclaw config file
openclaw config validate
openclaw doctor
```
Look for:
- `Config auto-restored from last-known-good`
- `gateway: invalid config was restored from last-known-good backup`
- `config reload restored last-known-good config after invalid-config`
- A timestamped `openclaw.json.clobbered.*` file beside the active config
- A main-agent system event that starts with `Config recovery warning`
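The log messages in this list can be picked out of a noisy stream with a small filter. This is a sketch only: the function name `recovery_log_lines` is hypothetical, and the patterns are copied from the messages quoted on this page (including `Config write rejected:` and `Config last-known-good promotion skipped`), so they may need adjusting if the log wording differs in your version.

```bash
# Hypothetical helper, not part of OpenClaw.
# Reads log text on stdin and prints only lines that contain one of the
# recovery-related messages quoted in this troubleshooting section.
recovery_log_lines() {
  grep -E 'Config auto-restored from last-known-good|invalid config was restored from last-known-good backup|config reload restored last-known-good config|Config write rejected:|Config last-known-good promotion skipped'
}
```

Pipe the log command shown above into it, for example `openclaw logs --follow | recovery_log_lines`.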
What happened:
- The rejected config did not validate during startup or hot reload.
- OpenClaw preserved the rejected payload as `.clobbered.*`.
- The active config was restored from the last-known-good copy.
- The next main-agent turn is warned not to blindly rewrite the rejected config.
Inspect and repair:
```bash
CONFIG="$(openclaw config file)"
ls -lt "$CONFIG".clobbered.* "$CONFIG".rejected.* 2>/dev/null | head
diff -u "$CONFIG" "$(ls -t "$CONFIG".clobbered.* 2>/dev/null | head -n 1)"
openclaw config validate
openclaw doctor
```
Common signatures:
- `.clobbered.*` exists → an invalid external direct edit was found during startup or hot reload and replaced with the last-known-good copy.
- `.rejected.*` exists → an OpenClaw-owned config write failed schema or clobber checks before commit.
- `Config write rejected:` → the write tried to drop required shape, shrink the file sharply, or persist invalid config.
- `Config last-known-good promotion skipped` → the candidate contained redacted secret placeholders such as `***`.
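To see quickly which of the file-based signatures applies, a small helper can report which snapshot kinds exist beside the config. This is a minimal sketch under the assumption that snapshots follow the `<config>.clobbered.*` and `<config>.rejected.*` naming described above; the function name `snapshot_kinds` is hypothetical.

```bash
# Hypothetical helper, not part of OpenClaw.
# Usage: snapshot_kinds CONFIG_PATH
# Prints "clobbered" and/or "rejected", one per line, for each snapshot kind
# that has at least one file beside the given config path.
snapshot_kinds() {
  local config=$1 kind
  for kind in clobbered rejected; do
    # With default shell options an unmatched glob stays literal,
    # so ls exits non-zero when no snapshot of this kind exists.
    if ls "$config".$kind.* >/dev/null 2>&1; then
      echo "$kind"
    fi
  done
}
```

Calling `snapshot_kinds "$(openclaw config file)"` prints nothing when neither kind of snapshot is present.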
Fix options:
1. Keep the restored active config if it is correct.
2. Copy only the intended keys from `.clobbered.*` or `.rejected.*`, then apply them with `openclaw config set` or `config.patch`.
3. Run `openclaw config validate` before restarting.
4. If you edit by hand, keep the full JSON5 config, not just the partial object you wanted to change.
Related:
- [/gateway/configuration#strict-validation](/gateway/configuration#strict-validation)
- [/gateway/configuration#config-hot-reload](/gateway/configuration#config-hot-reload)
- [/cli/config](/cli/config)
- [/gateway/doctor](/gateway/doctor)
## Gateway probe warnings
Use this when `openclaw gateway probe` reaches something, but still prints a warning block.


@@ -1629,10 +1629,20 @@ for usage/billing and raise limits as needed.
`config.apply` replaces the **entire config**. If you send a partial object, everything
else is removed.
Current OpenClaw protects against many accidental clobbers:
- OpenClaw-owned config writes validate the full post-change config before writing.
- Invalid or destructive OpenClaw-owned writes are rejected and saved as `openclaw.json.rejected.*`.
- If a direct edit breaks startup or hot reload, the Gateway restores the last-known-good config and saves the rejected file as `openclaw.json.clobbered.*`.
- The main agent receives a boot warning after recovery so it does not blindly write the bad config again.
Recover:
- Restore from backup (git or a copied `~/.openclaw/openclaw.json`).
- If you have no backup, re-run `openclaw doctor` and reconfigure channels/models.
- Check `openclaw logs --follow` for `Config auto-restored from last-known-good`, `Config write rejected:`, or `config reload restored last-known-good config`.
- Inspect the newest `openclaw.json.clobbered.*` or `openclaw.json.rejected.*` beside the active config.
- Keep the active restored config if it works, then copy only the intended keys back with `openclaw config set` or `config.patch`.
- Run `openclaw config validate` and `openclaw doctor`.
- If you have no last-known-good or rejected payload, restore from backup, or re-run `openclaw doctor` and reconfigure channels/models.
- If this was unexpected, file a bug and include your last known config or any backup.
- A local coding agent can often reconstruct a working config from logs or history.
@@ -1644,7 +1654,7 @@ for usage/billing and raise limits as needed.
- Use `config.patch` for partial RPC edits; keep `config.apply` for full-config replacement only.
- If you are using the owner-only `gateway` tool from an agent run, it will still reject writes to `tools.exec.ask` / `tools.exec.security` (including legacy `tools.bash.*` aliases that normalize to the same protected exec paths).
Docs: [Config](/cli/config), [Configure](/cli/configure), [Doctor](/gateway/doctor).
Docs: [Config](/cli/config), [Configure](/cli/configure), [Gateway troubleshooting](/gateway/troubleshooting#gateway-restored-last-known-good-config), [Doctor](/gateway/doctor).
</Accordion>