fix(gateway): harden token fallback/reconnect behavior and docs (#42507)

* fix(gateway): harden token fallback and auth reconnect handling

* docs(gateway): clarify auth retry and token-drift recovery

* fix(gateway): tighten auth reconnect gating across clients

* fix: harden gateway token retry (#42507) (thanks @joshavant)
This commit is contained in:
Josh Avant
2026-03-10 17:05:57 -05:00
committed by GitHub
parent ff2e7a2945
commit a76e810193
21 changed files with 1188 additions and 80 deletions

View File

@@ -92,3 +92,40 @@ Pass `--token` or `--password` explicitly. Missing explicit credentials is an er
- These commands require `operator.pairing` (or `operator.admin`) scope.
- `devices clear` is intentionally gated by `--yes`.
- If pairing scope is unavailable on local loopback (and no explicit `--url` is passed), list/approve can use a local pairing fallback.
## Token drift recovery checklist
Use this when Control UI or other clients keep failing with `AUTH_TOKEN_MISMATCH` or `AUTH_DEVICE_TOKEN_MISMATCH`.
1. Confirm current gateway token source:
```bash
openclaw config get gateway.auth.token
```
2. List paired devices and identify the affected device id:
```bash
openclaw devices list
```
3. Rotate operator token for the affected device:
```bash
openclaw devices rotate --device <deviceId> --role operator
```
4. If rotation is not enough, remove stale pairing and approve again:
```bash
openclaw devices remove <deviceId>
openclaw devices list
openclaw devices approve <requestId>
```
5. Retry client connection with the current shared token/password.
Related:
- [Dashboard auth troubleshooting](/web/dashboard#if-you-see-unauthorized-1008)
- [Gateway troubleshooting](/gateway/troubleshooting#dashboard-control-ui-connectivity)

View File

@@ -206,6 +206,12 @@ The Gateway treats these as **claims** and enforces server-side allowlists.
persisted by the client for future connects.
- Device tokens can be rotated/revoked via `device.token.rotate` and
`device.token.revoke` (requires `operator.pairing` scope).
- Auth failures include `error.details.code` plus recovery hints:
- `error.details.canRetryWithDeviceToken` (boolean)
- `error.details.recommendedNextStep` (`retry_with_device_token`, `update_auth_configuration`, `update_auth_credentials`, `wait_then_retry`, `review_auth_configuration`)
- Client behavior for `AUTH_TOKEN_MISMATCH`:
- Trusted clients may attempt one bounded retry with a cached per-device token.
- If that retry fails, clients should stop automatic reconnect loops and surface operator action guidance.
## Device identity + pairing
@@ -217,8 +223,9 @@ The Gateway treats these as **claims** and enforces server-side allowlists.
- **Local** connects include loopback and the gateway hosts own tailnet address
(so samehost tailnet binds can still autoapprove).
- All WS clients must include `device` identity during `connect` (operator + node).
Control UI can omit it **only** when `gateway.controlUi.dangerouslyDisableDeviceAuth`
is enabled for break-glass use.
Control UI can omit it only in these modes:
- `gateway.controlUi.allowInsecureAuth=true` for localhost-only insecure HTTP compatibility.
- `gateway.controlUi.dangerouslyDisableDeviceAuth=true` (break-glass, severe security downgrade).
- All connections must sign the server-provided `connect.challenge` nonce.
### Device auth migration diagnostics

View File

@@ -262,9 +262,14 @@ High-signal `checkId` values you will most likely see in real deployments (not e
## Control UI over HTTP
The Control UI needs a **secure context** (HTTPS or localhost) to generate device
identity. `gateway.controlUi.allowInsecureAuth` does **not** bypass secure-context,
device-identity, or device-pairing checks. Prefer HTTPS (Tailscale Serve) or open
the UI on `127.0.0.1`.
identity. `gateway.controlUi.allowInsecureAuth` is a local compatibility toggle:
- On localhost, it allows Control UI auth without device identity when the page
is loaded over non-secure HTTP.
- It does not bypass pairing checks.
- It does not relax remote (non-localhost) device identity requirements.
Prefer HTTPS (Tailscale Serve) or open the UI on `127.0.0.1`.
For break-glass scenarios only, `gateway.controlUi.dangerouslyDisableDeviceAuth`
disables device identity checks entirely. This is a severe security downgrade;

View File

@@ -113,9 +113,21 @@ Common signatures:
challenge-based device auth flow (`connect.challenge` + `device.nonce`).
- `device signature invalid` / `device signature expired` → client signed the wrong
payload (or stale timestamp) for the current handshake.
- `unauthorized` / reconnect loop → token/password mismatch.
- `AUTH_TOKEN_MISMATCH` with `canRetryWithDeviceToken=true` → client can do one trusted retry with cached device token.
- repeated `unauthorized` after that retry → shared token/device token drift; refresh token config and re-approve/rotate device token if needed.
- `gateway connect failed:` → wrong host/port/url target.
### Auth detail codes quick map
Use `error.details.code` from the failed `connect` response to pick the next action:
| Detail code | Meaning | Recommended action |
| ---------------------------- | -------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `AUTH_TOKEN_MISSING` | Client did not send a required shared token. | Paste/set token in the client and retry. For dashboard paths: `openclaw config get gateway.auth.token` then paste into Control UI settings. |
| `AUTH_TOKEN_MISMATCH` | Shared token did not match gateway auth token. | If `canRetryWithDeviceToken=true`, allow one trusted retry. If still failing, run the [token drift recovery checklist](/cli/devices#token-drift-recovery-checklist). |
| `AUTH_DEVICE_TOKEN_MISMATCH` | Cached per-device token is stale or revoked. | Rotate/re-approve device token using [devices CLI](/cli/devices), then reconnect. |
| `PAIRING_REQUIRED` | Device identity is known but not approved for this role. | Approve pending request: `openclaw devices list` then `openclaw devices approve <requestId>`. |
Device auth v2 migration check:
```bash
@@ -135,6 +147,7 @@ Related:
- [/web/control-ui](/web/control-ui)
- [/gateway/authentication](/gateway/authentication)
- [/gateway/remote](/gateway/remote)
- [/cli/devices](/cli/devices)
## Gateway service not running

View File

@@ -2512,6 +2512,7 @@ Your gateway is running with auth enabled (`gateway.auth.*`), but the UI is not
Facts (from code):
- The Control UI keeps the token in `sessionStorage` for the current browser tab session and selected gateway URL, so same-tab refreshes keep working without restoring long-lived localStorage token persistence.
- On `AUTH_TOKEN_MISMATCH`, trusted clients can attempt one bounded retry with a cached device token when the gateway returns retry hints (`canRetryWithDeviceToken=true`, `recommendedNextStep=retry_with_device_token`).
Fix:
@@ -2520,6 +2521,9 @@ Fix:
- If remote, tunnel first: `ssh -N -L 18789:127.0.0.1:18789 user@host` then open `http://127.0.0.1:18789/`.
- Set `gateway.auth.token` (or `OPENCLAW_GATEWAY_TOKEN`) on the gateway host.
- In the Control UI settings, paste the same token.
- If mismatch persists after the one retry, rotate/re-approve the paired device token:
- `openclaw devices list`
- `openclaw devices rotate --device <id> --role operator`
- Still stuck? Run `openclaw status --all` and follow [Troubleshooting](/gateway/troubleshooting). See [Dashboard](/web/dashboard) for auth details.
### I set gatewaybind tailnet but it can't bind nothing listens

View File

@@ -136,7 +136,8 @@ flowchart TD
Common log signatures:
- `device identity required` → HTTP/non-secure context cannot complete device auth.
- `unauthorized` / reconnect loopwrong token/password or auth mode mismatch.
- `AUTH_TOKEN_MISMATCH` with retry hints (`canRetryWithDeviceToken=true`) → one trusted device-token retry may occur automatically.
- repeated `unauthorized` after that retry → wrong token/password, auth mode mismatch, or stale paired device token.
- `gateway connect failed:` → UI is targeting the wrong URL/port or unreachable gateway.
Deep pages:

View File

@@ -174,7 +174,12 @@ OpenClaw **blocks** Control UI connections without device identity.
}
```
`allowInsecureAuth` does not bypass Control UI device identity or pairing checks.
`allowInsecureAuth` is a local compatibility toggle only:
- It allows localhost Control UI sessions to proceed without device identity in
non-secure HTTP contexts.
- It does not bypass pairing checks.
- It does not relax remote (non-localhost) device identity requirements.
**Break-glass only:**

View File

@@ -45,6 +45,8 @@ Prefer localhost, Tailscale Serve, or an SSH tunnel.
## If you see “unauthorized” / 1008
- Ensure the gateway is reachable (local: `openclaw status`; remote: SSH tunnel `ssh -N -L 18789:127.0.0.1:18789 user@host` then open `http://127.0.0.1:18789/`).
- For `AUTH_TOKEN_MISMATCH`, clients may do one trusted retry with a cached device token when the gateway returns retry hints. If auth still fails after that retry, resolve token drift manually.
- For token drift repair steps, follow [Token drift recovery checklist](/cli/devices#token-drift-recovery-checklist).
- Retrieve or supply the token from the gateway host:
- Plaintext config: `openclaw config get gateway.auth.token`
- SecretRef-managed config: resolve the external secret provider or export `OPENCLAW_GATEWAY_TOKEN` in this shell, then rerun `openclaw dashboard`