* feat(imessage): always-on inbound recovery, deprecate catchup
Replaces the opt-in catchup subsystem with always-on inbound replay
protection that brings iMessage in line with the other channels, and
fixes#89237 (stale backlog dispatched as fresh after bridge recovery).
- New inbound-dedupe.ts: persistent claimable GUID dedupe (claim/commit/
release) plus a stale-backlog age fence that suppresses live rows whose
send date is materially older than arrival (logged, never silent).
- monitor-provider: claim at ingestion, carry the exact claimed key on the
debouncer entry, commit on successful flush / release on dispatch failure
(per-unit so a coalesced bucket cannot strand a sibling claim). Keeps the
local startup since_rowid watermark so startup-window rows are not skipped.
- Deprecate catchup: delete catchup.ts + catchup-bridge.ts, remove the
channels.imessage.catchup schema, cursor migration, and config-guard nag.
Back-compat: strip the retired key before validation; new imessage doctor
contract reports + removes it on doctor --fix.
- Docs updated for the new recovery model.
Net -947 prod LOC.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* feat(imessage): recover downtime messages via since_rowid replay
Builds downtime recovery on the new inbound dedupe instead of restoring the
old catchup subsystem. On startup the monitor passes the last dispatched rowid
(a persisted per-account cursor) to imsg watch.subscribe as since_rowid, so imsg
replays the messages that landed while the gateway was down, then tails live.
The GUID dedupe drops anything already handled, so no cursor/retry bookkeeping
is needed.
- recovery-cursor.ts: minimal persisted per-account lastDispatchedRowid.
- monitor-provider: since_rowid = cursor (capped to the most recent
IMESSAGE_RECOVERY_MAX_ROWS); split the age fence on the startup rowid boundary
so replayed rows (<= boundary) use the wider recovery window and live rows
(> boundary) keep the tight #89237 fence; advance the cursor on commit.
- Local only: remote SSH cliPath cannot read chat.db, so it tails from the
current rowid (suppress-and-move-on) as before.
Restores missed-message recovery that the catchup removal dropped, with no
config and a fraction of the old LOC.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(imessage): make recovery cursor advance failure- and suppression-safe
Addresses two cursor-state regressions in the downtime-recovery path:
- Failed replay rows could be skipped forever: a released (failed) row keeps
its dedupe claim for retry, but a later successful row in the same flush
advanced the cursor past it, so the next startup's since_rowid skipped it.
Hold a per-session floor at the lowest released rowid and never advance the
cursor past it.
- Suppressed live backlog could be re-delivered after a restart: a live row
suppressed under the tight live fence was not recorded, so after a restart it
fell under the wider recovery window (its rowid now below the new boundary)
and was delivered. Commit its dedupe key on suppression so the recovery
replay treats it as already handled.
Both caught by Codex autoreview. Adds regression tests for the floor and the
suppression record.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(imessage): bound the GUID-less replay key length
Hash the composite fallback key's variable parts (conversation, sender,
created_at, text) so the key is length-bounded regardless of message text.
The persistent dedupe store already hashes keys internally, so this was not a
live overflow, but the bounded key removes the dependency on that and keeps the
fallback fail-open. Flagged by autoreview.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(imessage): recover downtime messages on remote cliPath setups too
The since_rowid replay runs over the imsg RPC client, so driving it from the
persisted recovery cursor (not the local chat.db boundary) makes downtime
recovery work for remote SSH cliPath gateways — the topology the old RPC-based
catchup served and that the rowid-boundary-only version regressed. Local setups
keep the wider, capped recovery window via the chat.db boundary; remote uses the
live age-fence window. Flagged by autoreview.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(imessage): seed recovery cursor from retired catchup cursor on upgrade
A one-time, self-cleaning migration: when the recovery cursor is empty on the
first startup after upgrade, seed it from the retired imessage.catchup-cursors
lastSeenRowid and consume the legacy entry. Without this a user who had catchup
enabled would not replay messages missed across the upgrade restart. Flagged by
autoreview.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix(imessage): preserve catchup recovery on upgrade
---------
Co-authored-by: Omar Shahine <10343873+omarshahine@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
Default localModelLean runs to compact Tool Search controls when the operator has not configured tools.toolSearch, while preserving explicit Tool Search settings and direct message-tool delivery semantics.
Verification: local focused Vitest/docs/format/lint/diff/autoreview proof; GitHub CI, CodeQL/Security High, CodeQL Critical Quality, OpenGrep PR Diff, Real behavior proof, Dependency Guard, and Workflow Sanity passed on 6153fb5ecb.
Refs https://github.com/openclaw/openclaw/issues/86599
Materializes prompt-visible skills into a protected sandbox-readable workspace for rw sandboxes, refreshes Docker/SSH/OpenShell views, and hardens stale or poisoned remote skill copies. Fixes#90410.
Fix MiniMax-M3 Anthropic-compatible requests so OpenClaw no longer sends the disabled-thinking payload that makes M3 return empty content. M3 defaults now stay on MiniMax's omitted/adaptive thinking path, explicit `/think off` is still respected, and MiniMax-M2.x keeps the disabled-thinking default that prevents reasoning_content leaks.
Also wires the MiniMax thinking policy through bundled provider-policy loading so pre-runtime and configless embedded-agent paths resolve the same defaults.
Thanks @IamVNIE for the live MiniMax API repro and initial patch.
Count streamed text/thinking/tool-call deltas incrementally in model diagnostics instead of repeatedly estimating full event payloads. Updates diagnostics docs and OTEL wording for the new response byte baseline.\n\nVerification: node scripts/run-vitest.mjs run src/agents/embedded-agent-runner/run/attempt.model-diagnostic-events.test.ts; GitHub Actions CI run 27064304709; CodeQL run 27064304710; OpenGrep PR Diff run 27064304716.
Keep startup non-breaking for existing installs when hooks.token reuses Gateway auth, but surface a startup warning, critical security audit finding, and doctor --fix repair that rotates persisted hooks.token.
Closes#87376.
Co-authored-by: Coy Geek <65363919+coygeek@users.noreply.github.com>
Adds optional `gateway.tailscale.serviceName` support for Tailscale Serve so the Gateway Control UI can be exposed through a named Tailscale Service while existing hostname-based Serve and Funnel behavior stays unchanged.
The implementation validates `svc:<dns-label>`, passes the Service name to `tailscale serve`, clears named Service config with `tailscale serve clear <service>` when resetOnExit runs, and uses the derived Service hostname in startup logs, status output, and pairing URLs.
Verification:
- node scripts/run-vitest.mjs src/infra/tailscale.test.ts src/gateway/server-tailscale.test.ts src/config/config.gateway-tailscale-bind.test.ts src/gateway/startup-auth.test.ts src/commands/status.scan.shared.test.ts src/pairing/setup-code.test.ts
- .agents/skills/autoreview/scripts/autoreview --mode branch --base origin/main --parallel-tests "node scripts/run-vitest.mjs src/infra/tailscale.test.ts src/gateway/server-tailscale.test.ts src/config/config.gateway-tailscale-bind.test.ts src/gateway/startup-auth.test.ts src/commands/status.scan.shared.test.ts src/pairing/setup-code.test.ts"
- git diff --check
- git merge-tree --write-tree origin/main origin/pr/88691
Closes#88629.
Co-authored-by: Charles OpenClaw <charles-openclaw@9bcfae.inboxapi.ai>
Summary:
- Add forced provider re-login support that clears cached auth profiles before running provider login again.
- Add provider-auth remediation guidance and a session-scoped skip cache for known-bad fallback auth attempts.
- Wire session ids through agent command, auto-reply, and embedded compaction fallback callers so the skip cache applies on real run paths.
- Fail closed when forced auth profile removal cannot update the profile store.
Verification:
- Local format, lint, diff-check, focused Vitest shards, and autoreview passed.
- PR CI, CodeQL Security High, and Critical Quality agent-runtime-boundary passed on head 1b4e9e753e.
Co-authored-by: Mert Basar <MertBasar0@users.noreply.github.com>
Count model stream diagnostic response bytes from snapshotless stream chunks, excluding accumulated partial snapshots on delta events. This avoids repeatedly serializing answer-so-far snapshots during streamed model calls and updates OTEL/docs wording for the new metric baseline.
Refs #86599.
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
Enforce OpenAI-compatible `tool_choice` contracts for Gateway HTTP Chat Completions and Responses client function tools.
- Add shared request normalization and post-run enforcement for required and pinned client function tool choices.
- Buffer streaming output until the tool-choice contract is satisfied, so failed runs do not leak partial assistant prose.
- Document the client-function-tool scope and add regression coverage for Chat/Responses success and failure cases.
Thanks @Lellansin for the contribution.
Proof: exact-head CI passed for `79fa0947360d307cf4ecffe713489cdf5db61093` in run `26714604449`; focused gateway tests passed locally.
Deliver plugin-owned bound-thread replies even when the source room is configured for `message_tool` visible replies. Normal agent final text still stays private unless the agent calls `message(action=send)`.
Document the distinction in the group/channel docs and root routing policy, and keep ambient room-event plus unauthorized text-slash suppression covered by regression tests.
Fixes#87721.
Add a bounded `chat.message.get` gateway method so Control UI can fetch one display-normalized transcript message by id when an assistant history preview was truncated. Keep `chat.history` lightweight, reject oversized/hidden/missing rows with explicit unavailable reasons, and wire the WebChat side reader to request full content only for visible truncated assistant messages.
Also refresh the generated Swift gateway protocol models and document the new assistant-message side-reader behavior.
Closes#84651.
Related #53242.
Co-authored-by: NianJiuZst <3235467914@qq.com>
Add MCP server add/configure/login/reload flows plus config/runtime support for enablement, filters, timeouts, OAuth, TLS, and parallel execution hints. Update docs and tests for the expanded MCP operator surface.
Fixes#63558.
Adds a Dreaming-tab agent selector and propagates the selected agent through Dreaming status, diary, and diary actions while preserving default-agent fallback when agentId is omitted. Also keeps report Memory Palace cards in the Control UI wiki-preview flow and documents the optional Dreaming agentId gateway parameters.
Verification:
- GitHub CI run 26693682975 passed on 43a2b17243.
- CodeQL Critical Quality run 26693682971 passed.
- CodeQL / Security High run 26693682957 passed.
- Workflow Sanity run 26693682949 passed.
- OpenGrep PR Diff run 26693682947 passed.
- Dependency Guard run 26693682003 passed.
- Real behavior proof run 26693860539 passed.
- git diff --check origin/main...refs/remotes/origin/pr/78748 passed.
- git merge-tree --write-tree origin/main refs/remotes/origin/pr/78748 passed.
Thanks @stevenepalmer.
Co-authored-by: Steven Palmer <6134396+stevenepalmer@users.noreply.github.com>
Forward OpenAI-compatible stop sequences from gateway chat completions through the agent runner into provider transports.
The gateway now normalizes stop into sampling extras, agent transports pass it into the shared stream options, and OpenAI, Anthropic, Mistral, Google, and Vertex-backed simple providers map it to their native request fields. Provider/gateway/agent coverage plus Crabbox live gateway proof verify valid stop dispatch and invalid stop rejection.
Refs #87920
Fixes#88198.
Ignore top-level helper scripts in auto-discovered global/workspace extension roots so they do not become manifestless plugin candidates during config validation. Standalone plugin files remain supported when explicitly configured through `plugins.load.paths`, and docs now call out the supported path.
Verification:
- `node scripts/run-vitest.mjs src/plugins/discovery.test.ts src/config/config.plugin-validation.test.ts`
- `node scripts/run-oxlint.mjs src/plugins/discovery.ts src/plugins/discovery.test.ts src/config/config.plugin-validation.test.ts`
- `git diff --check`
- GitHub CI green at `93073bfa85ee294e644c623881ba59ba71d90975`
- `.agents/skills/autoreview/scripts/autoreview --mode branch --base origin/main`
Thanks @mushuiyu886 for the fix and @mmhzlrj for the report.
Refactor OpenAI provider identity so OpenAI remains the canonical provider for API-key and OAuth-backed flows while legacy openai-codex state is doctor/migration-only.
Keeps OpenAI Codex Responses as an API/transport class rather than a provider identity, moves auth aliases through providerAuthAliases, updates doctor repair sequencing for old auth/profile state, and refreshes tests/docs around the canonical OpenAI behavior.