Commit Graph

12243 Commits

Author SHA1 Message Date
amittell
f7c32fc8be fix(telegram): lower polling keepalive delay (#83304)
* fix(telegram): enable TCP keepalive on getUpdates connections to prevent NAT timeout stalls

Long-polling connections to api.telegram.org stay idle for up to the
getUpdates timeout (~900 s). Most home/office NAT tables expire idle TCP
entries after 60–1800 s (commonly ~1000 s). When the NAT entry is
silently dropped the connection hangs rather than returning an error,
leaving the grammY runner stuck until the 90 s stall watchdog fires and
forces a restart cycle.

Fix: unconditionally set `keepAlive: true` and
`keepAliveInitialDelay: 30_000` (30 s) on the undici Agent `connect`
options built in `buildTelegramConnectOptions`. OS-level TCP keepalive
probes sent every ~75 s (OS default) will:
1. Refresh the NAT table entry before it expires.
2. Surface dead connections immediately with ETIMEDOUT instead of
   hanging forever.

The `return Object.keys(connect).length > 0 ? connect : null` guard is
also removed; `connect` is now always non-empty so it always returns the
object.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
(cherry picked from commit 92e454c0614256201cdf6f0f73c7897d006616d4)

* fix(telegram): stop self-flagging disconnected on poll-cycle start; widen channel connect grace to 300s

(cherry picked from commit 1ca963a05dac0d9d605e9a15dc97fced9cf7725e)

* fix(telegram): catch hung polling startups that preserve inherited connected:true

The widened 300s channel connect grace and the removal of connected:false from
notePollingStart left a path where a polling restart could hang forever
looking healthy. notePollingStart clears lastConnectedAt, lastEventAt, and
lastTransportActivityAt but deliberately omits connected, so server-channels'
patch-merge inherits a connected:true from the previous lifecycle. After grace,
evaluateChannelHealth's stale-socket branch requires lastTransportActivityAt
to be non-null and the connected:false branch is masked, so the channel sits
healthy with no first getUpdates.

Add a post-grace branch to evaluateChannelHealth that flags polling channels
as stale-socket when connected:true is paired with null lastConnectedAt and
null lastTransportActivityAt and a non-null lastStartAt. Scoped to mode:polling
so webhook channels and channels without continuous transport tracking are
not falsely flagged. Align TELEGRAM_POLLING_CONNECT_GRACE_MS in the Telegram
status diagnostic with DEFAULT_CHANNEL_CONNECT_GRACE_MS so openclaw channels
status agrees with the shared health monitor on the grace window. Refresh
the notePollingStart comment to point at the new evaluateChannelHealth branch.

Addresses clawsweeper review on #83304 (P1 connect-grace startup-hang, P2
diagnostic grace drift). Tests cover the new flagged path, the in-grace happy
path, and the prior-successful-connect happy path.

* fix(telegram): clear polling connected state on startup

* fix(gateway): add defense-in-depth health-policy branch for hung polling startups

Defense in depth on top of 87db46c576's notePollingStart connected:false fix.
The primary path (notePollingStart writes connected:false explicitly so
evaluateChannelHealth's existing connected===false branch catches a hung
restart) is unchanged. This adds a defensive post-grace branch that catches
the same hang via a different signature -- inherited connected:true paired
with null lastConnectedAt and null lastTransportActivityAt -- in case a
future code path forgets to clear the inherited connected flag on lifecycle
start. Scoped to mode:polling so webhook channels and channels without
continuous transport tracking are not falsely flagged.

Also bump lastStartAt: Date.now() - 121_000 to 301_000 in the spool-handler
timeout test added by upstream #83505 so it falls past the widened 300s
TELEGRAM_POLLING_CONNECT_GRACE_MS suppression window (mirroring the same
fixup already applied to the two adjacent polling-startup tests).

* revert(telegram,gateway): keep connect grace at 120s

Drop the 120s -> 300s widening from this PR after maintainer feedback that
the extra grace masks real startup bugs. The defense-in-depth checks added
in earlier commits (notePollingStart clearing inherited connected state,
the stale-socket policy branch, the per-snapshot startup grace test) all
work fine at 120s and remain valuable on their own.

Reverts in:
- src/gateway/channel-health-policy.ts: DEFAULT_CHANNEL_CONNECT_GRACE_MS 300 -> 120
- extensions/telegram/src/status-issues.ts: TELEGRAM_POLLING_CONNECT_GRACE_MS 300 -> 120
- extensions/telegram/src/status.test.ts: lastStartAt 301_000 -> 121_000 (3 cases)

The new channel-health-policy.test.ts cases use explicit channelConnectGraceMs:
10_000 in the policy, so they are unaffected by the default constant change.

* fix(telegram): narrow polling keepalive fix

---------

Co-authored-by: Yibei Ou <yibeiou@Yibeis-Mac-mini.local>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Ayaan Zaidi <hi@obviy.us>
2026-05-28 10:13:13 +05:30
Masato Hoshino
313d6ae1b3 fix(whatsapp): strip control characters from outbound document fileName (#77114)
Merged via squash.

Prepared head SHA: 5417a8ee2c
Co-authored-by: masatohoshino <246810661+masatohoshino@users.noreply.github.com>
Co-authored-by: mcaxtr <7562095+mcaxtr@users.noreply.github.com>
Reviewed-by: @mcaxtr
2026-05-28 01:17:52 -03:00
Dallin Romney
8d21ac3f6e refactor: share QA runtime helpers (#87412)
* refactor: share QA runtime helpers

* refactor: keep QA helpers private

* refactor: keep QA helpers on private runtime seam

* chore: prune stale QA duplicate ignores

* fix: align qa runtime boundary alias

* fix: avoid startup memory lint conversion
2026-05-27 21:16:24 -07:00
Agustin Rivera
b860a0d4d0 fix: harden qqbot direct media uploads
Harden QQBot direct media URL uploads by downloading through the local SSRF guard before QQ upload, disabling redirects, bounding fetch/setup and body reads, and routing downloaded buffers through the existing one-shot/chunked size gate.

Co-authored-by: Agustin Rivera <agustin@rivera-web.com>
2026-05-28 04:21:46 +01:00
Dallin Romney
5887119e8d chore: stop tracking generated diffs viewer runtime (#87405)
* chore: stop tracking generated diffs viewer runtime

* test(diffs): generate viewer runtime fixture when missing
2026-05-27 19:59:35 -07:00
bladin
e0d003b372 fix(whatsapp): support pluginHooks.messageReceived in channel/account config schema (#86426)
Merged via squash.

Prepared head SHA: 27003a8d5a
Co-authored-by: bladin <1740879+bladin@users.noreply.github.com>
Co-authored-by: mcaxtr <7562095+mcaxtr@users.noreply.github.com>
Reviewed-by: @mcaxtr
2026-05-27 23:31:47 -03:00
Peter Steinberger
2229122077 fix: keep private SDK declarations local 2026-05-28 03:28:27 +01:00
Dallin Romney
3005b62242 perf(plugins) refactor plugin SDK declarations for flat package types (#87165)
* refactor: flatten plugin sdk declarations

* fix: align package inventory with flat sdk declarations

* refactor: move packed sdk smoke to fixture

* test: simplify packed sdk type smoke

* fix(canvas): use focused number runtime helpers

* fix(ci): stabilize sdk boundary checks

* test: guard private sdk declaration leaks

Co-authored-by: Peter Steinberger <steipete@gmail.com>

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-05-27 19:22:32 -07:00
Vincent Koc
b6e354f6ca fix(file-transfer): handle late tar pipe errors 2026-05-28 04:14:57 +02:00
Andy
d2319d718c fix(status): keep default JSON scan lean
Default `openclaw status --json` stays on the lean health-probe path while preserving the JSON task summary, local update/install metadata, explicit probe timeouts, and configured gateway handshake timeouts. Deeper memory, registry, remote git, and local status-RPC diagnostics remain behind `status --json --all`.

Also keeps generated diffs viewer output in its built form and ignores it in oxfmt so `pnpm build` leaves a clean tree.

Proof:
- `node scripts/run-vitest.mjs src/commands/status.scan.fast-json.test.ts src/commands/status-json-payload.test.ts src/commands/status.scan.shared.test.ts`
- `OPENCLAW_LOCAL_CHECK=0 node scripts/run-oxlint-shards.mjs --threads=8`
- `node scripts/run-tsgo.mjs -p test/tsconfig/tsconfig.core.test.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/core-test.tsbuildinfo`
- `node scripts/run-tsgo.mjs -p test/tsconfig/tsconfig.extensions.test.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/extensions-test.tsbuildinfo`
- `.agents/skills/autoreview/scripts/autoreview --mode branch --base origin/main`
- GitHub checks green for head `47a63f87ea7c2351994fdb71e8cc18041aa0b64e`

Thanks @andyylin.

Co-authored-by: Andy <andyylin@users.noreply.github.com>
2026-05-28 02:28:49 +01:00
Vincent Koc
5846878924 fix(auth): honor OAuth login cancellation 2026-05-28 03:12:40 +02:00
Peter Steinberger
cee2a50fe6 chore(release): prepare 2026.5.28 2026-05-28 01:48:07 +01:00
Peter Steinberger
0e262d20e7 fix(discord): fence tool warning fallback delivery (#87465)
* fix(discord): fence recovered tool warning fallback

* fix(discord): keep warning fallback after failed final

* fix(reply): keep settled cleanup unconditional
2026-05-28 01:39:14 +01:00
Peter Steinberger
45e6af5e57 fix: reject partial numeric runtime values 2026-05-27 20:10:01 -04:00
Peter Steinberger
d1aa3cb925 fix: reject partial numeric command values 2026-05-27 20:10:01 -04:00
Peter Steinberger
c86667c5cf test(discord): use reply payload SDK test helper (#87454)
* test(discord): use reply payload SDK test helper

* build(plugin-sdk): exclude reply payload test helper
2026-05-28 00:57:22 +01:00
Peter Steinberger
da279041ab fix(discord): suppress recovered tool warnings (#87451) 2026-05-28 00:32:28 +01:00
Fermin Quant
3f9d2415ac fix(cron): stabilize isolated prompt cache affinity
Stabilize isolated cron prompt cache affinity by deriving a stable prompt cache key per cron job/session/model and forwarding it separately from the rotating run session id.

Thread the key through embedded runs, stream resolution, provider options, proxy forwarding, custom streams, and prompt-cache observability. Keep OpenAI-compatible payloads valid by using hyphen-safe keys, clamping upstream prompt_cache_key values, and omitting affinity when cache retention is disabled.

Thanks @ferminquant.

Co-authored-by: Fermin Quant <ferminquant@hotmail.com>
2026-05-28 00:31:19 +01:00
lukeboyett
b5bd6e8828 fix(sessions): preserve Matrix room-id case in session keys (#75670) (#87366)
* fix(sessions): preserve Matrix room-id case in session keys (#75670)

Matrix room IDs (and thread event IDs) are opaque, case-sensitive per the
Matrix spec, but session-key canonicalization lowercased them. That forked
one room into duplicate sessions and produced 403 M_FORBIDDEN on recovery /
delivery paths that reconstruct the target from the (lowercased) session key,
even though deliveryContext.to stayed correct.

Introduce a generic, opt-in case-preservation registry (CASE_PRESERVING_PEERS)
consulted at all three lowercasing sites:
- construction: normalizeSessionPeerId
- store canonicalization: normalizeSessionKeyPreservingOpaquePeerIds
- gateway send: explicit request.sessionKey

Signal group preservation is encoded to match prior behavior exactly (segment
span, unscoped, thread suffix still lowercased). Matrix channel/group enrolls
the opaque tail (room id with embedded :server + any 🧵<event> suffix).
Exact mixed-case keys now win over folded legacy aliases in
resolveSessionStoreEntry and delivery-info lookup; existing lowercased rows
collapse on the next write. Matrix DM/MXID and non-enrolled channels keep the
default lowercase behavior.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(sessions): guard Matrix folded alias delivery proof

* test(agents): cover cold OpenAI gpt-5.5 fallback

* fix(sessions): preserve non-opaque alias freshness

* fix(sessions): prevent Matrix cross-room thread recovery

* build(protocol): refresh tools effective Swift models

* test(codex): include effective cwd in startup fixture

* test(codex): align startup failure cleanup expectation

* fix(sessions): keep Signal folded aliases fresh

* fix(sessions): preserve unscoped Matrix room keys

* fix(sessions): recover legacy Matrix thread aliases

* fix(sessions): preserve Matrix keys in state migrations

* fix(sessions): keep Matrix structural alias freshness

* fix(sessions): preserve unscoped Matrix migration keys

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-05-28 00:26:49 +01:00
Peter Steinberger
59c3ee7c45 fix(imessage): continue polling after denied reactions
(cherry picked from commit 6cc534af9b859301f9ff814bdd8672fa910141e3)
2026-05-28 00:17:52 +01:00
Vincent Koc
da5fe990d8 fix(codex): report quarantined dynamic tools 2026-05-28 00:56:30 +02:00
Kevin Lin
40bca6d8bb fix(imessage): suppress duplicate native exec approvals
Fix iMessage native exec approval routing so approval prompts bind to the sent GUID without duplicate sends after RPC timeout. Also keeps chat.db GUID recovery on the local imsg path while avoiding local DB recovery for configured or detected SSH wrappers.

Thanks @kevinslin.
2026-05-27 23:55:28 +01:00
Sarah Fortune
6ac3561c69 fix(codex): format skills command output (#87400) 2026-05-27 15:43:05 -07:00
samzong
316fd5b625 [Fix] Warm provider auth off main thread (#86281)
* fix(agents): warm provider auth off main thread

Signed-off-by: samzong <samzong.lu@gmail.com>

* fix(agents): keep provider auth warm read-only

* fix(ci): unblock provider auth landing

* ci: serialize gateway watch artifact check

* fix(ci): stabilize diffs viewer asset generation

* fix(agents): avoid stale plugin auth warm results

* fix(agents): keep partial auth warm cache

---------

Signed-off-by: samzong <samzong.lu@gmail.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-05-27 23:24:55 +01:00
Peter Steinberger
5d437de70f fix(web-search): preserve runtime-only provider config
Fixes #87191. Keeps Brave and Gemini runtime-injected web search provider config readable by providers without re-exposing legacy tools.web.search provider objects to config validation.
2026-05-27 23:17:07 +01:00
xiaotian
fb1dfd486b fix(slack): retain delivered final replies during late cleanup
Fix Slack draft cleanup after final-visible delivery.

Track when Slack has already delivered a visible final reply and stop reusing the draft finalizer for later same-turn final/error payloads. This keeps the first fallback cleanup for transient previews while preventing late cleanup from deleting a visible answer.

Fixes #87363

Co-authored-by: tianxiaochannel-oss88 <tianxiaochannel@gmail.com>
2026-05-27 23:16:17 +01:00
Peter Steinberger
cf47580a45 test(ci): align startup and model fixtures 2026-05-27 18:09:03 -04:00
Peter Steinberger
f24844d801 fix: reject partial numeric parsing 2026-05-27 18:00:19 -04:00
Mariano
7299c56953 Fix sub-agent cwd/workspace separation (#87218)
Merged via squash.

Prepared head SHA: f47b073830
Co-authored-by: mbelinky <132747814+mbelinky@users.noreply.github.com>
Co-authored-by: mbelinky <132747814+mbelinky@users.noreply.github.com>
Reviewed-by: @mbelinky
2026-05-27 23:55:24 +02:00
Kevin Lin
bb752c2b47 Revert "feat: expose plugin approval action metadata" (#87419)
This reverts commit 0c867eef75.

# Conflicts:
#	docs/.generated/plugin-sdk-api-baseline.sha256
2026-05-27 14:48:06 -07:00
Vincent Koc
5ad8036bda test(openai): stabilize live audio transcription 2026-05-27 23:45:35 +02:00
Vincent Koc
67277088eb fix(oauth): bound Codex token requests 2026-05-27 23:20:15 +02:00
Vincent Koc
09d2682cd8 fix(openai): resolve gpt-5.5 without cached catalog 2026-05-27 22:57:30 +02:00
Vincent Koc
00004ca798 fix(cli): wait for respawn child shutdown 2026-05-27 22:57:30 +02:00
Peter Steinberger
7f7eca1ad2 fix(codex): preserve shared app-server after startup app errors (#87399)
* fix(codex): preserve shared app-server after startup app errors

* fix(codex): align startup cleanup tests with current types

* test(config): isolate installed plugin ledger cache
2026-05-27 21:50:41 +01:00
GarlicGo
2900c1c25c fix(inbound-meta): include seconds in timestamps
Include second-level precision in inbound metadata and auto-reply envelope timestamps, matching the timestamp helper contract used by providers and channel adapters.

Docs now show the weekday plus seconds form in date-time and timezone examples.

Verification:
- node scripts/run-vitest.mjs src/auto-reply/envelope.test.ts src/auto-reply/reply/inbound-meta.test.ts
- pnpm docs:list >/tmp/openclaw-docs-list-87360.log
- git diff --check origin/main...HEAD
- pnpm format:docs:check
- pnpm lint:docs
- pnpm lint:extensions:bundled
- pnpm lint
- PR CI green on 495bb6c10f

Fixes #87257

Co-authored-by: GarlicGo <582149912@qq.com>
2026-05-27 21:18:08 +01:00
Alix-007
f4329fe0d6 fix(agents): bound plugin system context
* fix(agents): bound plugin system context

* test(agents): align wrapped system context expectations

* style(agents): format hook context helper

* test(codex): expect plugin system context boundary

---------

Co-authored-by: Alix-007 <267018309+Alix-007@users.noreply.github.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-05-27 21:16:15 +01:00
Peter Steinberger
d30ba7f961 fix(ci): satisfy codex extension lint 2026-05-27 16:05:06 -04:00
Dallin Romney
cc2948d1e1 fix(codex): narrow legacy hook generation grace (#87386) 2026-05-27 13:01:44 -07:00
Peter Steinberger
c71c49c460 fix(ci): address lint and test type failures 2026-05-27 15:56:12 -04:00
Peter Steinberger
d93524d1cc fix(codex): route workspace memory through tools (#87383)
* fix(codex): route workspace memory through tools

* fix(codex): preserve extra memory bootstrap files

* fix(codex): support memory_get-only context routing

* fix(codex): only tool-route canonical workspace memory

* fix(codex): keep memory fallback for sandbox workspaces
2026-05-27 20:55:27 +01:00
Yuval Dinodia
74f9d6b96d fix(codex): preserve shared app-server when spawned helper run fails logically (#72574) (#87375)
* fix(codex): preserve shared app-server when spawned helper run fails logically

* fix(codex): widen spawnedBy param to match EmbeddedRunAttemptParams

* fix(codex): align spawnedBy startup typing

* fix(codex): retire shared client on spawned startup timeout

* fix(codex): narrow spawned thread-start preservation

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-05-27 20:48:18 +01:00
Peter Steinberger
94749b0a45 fix(cli): reject malformed numeric inputs 2026-05-27 15:43:12 -04:00
keshavbotagent
e339586750 fix(plugin-state): evict current namespace on plugin row cap
Make plugin-state enforce the plugin-wide live-row fuse by evicting only from the namespace currently being written, preserving sibling namespace rows and still failing atomically when the current namespace cannot free enough rows.

Raise the plugin-wide cap to 6,000 rows, keep Telegram's persistent message-cache namespace at 3,000 entries, and document the updated SDK runtime contract. Harden legacy plugin-state import so capacity pressure cannot archive a source after losing imported keys, with focused regression coverage for Telegram-shaped namespaces and migration rollback.

Also restore the Docker runtime-assets preflight step in full release validation so release workflow contract tests stay aligned.

Verification: focused plugin-state, migration, Telegram, workflow-contract, lint, deprecated-API, diff-check, Blacksmith Testbox, CI, CodeQL, Workflow Sanity, OpenGrep, and autoreview all passed on PR head fee021cfa6.

Co-authored-by: Keshav's Bot <keshavbotagent@gmail.com>
2026-05-27 20:33:40 +01:00
Shubhankar Tripathy
90f30075aa fix(channels): preserve Telegram SecretRef prompt config
Use read-only Telegram account inspection for prompt-time channel actions, inline buttons, and reaction guidance so unresolved SecretRef tokens retain configured non-secret behavior before runtime snapshot hydration.

Match runtime Telegram account lookup for normalized config keys and multi-account fallback guards, while keeping sends/actions on the existing strict credential resolution path.

Fixes #75433.

Co-authored-by: Shubhankar Tripathy <reach2shubhankar@gmail.com>
2026-05-27 20:25:41 +01:00
Peter Steinberger
2f710f5604 fix(ci): avoid deprecated sdk import in canvas cli 2026-05-27 14:57:00 -04:00
Alex Knight
42e9504114 fix(codex): preserve native hook relay across restarts
Fixes #87331.\n\nPersist Codex native hook relay generations for real app-server resumes, keep a bounded legacy-binding grace path, and rotate generation on fresh-thread fallback so stale hook commands stay rejected.\n\nCo-authored-by: Alex Knight <15041791+amknight@users.noreply.github.com>
2026-05-27 19:55:19 +01:00
Vincent Koc
fdbf3cf4e7 fix(qa): make matrix block streaming deterministic 2026-05-27 20:43:33 +02:00
Peter Steinberger
163df2578b fix(diffs): use root viewer runtime builder 2026-05-27 14:36:07 -04:00
Peter Steinberger
0f5ea87244 fix(cli): reject partial numeric options 2026-05-27 14:36:06 -04:00