Commit Graph

31652 Commits

Author SHA1 Message Date
Peter Steinberger
cd45f53b4e test: stabilize parallels npm update smoke 2026-04-16 21:52:10 +01:00
Vincent Koc
c9103c2e47 fix(ci): prepare plugin sdk dts before lint 2026-04-16 13:50:23 -07:00
Vincent Koc
f835da1667 fix(ci): trim slow task and gateway paths 2026-04-16 13:34:34 -07:00
Gustavo Madeira Santana
56a9fd4b34 QA Matrix: capture full runner output 2026-04-16 16:18:54 -04:00
Gustavo Madeira Santana
988447ca24 QA Matrix: expand contract coverage 2026-04-16 16:18:54 -04:00
Gustavo Madeira Santana
0f7c40e508 Matrix: expose E2EE QA verification hooks 2026-04-16 16:18:54 -04:00
Gustavo Madeira Santana
21d500a65f test: expose bundled plugin QA test APIs 2026-04-16 16:18:54 -04:00
Gustavo Madeira Santana
5bb180061a Check: run type and lint earlier 2026-04-16 16:18:54 -04:00
Peter Steinberger
372c0051ba test: speed up slow import-boundary tests 2026-04-16 21:14:17 +01:00
Devin Robison
8b7d76bfbb fix(compaction): stop retaining credential-like values (#67801) 2026-04-16 14:04:45 -06:00
Peter Steinberger
894e728fd0 test: relax Parallels install smoke timeout 2026-04-16 12:54:13 -07:00
Peter Steinberger
5262757f9a fix: handle Codex HTML challenge errors (#67704) (thanks @chris-yyau) 2026-04-16 12:47:12 -07:00
Chris Yau
59caf03d67 Avoid rescanning HTML challenge pages during error formatting
The HTML challenge fix already keeps standalone CDN block pages out of the DNS transport path. This follow-up caches the HTML classification so status-prefixed non-HTML failures do not pay for the same scan twice and the control flow stays simpler.

Constraint: Keep behavior identical for both status-prefixed HTML pages and standalone HTML challenge pages
Rejected: Inline the helper into the status branch only | would duplicate the standalone HTML branch logic
Confidence: high
Scope-risk: narrow
Directive: If this formatter grows more branches, keep a single HTML classification result and reuse it through the decision tree
Tested: oxfmt --check src/shared/assistant-error-format.ts
Tested: node scripts/test-projects.mjs src/agents/pi-embedded-helpers.formatassistanterrortext.test.ts src/agents/pi-embedded-helpers.isbillingerrormessage.test.ts
2026-04-16 12:47:12 -07:00
Chris Yau
36dd58ac2a Prevent Codex HTML challenge pages from looking like DNS failures
Cloudflare challenge pages from chatgpt.com/backend-api can arrive as raw HTML without an HTTP status prefix. The transport sanitizer scanned for generic "dns" substrings before HTML detection, so these pages could surface as DNS lookup failures instead of the existing HTML/CDN block message.

Constraint: Must preserve DNS transport classification for real ENOTFOUND/getaddrinfo failures
Rejected: Treat every bare HTML document as an upstream HTML error | too broad for arbitrary model text/errors
Confidence: high
Scope-risk: narrow
Directive: Keep standalone HTML challenge detection ahead of generic transport keyword matching so CDN block pages do not regress into DNS copy
Tested: oxfmt --check on changed files; targeted node --import tsx verification for standalone Cloudflare HTML classification and DNS control case
Not-tested: Full Vitest shard run in this environment
2026-04-16 12:47:12 -07:00
Onur
51606e9889 CI: fix release-check caller permissions (#67787)
* CI: fix release-check caller permissions

* CI: fix scheduled live and e2e checks

* CI: tighten release workflow permissions

* CI: restore release workflow caller permissions

* Actions: harden release check inputs
2026-04-16 21:41:21 +02:00
Vincent Koc
781b1de921 fix(ci): cap core shard checkout stalls 2026-04-16 12:35:38 -07:00
Vincent Koc
2285429aa2 test(vitest): cut unit-ui startup overhead 2026-04-16 12:16:21 -07:00
Peter Steinberger
29427fefc7 ci: make mlx audio manifest patch writable v2026.4.15-beta.2 2026-04-16 20:12:18 +01:00
Peter Steinberger
15c7f478da docs: update plugin sdk api baseline 2026-04-16 19:58:08 +01:00
Peter Steinberger
006a8aeb8c ci: register blacksmith macos runner labels 2026-04-16 19:58:08 +01:00
Peter Steinberger
ad9da24317 test: keep web search config imports stable 2026-04-16 19:58:08 +01:00
Peter Steinberger
c635efd233 chore: prepare 2026.4.15-beta.2 release 2026-04-16 19:58:08 +01:00
Josh Lehman
a327b6750d fix: stabilize context engine prompt cache touches (#67767)
* fix: stabilize context engine prompt cache touches

* fix(changelog): document context-engine prompt cache touch stabilization
2026-04-16 11:53:42 -07:00
Vincent Koc
ac717a92e8 test(gateway): avoid mapped hook provenance event race 2026-04-16 11:35:14 -07:00
Vincent Koc
c2db918c60 fix(ci): silence mlx-audio-swift README warnings 2026-04-16 11:27:32 -07:00
Vincent Koc
42d100c390 fix(ci): move macOS jobs to blacksmith 2026-04-16 11:18:50 -07:00
zqchris
82e349a48a memory: strip inbound metadata envelopes from user messages in session corpus (#66548)
Merged via squash.

Prepared head SHA: 98562b2a84
Co-authored-by: zqchris <4436110+zqchris@users.noreply.github.com>
Co-authored-by: jalehman <550978+jalehman@users.noreply.github.com>
Reviewed-by: @jalehman
2026-04-16 11:15:44 -07:00
Vincent Koc
00d21d1b23 fix(ci): retry stalled core shard checkout 2026-04-16 11:04:16 -07:00
Onur
900e291f31 CI: expand native release validation coverage (#67144)
* Actions: grant reusable release checks actions read

* Actions: use read-all for reusable release checks

* CI: add native cross-OS release checks

* CI: wire Discord smoke secrets for cross-OS checks

* CI: fix native cross-OS installer compatibility

* CI: skip empty pnpm cache saves in matrix jobs

* CI: honor workflow runner override envs

* CI: finish native cross-OS update checks

* CI: fix native cross-OS workflow regressions

* Installer: capture Windows npm stderr safely

* CI: harden cross-OS release checks

* CI: resolve reusable workflow harness ref

* CI: stabilize cross-OS dev update lanes

* CI: tighten release-check workflow semantics

* CI: repoint repaired git CLI on POSIX

* CI: repair native dev-update shell handoff

* CI: preserve real updater semantics

* CI: harden supported release-check refs

* CI: harden release-check refs and fresh mode

* CI: skip dev-update for immutable tag refs

* CI: repair fresh installer release checks

* CI: fix native release check installer lanes

* CI: install release checks from candidate artifacts

* CI: use Windows cmd shims in release checks

* Installer: run Windows npm shim via PowerShell

* CI: pin dev update verification to candidate sha

* CI: pin reusable harness and published installers

* CI: isolate Windows dev-update PATH validation

* CI: align Windows dev-update bootstrap validation

* CI: avoid Windows installer gateway flake

* CI: run cross-OS release checks via TypeScript

* CI: bootstrap tsx for release-check workflow

* CI: fix native release-check follow-ups

* CI: tighten dev-update release checks

* CI: peel annotated workflow refs

* CI: harden native release checks

* CI: fix release-check verifier drift

* CI: fix release-check workflow drift

* CI: fix release-check ref resolution

* CI: harden Windows release-check gateway startup

* CI: fix release-check fallback validation

* CI: harden cross-os release checks

* CI: pin dev-update release checks to candidate SHA

* CI: resolve remote dev target refs

* CI: detect cloned dev-update checkouts

* CI: harden Windows release-check launcher

* Windows: harden task fallback and runner overrides

* Release checks: preserve Windows PATH and baseline version reads

* CI: add release validation live lanes

* CI: expand live and e2e release coverage

* CI: add branch dispatch for live and e2e checks
2026-04-16 19:58:19 +02:00
Daniel Salmerón Amselem
687ede50a5 fix(agents): add prompt cache compatibility opt-out
Add compat.supportsPromptCacheKey for OpenAI Responses prompt_cache_key handling, update generated config baseline, changelog, and A2UI dependency-layout test compatibility.
2026-04-16 10:48:51 -07:00
Viz
f624b1d246 fix(security): 7 P1 hardening fixes — scan-paths, windows-acl, audit-extra (#67003)
* test(security): add coverage tests before security fixes

- scan-paths.ts: 100% line coverage (new test file, previously zero)
- windows-acl.ts: 100% line coverage (SID bypass, whoami throw, no-user null return)
- external-content.ts: 99% (line 248 defensive overlap guard, unreachable)
- skill-scanner.ts: 93% (lines 293-294/330/571 are defensive guards for
  future extensibility, unreachable with current rules/patterns)

200+ tests covering TOCTOU paths, cache invalidation, forced-file escapes,
dir-entry-cache hit, SID world-bypass, diacritic-strip fallback,
fullwidth homoglyph markers, and more.

* fix(security): 5 security hardening fixes in src/security/

scan-paths: default requireRealpath to false (safe). All production callers
already pass requireRealpath: true; default callers are now secure.

windows-acl: block world-equivalent SIDs (S-1-1-0 Everyone etc.) from being
added to trusted set via USERSID env var.

windows-acl: log resolveCurrentUserSid failures instead of bare catch{}.

audit-extra: wrap JSON.parse in readPluginManifestExtensions with try-catch.
Malformed package.json returns [] instead of crashing the audit.

audit-extra: depth guard in listWorkspaceSkillMarkdownFiles to prevent
resource exhaustion from deep symlink cycles.

audit-extra: 2s timeout on fs.realpath in collectWorkspaceSkillSymlinkEscapeFindings
to protect against hanging on slow/network filesystems.

audit-extra: warn about phantom entries in plugins.allow that don't match
any installed plugin (pre-approval exploitation vector).

media-understanding/types: add allowPrivateNetwork to transport overrides
(duplicate of PR #66967, required for tsgo to pass here).

* fix(security): address security review findings in audit-extra.async.ts

Issue 1 — Symlink escape audit bypass on realpath timeout:
When realpathWithTimeout returns null (timeout or failure), the previous code
called 'continue', silently skipping the escape check. An attacker with a
symlink to a slow/network filesystem could hang realpath to prevent escape
detection. Now treats unverifiable symlinks as potential escapes and includes
them in the finding.

Issue 2 — Malformed package.json hides extension entrypoints from deep scan:
readPluginManifestExtensions previously swallowed JSON.parse errors and
returned [], which a malicious plugin could exploit by crafting a malformed
package.json to hide its openclaw.extensions entrypoints from the deep code
scanner. Now re-throws the parse error (with cause) so the caller in
collectPluginsCodeSafetyFindings can surface a warn finding and alert the
user, while still scanning the plugin directory via getCodeSafetySummary.

* fix(security): address PR review findings (P1 + P2)

P1 — BFS realpath in listWorkspaceSkillMarkdownFiles lacks timeout:
Extract realpathWithTimeout to module scope so the BFS dequeue loop
uses the same 2 s guard as the outer escape-detection callers. Previously
only the per-workspace and per-skill-file realpaths had the timeout;
a hanging NFS/SMB directory entry inside the BFS could still block
indefinitely.

P1 (acknowledged limitation) — Promise.race leaves the underlying
fs.realpath call running after timeout. fs.realpath cannot be cancelled
once submitted to libuv. Callers are sequential (one await at a time),
so at most one worker thread is occupied; the OS will eventually time
out the stuck call. This is documented in the module-level JSDoc.

P2 — Phantom allowlist check incorrectly flags bundled plugin IDs:
listChannelPlugins() returns bundled channel plugin IDs (telegram,
discord, browser, etc.) that are never in stateDir/extensions.
Add bundledPluginIds exclusion so the phantom-entry finding is scoped
to user-installed extension IDs only.

P2 — Rename MAX_SYMLINK_DEPTH / depthGuard to MAX_TOTAL_DIR_VISITS /
totalDirVisits to accurately reflect that the guard caps total BFS
iterations (2_000 * 20 = 40_000), not per-path symlink depth.

* fix(security): clean up realpathWithTimeout timer and add regression tests

- Clear the timer handle when fs.realpath resolves before the deadline,
  preventing timer accumulation during large audit runs with many files.
- Add .unref() on the timer so it cannot hold the process alive while
  waiting on a potentially hanging NFS/SMB path.

Regression tests added for three audit-extra.async security fixes:
- manifest parse error: malformed plugin package.json surfaces
  plugins.code_safety.manifest_parse_error (audit-extra.async.test.ts)
- phantom allowlist with bundled exclusion: bundled channel plugin IDs
  are excluded from plugins.allow_phantom_entries warnings; non-installed
  non-bundled IDs are correctly reported (audit-plugins-phantom.test.ts)
- unverifiable realpath escape: fs.realpath failure / timeout produces a
  skills.workspace.symlink_escape finding with 'realpath timed out' in
  the detail (audit-workspace-skill-escape.test.ts)

* chore(security): add TODO for structured logger in windows-acl resolveCurrentUserSid

console.warn is acceptable short-term but may be noisy on constrained
Windows hosts; note the follow-up in-code so it is not lost.

* chore: drop unrelated formatting churn from security PR

Restores extensions/memory-lancedb/config.ts and
src/agents/pi-embedded-helpers/errors.ts to their origin/main state.
These were line-wrap-only formatting changes with no relation to the
security fixes in this branch.

* fix(security): address Codex P2 review findings

1. Normalize plugins.allow entries through normalizePluginId before
   phantom-entry filtering so that bundled plugin aliases and legacy IDs
   are correctly excluded. Without this, valid allow entries that resolve
   via alias normalization could generate false-positive phantom warnings.

2. Surface a skills.workspace.scan_truncated warn finding when the BFS
   visit cap (MAX_TOTAL_DIR_VISITS) is hit mid-traversal. Previously the
   scanner silently returned partial results, allowing escaped SKILL.md
   symlinks in the unvisited tree to go undetected.

   listWorkspaceSkillMarkdownFiles now returns {skillFilePaths, truncated}
   and collectWorkspaceSkillSymlinkEscapeFindings emits the new finding
   when truncated is true.

Regression test added for the truncation path using a mocked readdir
that fills the queue past the cap (40 001 fake entries) and a mocked
realpath for zero-I/O iteration speed.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
2026-04-16 13:40:05 -04:00
Peter Steinberger
1d27e0ef08 test: keep discord channel actions on public test SDK 2026-04-16 10:28:22 -07:00
Peter Steinberger
b31d243c57 fix: stabilize skills prompt ordering (#64198) (thanks @Bartok9) 2026-04-16 10:28:22 -07:00
Bartok
c4488d5ef5 fix: pin localeCompare to 'en' locale for cross-environment stability
Addresses review feedback: localeCompare without a fixed locale uses the
runtime default, which varies across servers. Pinning 'en' ensures
byte-identical prompts for cache stability. Applied at all three sort
points in workspace.ts.
2026-04-16 10:28:22 -07:00
Bartok Moltbot
a4b94f77b9 fix(skills): sort available_skills alphabetically for prompt cache stability
Sort the merged skill entries by name before rendering into the
available_skills prompt block. Previously the order depended on
Map insertion order which varies with skills.load.extraDirs config,
causing identical deployments to produce different prompts and bypass
LLM prompt caching.

Two sort points added:
1. loadSkillEntries — canonical ordering at the source
2. resolveWorkspaceSkillPromptState — ensures prompt stability even
   when callers pass pre-built entry arrays

Fixes #64167
2026-04-16 10:28:22 -07:00
Omar Shahine
77d9fd693f fix(bluebubbles): restore inbound image attachments and accept updated-message events (#67510)
* fix(bluebubbles): restore inbound image attachments and accept updated-message events

Four interconnected fixes for BlueBubbles inbound media:

1. Strip bundled-undici dispatcher from non-SSRF fetch path so attachment
   downloads no longer silently fail on Node 22+ (#64105, #61861)

2. Accept updated-message webhook events that carry attachments instead of
   filtering them as non-reaction events (#65430)

3. Include eventType in the persistent GUID dedup key so updated-message
   follow-ups are not rejected as duplicates of the original new-message (#52277)

4. Retry attachment fetch from BB API (2s delay) when the initial webhook
   arrives with an empty attachments array — image-only messages and
   updated-message events only (#67437)

Closes #64105, closes #61861, closes #65430.

* fix(bluebubbles): resolve review findings — SSRF policy, reuse extractAttachments, add tests

- F1 (BLOCKER): pass undefined instead of {} for SSRF policy when
  allowPrivateNetwork is false, so localhost BB servers are not blocked.
- F2 (IMPORTANT): reuse exported extractAttachments() from monitor-normalize
  instead of duplicating field extraction logic.
- F3 (IMPORTANT): simplify asRecord(asRecord(payload)?.data) to
  asRecord(payload.data) since payload is already Record<string, unknown>.
- F4 (NIT): bind retryMessageId before the guard to eliminate non-null assertion.
- F5 (IMPORTANT): add 4 tests for fetchBlueBubblesMessageAttachments covering
  success, non-ok HTTP, empty data, and guid-less entries.
- Add CHANGELOG entry for the user-facing fix.

* fix(ci): update raw-fetch allowlist line number after dispatcher strip

* fix(bluebubbles): resolve PR review findings (#67510)

- monitor-processing: move attachment retry into the !rawBody guard so
  image-only new-message events that arrive with empty attachments and
  empty text are recovered via a BB API refetch before being dropped.
  The existing retry block at the end of processMessageAfterDedupe was
  unreachable for this case because the !rawBody early-return fired
  first. (Greptile)
- monitor: derive isAttachmentUpdate from the normalized message shape
  instead of raw payload.data.attachments so updated-message webhooks
  with attachments under wrapper formats (payload.message, JSON-string
  payloads) are correctly routed through for processing instead of
  silently filtered. (Codex)
- types: use bundled-undici fetch when init.dispatcher is present so
  the SSRF guard's DNS-pinning dispatcher is preserved when this
  function is called as fetchImpl from guarded callers (e.g. the
  attachment download path via fetchRemoteMedia). Falls back to
  globalThis.fetch when no dispatcher is present so tests that stub
  globalThis.fetch keep working. (Codex)
- attachments: blueBubblesPolicy returns undefined for the non-private
  case (matching monitor-processing's helper) so sendBlueBubblesAttachment
  stops routing localhost BB through the SSRF guard. (Greptile)
- scripts/check-no-raw-channel-fetch: bump the types.ts allowlist line
  to match the restructured non-SSRF branch.

* fix(bluebubbles): move attachment retry before rawBody guard, fix stale log

Move the attachment retry block (2s BB API refetch for empty attachments)
before the !rawBody early-return guard. Previously, image-only messages
with text='' and attachments=[] would be dropped by the !rawBody check
before the retry could fire, making fix #4 dead code for its primary
use-case. Now the retry runs first and recomputes the placeholder from
resolved attachments so rawBody becomes non-empty when media is found.

Also fix stale log message that still said 'without reaction' after the
filter was expanded to pass through attachment updates.

* fix(bluebubbles): revert undici import, restore dispatcher-strip approach

Revert the @claude bot's undici import in types.ts — it introduced a
direct 'undici' dependency that is not declared in the BB extension's
package.json and would break isolated plugin installs. Restore the
original dispatcher-strip approach which is correct: the SSRF guard
already completed validation upstream before calling this function as
fetchImpl, so stripping the dispatcher does not weaken security.

* fix(bluebubbles): remove dead empty-body recovery block in !rawBody guard

The empty-body attachment-recovery block added in the earlier PR revision
is now redundant because the main retry block was moved above the rawBody
computation in 0d7d1c4208. Worse, that leftover block reassigned the
(now-const) placeholder variable, throwing `TypeError: Assignment to
constant variable` at runtime for image-only messages — breaking the very
recovery path it was meant to protect (flagged by Codex on 4bfc2777).

Remove the dead block; the up-front retry already handles the image-only
case by recovering attachments before the rawBody computation, so once we
reach the !rawBody guard with an empty body it is genuinely empty and
should drop as before.

* fix(ci): update raw-fetch allowlist line after dispatcher-strip revert

279dba17d2 reverted types.ts back to the dispatcher-strip approach,
which put the `fetch(url, ...)` call at line 189 instead of line 198.
Bump the allowlist entry to match so `lint:tmp:no-raw-channel-fetch`
stops failing check-additional.

* test(pdf-tool): update stale opus-4-6 constant to opus-4-7

`628b454eff feat: default Anthropic to Opus 4.7` bumped the bundled
anthropic image default to `claude-opus-4-7` but missed updating the
`ANTHROPIC_PDF_MODEL` constant in pdf-tool.model-config.test.ts. The
tests now fail on any PR that runs the `checks-node-agentic-agents-plugins`
shard because the resolver returns 4-7 while the test asserts 4-6.

Bump the constant to 4-7 to match the bundled default.

---------

Co-authored-by: Lobster <10343873+omarshahine@users.noreply.github.com>
2026-04-16 10:04:20 -07:00
Peter Steinberger
6429fa0a7f test: keep prompt cache PR gate green 2026-04-16 09:41:26 -07:00
Ted Li
eb10803691 fix(prompt): keep inbound chat ids out of system prefix 2026-04-16 09:41:26 -07:00
Peter Steinberger
1183832d4f fix: pin codex resume sandbox override 2026-04-16 17:31:41 +01:00
Peter Steinberger
d842ec4179 fix: tighten delivery mirror dedupe (#67185) (thanks @andyylin) 2026-04-16 09:20:01 -07:00
Andy
e95efa4373 fix(sessions): dedupe redundant delivery mirrors 2026-04-16 09:20:01 -07:00
Peter Steinberger
86f108401b fix: share agent harness runtime activation (#67474) 2026-04-16 09:06:45 -07:00
duqaXxX
f4bbd0122a test(plugins): remove useless spread in startup config fixture 2026-04-16 09:06:45 -07:00
duqaXxX
69ba924b53 fix(codex): activate harness plugin for forced runtime 2026-04-16 09:06:45 -07:00
Ayaan Zaidi
16c608e393 fix: harden cron announce NO_REPLY suppression (#65016) (thanks @BKF-Gitty) 2026-04-16 21:36:43 +05:30
Peter Steinberger
1d41ef724a test: isolate tts provider auto-selection env 2026-04-16 17:05:36 +01:00
Peter Steinberger
892baf2e81 test: align PDF tool expectations with Opus 4.7 2026-04-16 08:56:56 -07:00
Peter Steinberger
99dfc1b616 docs: add changelog for Codex Desktop user agents (#64666) (thanks @cyrusaf) 2026-04-16 08:56:56 -07:00
Cyrus Forbes
728295c046 Codex: parse Desktop app-server user agents 2026-04-16 08:56:56 -07:00
Gustavo Madeira Santana
d74533c718 Docs: remove QA Matrix changelog entry 2026-04-16 11:42:33 -04:00