Commit Graph

19725 Commits

Author SHA1 Message Date
Tak Hoffman
81818df1b4 fix(startup): prioritize bootstrap on fresh sessions 2026-04-16 17:53:07 -05:00
Peter Steinberger
b2cae7f12a test: trim duplicate memory hotspot coverage 2026-04-16 23:15:38 +01:00
Peter Steinberger
a98754d504 refactor(agents): clarify prompt cache compatibility gates 2026-04-16 14:59:20 -07:00
Peter Steinberger
d59604b15e test: speed up hotspot boundaries 2026-04-16 22:55:30 +01:00
Peter Steinberger
041266a669 chore: prepare 2026.4.15 release 2026-04-16 22:45:32 +01:00
Peter Steinberger
4d2854a2b0 test: tighten hotspot boundaries 2026-04-16 22:40:06 +01:00
Peter Steinberger
63e53fbf2e test: trim duplicate hotspot coverage 2026-04-16 22:19:32 +01:00
Peter Steinberger
678b019467 test: stabilize config and plugin scanner tests 2026-04-16 22:10:36 +01:00
Onur
3ae5d95bfd CI: fix live Docker auth mounts (#67812)
* CI: fix live Docker auth mounts

* CI: harden live Docker auth mounts
2026-04-16 23:00:11 +02:00
Peter Steinberger
8a37bb4ed6 perf: speed up security audit test imports 2026-04-16 21:54:13 +01:00
Vincent Koc
f835da1667 fix(ci): trim slow task and gateway paths 2026-04-16 13:34:34 -07:00
Gustavo Madeira Santana
56a9fd4b34 QA Matrix: capture full runner output 2026-04-16 16:18:54 -04:00
Gustavo Madeira Santana
21d500a65f test: expose bundled plugin QA test APIs 2026-04-16 16:18:54 -04:00
Peter Steinberger
372c0051ba test: speed up slow import-boundary tests 2026-04-16 21:14:17 +01:00
Devin Robison
8b7d76bfbb fix(compaction): stop retaining credential-like values (#67801) 2026-04-16 14:04:45 -06:00
Chris Yau
59caf03d67 Avoid rescanning HTML challenge pages during error formatting
The HTML challenge fix already keeps standalone CDN block pages out of the DNS transport path. This follow-up caches the HTML classification so status-prefixed non-HTML failures do not pay for the same scan twice and the control flow stays simpler.

Constraint: Keep behavior identical for both status-prefixed HTML pages and standalone HTML challenge pages
Rejected: Inline the helper into the status branch only | would duplicate the standalone HTML branch logic
Confidence: high
Scope-risk: narrow
Directive: If this formatter grows more branches, keep a single HTML classification result and reuse it through the decision tree
Tested: oxfmt --check src/shared/assistant-error-format.ts
Tested: node scripts/test-projects.mjs src/agents/pi-embedded-helpers.formatassistanterrortext.test.ts src/agents/pi-embedded-helpers.isbillingerrormessage.test.ts
2026-04-16 12:47:12 -07:00
Chris Yau
36dd58ac2a Prevent Codex HTML challenge pages from looking like DNS failures
Cloudflare challenge pages from chatgpt.com/backend-api can arrive as raw HTML without an HTTP status prefix. The transport sanitizer scanned for generic "dns" substrings before HTML detection, so these pages could surface as DNS lookup failures instead of the existing HTML/CDN block message.

Constraint: Must preserve DNS transport classification for real ENOTFOUND/getaddrinfo failures
Rejected: Treat every bare HTML document as an upstream HTML error | too broad for arbitrary model text/errors
Confidence: high
Scope-risk: narrow
Directive: Keep standalone HTML challenge detection ahead of generic transport keyword matching so CDN block pages do not regress into DNS copy
Tested: oxfmt --check on changed files; targeted node --import tsx verification for standalone Cloudflare HTML classification and DNS control case
Not-tested: Full Vitest shard run in this environment
2026-04-16 12:47:12 -07:00
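The two commits above describe an ordering rule (detect HTML challenge pages before generic transport keyword matching) and a follow-up that computes the HTML classification once and reuses it. A minimal sketch of that idea follows. Every name here (`classifyError`, `looksLikeHtml`, `looksLikeDnsFailure`, the `ErrorKind` values) and both regular expressions are assumptions made for illustration; this is not the project's actual formatter.

```typescript
// Hypothetical names and patterns; a sketch of the ordering rule only.
type ErrorKind = "html_block" | "http_status" | "dns" | "other";

// Assumed shape of a status prefix, e.g. "403 " at the start of the text.
const STATUS_PREFIX = /^\s*\d{3}\s+/;

// Rough HTML detection: optional status prefix, then the start of a document.
function looksLikeHtml(raw: string): boolean {
  const body = raw.replace(STATUS_PREFIX, "");
  return /^\s*(<!doctype html|<html[\s>])/i.test(body);
}

// Generic transport keyword matching, including real resolver failures.
function looksLikeDnsFailure(raw: string): boolean {
  return /\b(ENOTFOUND|getaddrinfo)\b/.test(raw) || /\bdns\b/i.test(raw);
}

function classifyError(raw: string): ErrorKind {
  // Classify HTML once and reuse the result in every branch below.
  const isHtml = looksLikeHtml(raw);
  if (STATUS_PREFIX.test(raw)) return isHtml ? "html_block" : "http_status";
  // Standalone HTML is checked before the DNS keyword scan, so a challenge
  // page that happens to mention "dns" is not reported as a lookup failure.
  if (isHtml) return "html_block";
  if (looksLikeDnsFailure(raw)) return "dns";
  return "other";
}
```

In this sketch a bare challenge page containing the word "dns" is classified as `html_block`, while a genuine `getaddrinfo ENOTFOUND` message still reaches the DNS branch.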
Peter Steinberger
ad9da24317 test: keep web search config imports stable 2026-04-16 19:58:08 +01:00
Peter Steinberger
c635efd233 chore: prepare 2026.4.15-beta.2 release 2026-04-16 19:58:08 +01:00
Josh Lehman
a327b6750d fix: stabilize context engine prompt cache touches (#67767)
* fix: stabilize context engine prompt cache touches

* fix(changelog): document context-engine prompt cache touch stabilization
2026-04-16 11:53:42 -07:00
Vincent Koc
ac717a92e8 test(gateway): avoid mapped hook provenance event race 2026-04-16 11:35:14 -07:00
zqchris
82e349a48a memory: strip inbound metadata envelopes from user messages in session corpus (#66548)
Merged via squash.

Prepared head SHA: 98562b2a84
Co-authored-by: zqchris <4436110+zqchris@users.noreply.github.com>
Co-authored-by: jalehman <550978+jalehman@users.noreply.github.com>
Reviewed-by: @jalehman
2026-04-16 11:15:44 -07:00
Onur
900e291f31 CI: expand native release validation coverage (#67144)
* Actions: grant reusable release checks actions read

* Actions: use read-all for reusable release checks

* CI: add native cross-OS release checks

* CI: wire Discord smoke secrets for cross-OS checks

* CI: fix native cross-OS installer compatibility

* CI: skip empty pnpm cache saves in matrix jobs

* CI: honor workflow runner override envs

* CI: finish native cross-OS update checks

* CI: fix native cross-OS workflow regressions

* Installer: capture Windows npm stderr safely

* CI: harden cross-OS release checks

* CI: resolve reusable workflow harness ref

* CI: stabilize cross-OS dev update lanes

* CI: tighten release-check workflow semantics

* CI: repoint repaired git CLI on POSIX

* CI: repair native dev-update shell handoff

* CI: preserve real updater semantics

* CI: harden supported release-check refs

* CI: harden release-check refs and fresh mode

* CI: skip dev-update for immutable tag refs

* CI: repair fresh installer release checks

* CI: fix native release check installer lanes

* CI: install release checks from candidate artifacts

* CI: use Windows cmd shims in release checks

* Installer: run Windows npm shim via PowerShell

* CI: pin dev update verification to candidate sha

* CI: pin reusable harness and published installers

* CI: isolate Windows dev-update PATH validation

* CI: align Windows dev-update bootstrap validation

* CI: avoid Windows installer gateway flake

* CI: run cross-OS release checks via TypeScript

* CI: bootstrap tsx for release-check workflow

* CI: fix native release-check follow-ups

* CI: tighten dev-update release checks

* CI: peel annotated workflow refs

* CI: harden native release checks

* CI: fix release-check verifier drift

* CI: fix release-check workflow drift

* CI: fix release-check ref resolution

* CI: harden Windows release-check gateway startup

* CI: fix release-check fallback validation

* CI: harden cross-os release checks

* CI: pin dev-update release checks to candidate SHA

* CI: resolve remote dev target refs

* CI: detect cloned dev-update checkouts

* CI: harden Windows release-check launcher

* Windows: harden task fallback and runner overrides

* Release checks: preserve Windows PATH and baseline version reads

* CI: add release validation live lanes

* CI: expand live and e2e release coverage

* CI: add branch dispatch for live and e2e checks
2026-04-16 19:58:19 +02:00
Daniel Salmerón Amselem
687ede50a5 fix(agents): add prompt cache compatibility opt-out
Add compat.supportsPromptCacheKey for OpenAI Responses prompt_cache_key handling, update generated config baseline, changelog, and A2UI dependency-layout test compatibility.
2026-04-16 10:48:51 -07:00
Viz
f624b1d246 fix(security): 7 P1 hardening fixes — scan-paths, windows-acl, audit-extra (#67003)
* test(security): add coverage tests before security fixes

- scan-paths.ts: 100% line coverage (new test file, previously zero)
- windows-acl.ts: 100% line coverage (SID bypass, whoami throw, no-user null return)
- external-content.ts: 99% (line 248 defensive overlap guard, unreachable)
- skill-scanner.ts: 93% (lines 293-294/330/571 are defensive guards for
  future extensibility, unreachable with current rules/patterns)

200+ tests covering TOCTOU paths, cache invalidation, forced-file escapes,
dir-entry-cache hit, SID world-bypass, diacritic-strip fallback,
fullwidth homoglyph markers, and more.

* fix(security): 5 security hardening fixes in src/security/

scan-paths: default requireRealpath to false (safe). All production callers
already pass requireRealpath: true; default callers are now secure.

windows-acl: block world-equivalent SIDs (S-1-1-0 Everyone etc.) from being
added to trusted set via USERSID env var.

windows-acl: log resolveCurrentUserSid failures instead of bare catch{}.

audit-extra: wrap JSON.parse in readPluginManifestExtensions with try-catch.
Malformed package.json returns [] instead of crashing the audit.

audit-extra: depth guard in listWorkspaceSkillMarkdownFiles to prevent
resource exhaustion from deep symlink cycles.

audit-extra: 2s timeout on fs.realpath in collectWorkspaceSkillSymlinkEscapeFindings
to protect against hanging on slow/network filesystems.

audit-extra: warn about phantom entries in plugins.allow that don't match
any installed plugin (pre-approval exploitation vector).

media-understanding/types: add allowPrivateNetwork to transport overrides
(duplicate of PR #66967, required for tsgo to pass here).

* fix(security): address security review findings in audit-extra.async.ts

Issue 1 — Symlink escape audit bypass on realpath timeout:
When realpathWithTimeout returns null (timeout or failure), the previous code
called 'continue', silently skipping the escape check. An attacker with a
symlink to a slow/network filesystem could hang realpath to prevent escape
detection. Now treats unverifiable symlinks as potential escapes and includes
them in the finding.

Issue 2 — Malformed package.json hides extension entrypoints from deep scan:
readPluginManifestExtensions previously swallowed JSON.parse errors and
returned [], which a malicious plugin could exploit by crafting a malformed
package.json to hide its openclaw.extensions entrypoints from the deep code
scanner. Now re-throws the parse error (with cause) so the caller in
collectPluginsCodeSafetyFindings can surface a warn finding and alert the
user, while still scanning the plugin directory via getCodeSafetySummary.

* fix(security): address PR review findings (P1 + P2)

P1 — BFS realpath in listWorkspaceSkillMarkdownFiles lacks timeout:
Extract realpathWithTimeout to module scope so the BFS dequeue loop
uses the same 2 s guard as the outer escape-detection callers. Previously
only the per-workspace and per-skill-file realpaths had the timeout;
a hanging NFS/SMB directory entry inside the BFS could still block
indefinitely.

P1 (acknowledged limitation) — Promise.race leaves the underlying
fs.realpath call running after timeout. fs.realpath cannot be cancelled
once submitted to libuv. Callers are sequential (one await at a time),
so at most one worker thread is occupied; the OS will eventually time
out the stuck call. This is documented in the module-level JSDoc.

P2 — Phantom allowlist check incorrectly flags bundled plugin IDs:
listChannelPlugins() returns bundled channel plugin IDs (telegram,
discord, browser, etc.) that are never in stateDir/extensions.
Add bundledPluginIds exclusion so the phantom-entry finding is scoped
to user-installed extension IDs only.

P2 — Rename MAX_SYMLINK_DEPTH / depthGuard to MAX_TOTAL_DIR_VISITS /
totalDirVisits to accurately reflect that the guard caps total BFS
iterations (2_000 * 20 = 40_000), not per-path symlink depth.

* fix(security): clean up realpathWithTimeout timer and add regression tests

- Clear the timer handle when fs.realpath resolves before the deadline,
  preventing timer accumulation during large audit runs with many files.
- Add .unref() on the timer so it cannot hold the process alive while
  waiting on a potentially hanging NFS/SMB path.

Regression tests added for three audit-extra.async security fixes:
- manifest parse error: malformed plugin package.json surfaces
  plugins.code_safety.manifest_parse_error (audit-extra.async.test.ts)
- phantom allowlist with bundled exclusion: bundled channel plugin IDs
  are excluded from plugins.allow_phantom_entries warnings; non-installed
  non-bundled IDs are correctly reported (audit-plugins-phantom.test.ts)
- unverifiable realpath escape: fs.realpath failure / timeout produces a
  skills.workspace.symlink_escape finding with 'realpath timed out' in
  the detail (audit-workspace-skill-escape.test.ts)

* chore(security): add TODO for structured logger in windows-acl resolveCurrentUserSid

console.warn is acceptable short-term but may be noisy on constrained
Windows hosts; note the follow-up in-code so it is not lost.

* chore: drop unrelated formatting churn from security PR

Restores extensions/memory-lancedb/config.ts and
src/agents/pi-embedded-helpers/errors.ts to their origin/main state.
These were line-wrap-only formatting changes with no relation to the
security fixes in this branch.

* fix(security): address Codex P2 review findings

1. Normalize plugins.allow entries through normalizePluginId before
   phantom-entry filtering so that bundled plugin aliases and legacy IDs
   are correctly excluded. Without this, valid allow entries that resolve
   via alias normalization could generate false-positive phantom warnings.

2. Surface a skills.workspace.scan_truncated warn finding when the BFS
   visit cap (MAX_TOTAL_DIR_VISITS) is hit mid-traversal. Previously the
   scanner silently returned partial results, allowing escaped SKILL.md
   symlinks in the unvisited tree to go undetected.

   listWorkspaceSkillMarkdownFiles now returns {skillFilePaths, truncated}
   and collectWorkspaceSkillSymlinkEscapeFindings emits the new finding
   when truncated is true.

Regression test added for the truncation path using a mocked readdir
that fills the queue past the cap (40 001 fake entries) and a mocked
realpath for zero-I/O iteration speed.
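
The visit cap and the `truncated` flag described above can be sketched as follows. This is an illustration under stated assumptions, not the code from the PR: it walks a hypothetical in-memory `Tree` instead of the real filesystem, and `listSkillFiles` and its parameters are invented names. Only `skillFilePaths`, `truncated`, and the idea of a total-directory-visit cap come from the commit message.

```typescript
// Hypothetical in-memory directory tree keyed by absolute directory path.
type Tree = Record<string, { dirs: string[]; files: string[] }>;

interface ScanResult {
  skillFilePaths: string[];
  truncated: boolean;
}

function listSkillFiles(tree: Tree, root: string, maxTotalDirVisits: number): ScanResult {
  const queue: string[] = [root];
  const seen = new Set<string>([root]);
  const skillFilePaths: string[] = [];
  let totalDirVisits = 0;

  while (queue.length > 0) {
    // Cap total BFS iterations; report truncation instead of returning
    // partial results silently while directories remain unvisited.
    if (totalDirVisits >= maxTotalDirVisits) {
      return { skillFilePaths, truncated: true };
    }
    const dir = queue.shift() as string;
    totalDirVisits += 1;

    const entry = tree[dir];
    if (!entry) continue;
    for (const file of entry.files) {
      if (file === "SKILL.md") skillFilePaths.push(`${dir}/${file}`);
    }
    for (const child of entry.dirs) {
      const childPath = `${dir}/${child}`;
      if (!seen.has(childPath)) {
        seen.add(childPath);
        queue.push(childPath);
      }
    }
  }
  return { skillFilePaths, truncated: false };
}
```

A caller can then emit a warning finding whenever `truncated` is true, which is the behavior the commit message attributes to collectWorkspaceSkillSymlinkEscapeFindings.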

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
2026-04-16 13:40:05 -04:00
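The timeout guard discussed in the commit above (Promise.race around fs.realpath, a cleared timer, and `.unref()`) might look roughly like the sketch below. The name `realpathWithTimeout` and the null-on-timeout-or-failure behavior come from the commit message; the generic `withTimeout` helper, the parameter names, and the 2000 ms default are assumptions for illustration.

```typescript
import { realpath } from "node:fs/promises";

// Resolves to null if `work` rejects or does not settle within `ms`.
// Note: the underlying operation is not cancelled; it keeps running in the
// background, which matches the limitation acknowledged in the commit message.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T | null> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<null>((resolve) => {
    const t = setTimeout(() => resolve(null), ms);
    t.unref(); // the timer alone must not keep the process alive
    timer = t;
  });
  return Promise.race([work.catch(() => null), deadline]).finally(() => {
    // Clear the timer when the work wins the race, so timers do not pile up
    // across many calls.
    if (timer !== undefined) clearTimeout(timer);
  });
}

// Hypothetical wrapper: null means "could not verify", which callers should
// treat as a potential escape rather than skip.
function realpathWithTimeout(path: string, ms = 2000): Promise<string | null> {
  return withTimeout(realpath(path), ms);
}
```

Callers in this sketch are expected to treat a null result as unverifiable, not as safe, in line with the symlink-escape fix described in the same commit.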
Peter Steinberger
b31d243c57 fix: stabilize skills prompt ordering (#64198) (thanks @Bartok9) 2026-04-16 10:28:22 -07:00
Bartok
c4488d5ef5 fix: pin localeCompare to 'en' locale for cross-environment stability
Addresses review feedback: localeCompare without a fixed locale uses the
runtime default, which varies across servers. Pinning 'en' ensures
byte-identical prompts for cache stability. Applied at all three sort
points in workspace.ts.
2026-04-16 10:28:22 -07:00
Bartok Moltbot
a4b94f77b9 fix(skills): sort available_skills alphabetically for prompt cache stability
Sort the merged skill entries by name before rendering into the
available_skills prompt block. Previously the order depended on
Map insertion order which varies with skills.load.extraDirs config,
causing identical deployments to produce different prompts and bypass
LLM prompt caching.

Two sort points added:
1. loadSkillEntries — canonical ordering at the source
2. resolveWorkspaceSkillPromptState — ensures prompt stability even
   when callers pass pre-built entry arrays

Fixes #64167
2026-04-16 10:28:22 -07:00
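The two skills commits above combine into one idea: sort entries by name with a pinned locale before rendering, so the prompt text does not depend on Map insertion order or the host's default locale. A minimal sketch, assuming a hypothetical `SkillEntry` shape and hypothetical function names (`sortSkillsForPrompt`, `renderAvailableSkills`); the real code in workspace.ts may differ.

```typescript
// Hypothetical entry shape; only the sort-by-name idea comes from the commits.
interface SkillEntry {
  name: string;
  description: string;
}

// Copy, then sort by name with the locale pinned to "en" so the order does
// not vary with the runtime's default locale.
function sortSkillsForPrompt(entries: SkillEntry[]): SkillEntry[] {
  return [...entries].sort((a, b) => a.name.localeCompare(b.name, "en"));
}

// Render a prompt block whose text is identical for the same set of skills,
// regardless of the order in which they were loaded.
function renderAvailableSkills(entries: SkillEntry[]): string {
  return sortSkillsForPrompt(entries)
    .map((entry) => `- ${entry.name}: ${entry.description}`)
    .join("\n");
}
```

Two deployments that load the same skills in different orders would render the same string here, which is what a prompt cache needs in order to hit.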
Ted Li
eb10803691 fix(prompt): keep inbound chat ids out of system prefix 2026-04-16 09:41:26 -07:00
Peter Steinberger
1183832d4f fix: pin codex resume sandbox override 2026-04-16 17:31:41 +01:00
Peter Steinberger
d842ec4179 fix: tighten delivery mirror dedupe (#67185) (thanks @andyylin) 2026-04-16 09:20:01 -07:00
Andy
e95efa4373 fix(sessions): dedupe redundant delivery mirrors 2026-04-16 09:20:01 -07:00
Peter Steinberger
86f108401b fix: share agent harness runtime activation (#67474) 2026-04-16 09:06:45 -07:00
duqaXxX
f4bbd0122a test(plugins): remove useless spread in startup config fixture 2026-04-16 09:06:45 -07:00
duqaXxX
69ba924b53 fix(codex): activate harness plugin for forced runtime 2026-04-16 09:06:45 -07:00
Ayaan Zaidi
16c608e393 fix: harden cron announce NO_REPLY suppression (#65016) (thanks @BKF-Gitty) 2026-04-16 21:36:43 +05:30
Peter Steinberger
892baf2e81 test: align PDF tool expectations with Opus 4.7 2026-04-16 08:56:56 -07:00
Peter Steinberger
461d0050d9 fix: keep codex resume runs non-interactive (#67666) (thanks @plgonzalezrx8) 2026-04-16 08:41:57 -07:00
Pedro Gonzalez
4c66978591 security(codex): restore sandbox protections for resumed CLI sessions 2026-04-16 08:41:57 -07:00
Peter Steinberger
628b454eff feat: default Anthropic to Opus 4.7 2026-04-16 16:12:06 +01:00
Ayaan Zaidi
75c551e89e fix: harden node-host shell payload mutability checks 2026-04-16 20:34:17 +05:30
tmimmanuel
29919bb6e4 fix: land node-host approval binding for native binaries (#66731) (thanks @tmimmanuel)
* fix(node-host): allow absolute-path native binaries through approval binder

* test(node-host): cover binary binder edge cases

* test(node-host): use stable native binary fixture

* fix(ci): restore fail-closed race handling

* refactor(node-host): distill approval binding regressions

* fix(node-host): fail closed on unknown shell payload headers

* fix: land node-host approval binding for native binaries (#66731) (thanks @tmimmanuel)

* fix: keep relative shell binary payloads fail-closed (#66731) (thanks @tmimmanuel)

* fix: keep shell binary bypass on stable paths only (#66731) (thanks @tmimmanuel)

* fix: fail closed on symlinked shell binary targets (#66731) (thanks @tmimmanuel)

---------

Co-authored-by: Ayaan Zaidi <hi@obviy.us>
2026-04-16 20:30:09 +05:30
Chunyue Wang
8c11210fe5 fix(gateway): capture config hash after plugin auto-enable to prevent restart loop (#67557)
Merged via squash.

Prepared head SHA: 07250958a7
Co-authored-by: openperf <80630709+openperf@users.noreply.github.com>
Reviewed-by: @openperf
2026-04-16 21:18:24 +08:00
stain lu
c3c7a9953f fix: repair sanitized replay tool results before send (#67620) (thanks @stainlu)
* fix(agents): preserve native Anthropic tool IDs for hybrid providers

Fixes #66892

MiniMax and other hybrid providers use api.minimaxi.com/anthropic
(modelApi: anthropic-messages), which generates and expects native
Anthropic tool_call_ids in toolu_* format. The hybrid replay policy
(buildHybridAnthropicOrOpenAIReplayPolicy) applied strict
sanitization that stripped underscores from these IDs, causing
MiniMax to reject them with error 2013.

The native Anthropic provider already preserved these IDs via
preserveNativeAnthropicToolUseIds (added in 4613f121ad). This
commit enables the same flag for the hybrid anthropic-messages
branch, so toolu_* IDs pass through unsanitized while other
synthetic IDs still get strict cleanup.

* fix(agents): repair sanitized replay tool results before send

* fix: repair sanitized replay tool results before send (#67620) (thanks @stainlu)

* fix: preserve aborted-span tool results during replay sanitize (#67620) (thanks @stainlu)

---------

Co-authored-by: Ayaan Zaidi <hi@obviy.us>
2026-04-16 18:38:57 +05:30
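The first bullet of the commit above describes letting native `toolu_*` IDs pass through while other IDs still get strict cleanup. The sketch below illustrates that branch. The flag name `preserveNativeAnthropicToolUseIds` is taken from the commit message; the function name, the native-ID pattern, and the exact meaning of "strict" cleanup (here: drop everything except ASCII letters and digits) are assumptions, so treat this as an illustration and not as the project's replay policy.

```typescript
interface SanitizeOptions {
  preserveNativeAnthropicToolUseIds: boolean;
}

// Assumption: a native ID is "toolu_" followed by letters and digits.
const NATIVE_ANTHROPIC_TOOL_ID = /^toolu_[A-Za-z0-9]+$/;

function sanitizeToolCallId(id: string, opts: SanitizeOptions): string {
  if (opts.preserveNativeAnthropicToolUseIds && NATIVE_ANTHROPIC_TOOL_ID.test(id)) {
    // Pass native IDs through unchanged so the provider recognizes them.
    return id;
  }
  // Assumed strict cleanup: keep only ASCII letters and digits, which also
  // strips the underscore that the hybrid provider expects in native IDs.
  return id.replace(/[^A-Za-z0-9]/g, "");
}
```

With the flag off, `toolu_01AbCd` loses its underscore, which is the failure mode the commit message reports for MiniMax; with the flag on, only non-native IDs are rewritten.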
Ayaan Zaidi
de129a6530 fix: restrict HTML timeout short-circuit to transient statuses 2026-04-16 18:33:35 +05:30
Ayaan Zaidi
3525273930 fix: keep TUI watchdog bound to active run (#67401) (thanks @xantorres) 2026-04-16 18:31:56 +05:30
Xan Torres
d7f489f85e Gateway/skills: dedupe skills prefix-match + drop dead fallback on log 2026-04-16 18:31:56 +05:30
Xan Torres
f44ab20d4d TUI/streaming: add watchdog that resets the activity indicator after delta silence 2026-04-16 18:31:56 +05:30
Xan Torres
36ed36768c Agents/tool-loop: enable unknown-tool stream guard by default 2026-04-16 18:31:56 +05:30
Xan Torres
b23d59a522 Gateway/skills: invalidate session skills snapshot on config write 2026-04-16 18:31:56 +05:30