121 Commits

Author SHA1 Message Date
scotthuang
7920af0c9e refactor: route browser screenshot vision through shared media understanding
* feat(browser): add optional vision understanding to screenshot tool

* fix(browser): wrap vision output as external content, enforce maxBytes, forward auth profiles

* fix(browser): remove no-op scope/attachments config, drop profile pass-through lacking runtime support

* feat(media-understanding): add profile/preferredProfile to DescribeImageFileWithModelParams and forward to describeImage

* style(browser): add curly braces to satisfy eslint curly rule

* fix(browser): correct tools.browser.enabled help text to match actual behavior

* fix(browser): thread agentDir/workspaceDir from plugin tool context into browser vision

* refactor(browser): move vision config from tools.browser to browser.models

The browser plugin's vision configuration now lives on the top-level
`browser` config namespace (browser.models, browser.visionEnabled,
browser.visionPrompt, etc.) instead of `tools.browser`. This aligns
with the plugin's existing config location and avoids confusion between
tool-level and plugin-level settings.

- Remove tools.browser from ToolsSchema and ToolsConfig
- Add models/vision* fields to BrowserConfig and its zod schema
- Update getBrowserVisionConfig to read from cfg.browser
- Update schema help, labels, and quality test
- Update vision.test.ts to use new config shape

* docs(browser): add screenshot vision configuration section

Document the new browser.models config for automatic screenshot
description via vision models, enabling text-only main models to
reason about web page content.

* fix(browser): remove deliverable media markers from vision result, drop unused import

P1: Vision-success path no longer exposes the raw screenshot as
deliverable media (removes MEDIA: line and details.media.mediaUrl).
This prevents channel delivery from auto-sending sensitive page content
when the intended output is a text description.

P2: Remove unused ToolsMediaUnderstandingSchema import that would fail
noUnusedLocals typecheck.

* fix(browser): add command/args fields to browser models schema

The browser vision model schema uses .strict(), so CLI-type entries
with command/args were rejected by TypeScript. Add these fields to
align with MediaUnderstandingModelSchema.

* chore(browser): remove debug console.log statements

* fix(browser): harden screenshot vision result against MEDIA: directive injection and restore image sanitization on failure fallback

ClawSweeper #84247 review round 2:

P1 (security, high): neutralize line-start MEDIA: directives in vision descriptions
before wrapping with wrapExternalContent. The agent media extractor scans every
browser tool-result text block via splitMediaFromOutput which treats line-start
MEDIA: as a trusted local-media delivery directive, and browser is on the
trusted-media allowlist. Without neutralization, page or vision-provider output
containing 'MEDIA:/tmp/secret.png' could synthesize a channel-deliverable media
artifact from untrusted content. wrapExternalContent itself does not strip
line-start directives. Introduce neutralizeMediaDirectives in vision.ts that
prepends '[neutralized] ' to any line whose trimStart() begins with MEDIA:
(case-insensitive), defanging the parser anchor while keeping the original
text human-readable.
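
The neutralization step described above can be sketched as follows. This is a minimal illustration built only from the commit text, not the actual code in vision.ts; the regular expressions and the fast-path check are assumptions.

```typescript
// Sketch only: the real neutralizeMediaDirectives in vision.ts may differ.
// Matches a line whose first non-whitespace characters are "MEDIA:" (any case).
const LINE_START_MEDIA = /^\s*MEDIA:/i;

function neutralizeMediaDirectives(text: string): string {
  // Fast path: text that never mentions MEDIA: is returned unchanged.
  if (!/MEDIA:/i.test(text)) {
    return text;
  }
  return text
    .split("\n")
    .map((line) =>
      // Prefixing the line moves MEDIA: away from the line start, so a
      // line-anchored parser no longer sees a directive, while the original
      // text stays readable.
      LINE_START_MEDIA.test(line) ? `[neutralized] ${line}` : line,
    )
    .join("\n");
}
```

Mid-line occurrences such as "see MEDIA: below" are left untouched because only the line-start form acts as a parser anchor.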

P2 (compatibility): pass resolveRuntimeImageSanitization() to imageResultFromFile
in the vision-failure catch fallback. The non-vision screenshot path already
forwards this option (d5cc0d53b7) so configured agents.defaults.imageMaxDimensionPx
takes effect. Without this fix, any provider timeout/error silently bypasses the
sanitization guard and returns a raw full-resolution screenshot.

Regression coverage:
- vision.test.ts: 6 unit cases for neutralizeMediaDirectives (no-op fast path,
  mid-line MEDIA: untouched, line-start defanged, leading-whitespace defanged,
  case-insensitive, multiple directives per blob).
- browser-tool.test.ts: 2 integration cases that drive the full screenshot
  tool execute path:
    - 'neutralizes MEDIA: directives in vision text and does not attach media'
      asserts no line matches /^\s*MEDIA:/i in returned text, secret path text
      is preserved verbatim, details.media is absent, and imageResultFromFile
      is not called on the success path.
    - 'preserves screenshot image sanitization on vision failure fallback'
      mocks describeImageFileWithModel to reject and asserts the fallback
      imageResultFromFile call receives imageSanitization: {maxDimensionPx:1600}
      plus the 'browser screenshot vision failed' extraText.

* fix(browser): apply clawsweeper fallback media fix from PR #84247

* refactor: reuse media image understanding for browser screenshots

* refactor: use structured media delivery

* test: update music completion media instruction expectation

* fix: trim buffered reply directive padding

* test: refresh codex prompt snapshots for message media aliases

---------

Co-authored-by: scotthuang <scotthuang@tencent.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-05-31 00:00:19 +01:00
Lucas Giordano
eb7e237151 docs(browser): add Notte cloud browser to direct WebSocket CDP providers
Notte exposes a CDP-compatible WebSocket gateway at
wss://us-prod.notte.cc/sessions/connect?token=<NOTTE_API_KEY> that
auto-creates a session on connect — the same shape OpenClaw's existing
"Direct WebSocket CDP providers" section was generically framed for
(per #31085).

Real behaviour proof (against wss://us-prod.notte.cc/sessions/connect):

  $ openclaw browser --browser-profile notte open https://example.com
  opened: https://example.com/
  tab: t4
  id: 7FE04AC44931A6E1C799DE4ABF0DC807

A screenshot captured against the same session is a 1254x1111 PNG of
the rendered example.com page.

Playwright connectOverCDP flow against the same URL (today):

  connectOverCDP                                      695ms
  context.newCDPSession(page)                         169ms
  session.send('Target.getTargetInfo') → targetId     87ms
  page.goto('https://example.com')                    631ms
  total                                               1.8s

AI-assisted (Claude Opus 4.7). codex review --base origin/main returned
clean. See PR description for the full pre-flight checklist.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 22:17:32 +02:00
Matthew Kern
a37ebb2d49 fix(browser): bypass managed proxy for loopback CDP
Keep browser CDP managed-proxy bypasses on the private bundled-plugin SSRF helper, strip WebSocket URL credentials before registering exact bypass URLs, and document the managed-browser loopback proxy behavior.
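
A minimal sketch of stripping credentials from a WebSocket URL before registering it, assuming the standard WHATWG `URL` class. The function name `stripUrlCredentials` is hypothetical and not taken from the commit.

```typescript
// Hypothetical helper: removes the user:password part from a URL.
function stripUrlCredentials(rawUrl: string): string {
  const url = new URL(rawUrl);
  // Clearing both fields drops the "user:password@" segment on serialization.
  url.username = "";
  url.password = "";
  return url.toString();
}
```

Only userinfo credentials are removed in this sketch; secrets carried in query parameters would need separate handling.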

Co-authored-by: Matthew Kern <matthew@matthewkern.xyz>
2026-05-23 23:53:27 +01:00
Peter Steinberger
64d13c017a docs: refresh contributor docs
Co-authored-by: Quratulain-bilal <umayaimanshah@gmail.com>
Co-authored-by: Mariano Belinky <mbelinky@gmail.com>
Co-authored-by: tao <itaofe@gmail.com>
Co-authored-by: julian <julian@tencent.com>
Co-authored-by: xenouzik <xenouziq@gmail.com>
Co-authored-by: Olamiposi <56056759+posigit@users.noreply.github.com>
Co-authored-by: surlymochan <surlymo@apache.org>
Co-authored-by: Janaka A <contact@janaka.co.uk>
Co-authored-by: choiking <samsamuels1927@gmail.com>
2026-05-22 22:58:27 +01:00
Peter Steinberger
7afac6015f feat(browser): surface observed dialogs (#83099)
2026-05-18 00:05:29 +01:00
Ayaan Zaidi
d40e062800 docs(browser): note Docker Chromium autodetect
2026-05-10 11:37:37 +05:30
the sun gif man
d4b4660026 config: stop automatic writes and guard Nix mutators (#78047)
Keep startup-derived plugin enablement, gateway auth tokens, control UI origins, and owner-display secrets runtime-only instead of persisting them into openclaw.json.

Refuse config writers, mutating update/plugin lifecycle commands, and doctor repair/token generation in Nix mode with agent-first nix-openclaw guidance.

Verification:
- pnpm check
- pnpm build
- pnpm test -- src/config/io.write-config.test.ts src/config/mutate.test.ts src/config/io.owner-display-secret.test.ts src/gateway/server-startup-config.recovery.test.ts src/gateway/startup-auth.test.ts src/gateway/startup-control-ui-origins.test.ts src/cli/plugins-cli.install.test.ts src/cli/plugins-cli.policy.test.ts src/cli/plugins-cli.uninstall.test.ts src/cli/plugins-cli.update.test.ts src/cli/update-cli.test.ts src/auto-reply/reply/commands-plugins.install.test.ts src/auto-reply/reply/commands-plugins.test.ts src/commands/onboarding-plugin-install.test.ts src/commands/doctor.runs-legacy-state-migrations-yes-mode-without.e2e.test.ts src/commands/doctor/shared/codex-route-warnings.test.ts src/commands/doctor/repair-sequencing.test.ts src/agents/auth-profile-runtime-contract.test.ts src/auto-reply/reply/agent-runner-execution.test.ts
- GitHub CI green on 05a2c71b90

Co-authored-by: Codex <noreply@openai.com>
2026-05-06 14:43:32 +02:00
Vincent Koc
7a39551685 docs: typography hygiene + 2 in-body H1 removals across 5 pages
Replaced 92 typography characters (curly quotes, apostrophes, em/en
dashes, non-breaking hyphens) with ASCII equivalents per
docs/CLAUDE.md heading and content hygiene rules.

- docs/channels/feishu.md: 19 chars; removed the duplicate
  '# Feishu / Lark' H1 (Mintlify renders title from frontmatter; the
  in-body H1 with a slash produced a brittle anchor).
- docs/gateway/bonjour.md: 18 chars; removed the duplicate
  '# Bonjour / mDNS discovery' H1.
- docs/channels/matrix.md: 19 chars
- docs/tools/browser.md: 18 chars
- docs/automation/standing-orders.md: 18 chars
2026-05-05 19:54:53 -07:00
Peter Steinberger
b4b21cbc93 fix(browser): circuit-break managed launch failures
2026-04-27 09:58:14 +01:00
Peter Steinberger
f97cc58760 fix(browser): auto-start configured browser plugin
2026-04-27 09:37:10 +01:00
Vincent Koc
41268ded2d docs: full-page sentence-case sweep across 5 worst-offender pages
- channels/msteams: 8 H2/H3 (Federated Authentication, Local Development, Known Limitations, Reply Style, Presentation Cards, Private Channels, etc.)
- auth-credential-semantics: 4 H2 (Stable Probe Reason Codes, Token Credentials, Explicit Auth Order Filtering, Probe Target Resolution)
- tools/browser: preserve brand-named headings (Browserless, WebSocket CDP, Chrome MCP, Control API, Brave); minor cleanup
- security/CONTRIBUTING-THREAT-MODEL: 4 H2/H3 (What We Use, Risk Levels, Review Process; Threat IDs preserved as branded label)
- gateway/multiple-gateways: 4 H2 (Best Recommended Setup, Why This Works, General Multi-Gateway Setup, Isolation Checklist)
2026-04-26 23:58:35 -07:00
Peter Steinberger
ed1ac2fc44 feat(browser): add CDP role snapshot fallback
2026-04-26 04:40:26 +01:00
Quratulain-bilal
7d58362f3f docs(browser): note tilde expansion also covers per-profile paths (#71601)
* docs(browser): note tilde expansion also covers per-profile paths

The 95a2c9b fix expanded "~" for both `browser.executablePath` and
per-profile `profiles.<name>.executablePath` (config.ts:382 calls
`normalizeExecutablePath` for profile overrides). Per-profile
`userDataDir` on existing-session profiles is also tilde-expanded
(config.ts:391 via `resolveUserPath`). The configuration reference
only mentioned the top-level `browser.executablePath` case.
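
Tilde expansion of this kind can be sketched as below. The helper name `expandTilde` is hypothetical (the commit names `normalizeExecutablePath` and `resolveUserPath` as the real helpers), and the home directory is passed in explicitly here so the sketch stays self-contained.

```typescript
// Hypothetical helper: expands a leading "~" to the given home directory.
function expandTilde(input: string, homeDir: string): string {
  if (input === "~") {
    return homeDir;
  }
  if (input.startsWith("~/")) {
    // Drop trailing slashes on the home directory to avoid "//" in the result.
    return `${homeDir.replace(/\/+$/, "")}/${input.slice(2)}`;
  }
  // Absolute paths, relative paths, and "~user" forms are left unchanged.
  return input;
}
```

Under this sketch the same function would apply equally to `browser.executablePath`, `profiles.<name>.executablePath`, and a per-profile `userDataDir`.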

* docs(browser): align tilde path config help

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-04-25 20:05:03 +01:00
Quratulain-bilal
8170df9127 docs(browser): document local startup timeout bounds (#71672)
* docs(browser): document local startup timeout bounds

The new browser.localLaunchTimeoutMs and browser.localCdpReadyTimeoutMs
options are clamped to MAX_BROWSER_STARTUP_TIMEOUT_MS (120000 ms) by
normalizeStartupTimeoutMs in extensions/browser/src/browser/config.ts,
and zero/negative/non-finite values fall back to the defaults. Without
this in the configuration reference, users setting a higher value see
no error and silently get the 120 s ceiling, or set 0 expecting 'no
timeout' and silently get the default.
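
The clamping behavior described above can be sketched as follows. The constant name and the 120000 ms ceiling come from the commit text; the function signature and the `fallbackMs` parameter are assumptions, since the actual default values are not stated here.

```typescript
const MAX_BROWSER_STARTUP_TIMEOUT_MS = 120_000;

// Sketch only: the real normalizeStartupTimeoutMs in config.ts may differ.
// fallbackMs stands in for the (unstated) default of each option.
function normalizeStartupTimeoutMs(value: unknown, fallbackMs: number): number {
  // Zero, negative, and non-finite values fall back to the default.
  if (typeof value !== "number" || !Number.isFinite(value) || value <= 0) {
    return fallbackMs;
  }
  // Anything above the ceiling is silently clamped.
  return Math.min(value, MAX_BROWSER_STARTUP_TIMEOUT_MS);
}
```

With this shape, a configured value of 300000 yields 120000, and a configured value of 0 yields whatever default is passed in, which matches the two surprises the commit message calls out.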

* docs(browser): clarify startup timeout validation

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-04-25 19:59:53 +01:00
Peter Steinberger
617e1dd6bf fix(browser): honor remote CDP open timeouts
2026-04-25 18:52:57 +01:00
Peter Steinberger
88df8fe09d fix(browser): clarify Browserless CDP attach handling
2026-04-25 18:26:57 +01:00
Peter Steinberger
f3ba962fd0 fix(subagents): explain browser tool profile filtering
2026-04-25 17:59:05 +01:00
Vincent Koc
9b1dd9e573 docs(browser): document Chrome MCP per-profile mcpCommand/mcpArgs and cdpUrl mapping
Vincent's commit ab1d1a5c9e (#71560) added user-facing config keys to
existing-session profiles for the Chrome DevTools MCP launch path:

- browser.profiles.<name>.mcpCommand
- browser.profiles.<name>.mcpArgs

Plus runtime behavior changes:

- cdpUrl http(s) -> --browserUrl, cdpUrl ws(s) -> --wsEndpoint
- endpoint flags and userDataDir are mutually exclusive
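
The cdpUrl mapping above can be sketched as a small function. The name `endpointArgsForCdpUrl` is hypothetical, and passing the flag and value as two separate arguments is an assumption. The mutual exclusion with userDataDir is not modeled here.

```typescript
// Hypothetical helper: picks the Chrome MCP endpoint flag from the cdpUrl scheme.
function endpointArgsForCdpUrl(cdpUrl: string): string[] {
  const protocol = new URL(cdpUrl).protocol; // e.g. "http:", "wss:"
  if (protocol === "http:" || protocol === "https:") {
    return ["--browserUrl", cdpUrl];
  }
  if (protocol === "ws:" || protocol === "wss:") {
    return ["--wsEndpoint", cdpUrl];
  }
  // Assumption: other schemes are treated as a configuration error.
  throw new Error(`Unsupported cdpUrl scheme: ${protocol}`);
}
```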

The CHANGELOG entry covered the change, but docs/tools/browser.md
existing-session reference did not. Add a 'Custom Chrome MCP launch'
subsection describing the new fields and the cdpUrl endpoint mapping
rules.
2026-04-25 05:54:54 -07:00
Peter Steinberger
b2b898c2a8 feat(browser): configure local startup timeouts
2026-04-25 12:30:35 +01:00
Peter Steinberger
c52ec520c7 feat(browser): add one-shot headless start override
2026-04-25 11:42:03 +01:00
Peter Steinberger
9b48e4c0b6 fix(browser): fall back to headless on Linux without display
2026-04-25 11:13:42 +01:00
Peter Steinberger
5ac36c9719 fix(browser): detect more Linux Chromium installs (#48563)
Co-authored-by: Catalin Lupuleti <105351510+lupuletic@users.noreply.github.com>
2026-04-25 09:12:09 +01:00
Peter Steinberger
5376a4a5d6 fix(browser): default act timeout budget
Co-authored-by: Andy Lin <andyylin@users.noreply.github.com>
2026-04-25 08:11:48 +01:00
Peter Steinberger
209d50b52c feat(browser): add coordinate click action
Co-authored-by: Daniel Lutts <daniellutts@10-19-94-204.dynapool.wireless.nyu.edu>
2026-04-25 07:31:33 +01:00
Vincent Koc
c948c63bbd docs: unify casing and replace path-as-text links across recent doc surfaces
Sweep recent (last ~5h) doc edits for two readability/uniformity issues:

- Replace 42 path-as-text links of the form '[/foo/bar](/foo/bar)' with
  descriptive labels derived from each target page's frontmatter title
  (e.g. '[Anthropic]', '[Token use and costs]', '[OpenAI-compatible
  endpoints]'). Affected files include gateway/troubleshooting,
  concepts/oauth, reference/session-management-compaction, and
  reference/transcript-hygiene.
- Sentence-case Title-Cased headings and link text in Related sections
  across codex-harness, model-providers, tools/plugin, sdk-runtime,
  sdk-setup, prompt-caching, ci, cli/config, google-meet, browser,
  rich-output-protocol, subagents, web/control-ui, while preserving
  brand and proper-noun capitalization (OpenAI, Codex, Chrome, Parallels,
  Z.AI, etc.).
2026-04-24 22:18:22 -07:00
Peter Steinberger
b0e834b2d9 fix(browser): support per-profile executable paths
Co-authored-by: nobrainer-tech <nobrainer-tech@users.noreply.github.com>
2026-04-25 05:50:20 +01:00
Peter Steinberger
a98a0b94d1 fix: isolate browser proxy routing
Co-authored-by: Sanjays2402 <Sanjays2402@users.noreply.github.com>
2026-04-25 03:49:06 +01:00
Peter Steinberger
95a2c9bcdc fix: expand browser executable home paths
2026-04-25 03:16:14 +01:00
Peter Steinberger
ae5c657367 fix: clean up idle browser tabs
2026-04-25 03:08:24 +01:00
Peter Steinberger
d610e2cc6c feat(browser): support per-profile headless
Co-authored-by: nakamotoliu <nakamotoliu2026@gmail.com>
Co-authored-by: Nakamoto <nakamoto@claude.ai>
2026-04-25 01:49:22 +01:00
Peter Steinberger
30aa1f890a feat(browser): expose doctor diagnostics to agents
Co-authored-by: Sean Coley <github@seancoley.me>
2026-04-25 01:15:31 +01:00
Peter Steinberger
60e7b692cc docs(browser): document inspection diagnostics
2026-04-25 00:56:35 +01:00
Peter Steinberger
82020bd787 feat(browser): prefer suggested tab targets
2026-04-25 00:35:26 +01:00
Peter Steinberger
dea05aae6b docs(browser): explain automation skill and tab handles
2026-04-25 00:24:33 +01:00
Peter Steinberger
a437666a37 fix(browser): reject existing-session type timeouts
2026-04-24 08:29:25 +01:00
Vincent Koc
743b69d307 docs(tools): split browser docs by extracting control API and CLI reference
2026-04-23 19:34:50 -07:00
Vincent Koc
4a2cd533ac docs: remove duplicate H1 where frontmatter title already sets it
2026-04-23 13:11:14 -07:00
Vincent Koc
f1662bff92 docs(tools): restructure browser reference with tabs/accordions and trim redundant prose
2026-04-23 08:34:16 -07:00
Peter Steinberger
b648830632 fix: clarify browser playwright-core install guidance
2026-04-22 21:47:58 +01:00
Mariano
8cb73844c8 browser: route existing-session user profile through browser nodes (#68891)
* browser: route user profile through browser nodes

* browser: align existing-session node docs

* browser: preserve host fallback on node discovery errors

* browser: preserve configured node pin errors

* browser: widen config mock in node pin test
2026-04-19 12:21:23 +02:00
Viz
4cfc8cd5be fix(browser): discover CDP websocket from bare ws:// URL before attach (#68715)
* fix(browser): discover CDP websocket from bare ws:// URL before attach

When browser.cdpUrl is set to a bare ws://host:port (no /devtools/ path), ensureBrowserAvailable would call isChromeReachable -> canOpenWebSocket against the URL verbatim. Chrome only accepts WebSocket upgrades at the specific path returned by /json/version, so the handshake failed immediately with HTTP 400. With attachOnly: true, that surfaced as:

  Browser attachOnly is enabled and profile "openclaw" is not running.

even though the CDP endpoint was reachable and the profile was healthy. Reproduced by the new tests in chrome.test.ts and cdp.test.ts (#68027).

Fix: introduce isDirectCdpWebSocketEndpoint(url) — true only when a ws/wss URL has a /devtools/<kind>/<id> handshake path. Route any other ws/wss cdpUrl (including the bare ws://host:port shape) through HTTP /json/version discovery by normalising the scheme via the existing normalizeCdpHttpBaseForJsonEndpoints helper. Apply this in isChromeReachable, getChromeWebSocketUrl, and createTargetViaCdp. Direct WS endpoints with a /devtools/ path are still opened without an extra discovery round-trip.
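
The predicate described in the fix can be sketched as follows. This is an illustration derived from the commit text, not the actual helper; the exact path pattern is an assumption.

```typescript
// Sketch only: true when a ws/wss URL already carries a
// /devtools/<kind>/<id> handshake path.
function isDirectCdpWebSocketEndpoint(url: string): boolean {
  let parsed: URL;
  try {
    parsed = new URL(url);
  } catch {
    return false; // not a parseable URL at all
  }
  if (parsed.protocol !== "ws:" && parsed.protocol !== "wss:") {
    return false;
  }
  // Require both a <kind> and an <id> segment after /devtools/.
  return /^\/devtools\/[^/]+\/[^/]+/.test(parsed.pathname);
}
```

A bare `ws://host:port` returns false under this sketch, which is what sends it through `/json/version` discovery instead of a verbatim handshake.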

Fixes #68027

* test(browser): add seeded fuzz coverage for CDP URL helpers

Adds property-based / seeded-fuzz tests for the URL helpers the
attachOnly CDP fix depends on (#68027):

  - isWebSocketUrl
  - isDirectCdpWebSocketEndpoint
  - normalizeCdpHttpBaseForJsonEndpoints
  - parseBrowserHttpUrl
  - redactCdpUrl
  - appendCdpPath
  - getHeadersWithAuth

Follows the existing repo convention (see
src/gateway/http-common.fuzz.test.ts): no fast-check dep, small
mulberry32 PRNG + hand-rolled generators, deterministic per-describe
seeds so failures are reproducible.
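
For readers unfamiliar with the convention, a seeded-fuzz setup of this kind might look like the sketch below. `mulberry32` is the widely circulated 32-bit PRNG; the generator helpers `randomInt` and `pick` are hypothetical and are not taken from the repository's tests.

```typescript
// mulberry32: small deterministic PRNG returning floats in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed | 0;
  return () => {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical hand-rolled generators built on top of the PRNG.
function randomInt(rng: () => number, min: number, max: number): number {
  return min + Math.floor(rng() * (max - min + 1)); // inclusive bounds
}

function pick<T>(rng: () => number, items: readonly T[]): T {
  return items[randomInt(rng, 0, items.length - 1)];
}
```

Because the sequence depends only on the seed, a failing case can be replayed by rerunning the same describe block with the same seed.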

Lifts cdp.helpers.ts coverage from 77.77% -> 89.54% statements,
67.9% -> 80.24% branches, 78% -> 90% lines. Remaining uncovered
lines are inside the WS sender internals (createCdpSender,
withCdpSocket, fetchCdpChecked rate-limit branch), which require
integration-style mocks and are unrelated to the attachOnly fix.

* test(browser): drive cdp.helpers/cdp/chrome to 100% coverage

Lifts the three files touched by the #68027 attachOnly fix to 100% statements/branches/functions/lines across the extensions test suite. Adds cdp.helpers.internal.test.ts, cdp.internal.test.ts, and chrome.internal.test.ts covering error paths, branch matrices, CDP session helpers, Chrome spawn/launch/stop flows, and canRunCdpHealthCommand. Defensively unreachable guards are annotated with c8 ignore + inline justifications.

* fix(browser): restore WS fallback for non-/devtools ws:// CDP URLs

When /json/version discovery is unavailable (or returns no
webSocketDebuggerUrl), fall back to treating the original bare ws/wss
URL as a direct WebSocket endpoint. This preserves the #68027 fix for
Chrome's debug port while restoring compatibility with Browserless/
Browserbase-style providers that expose a direct WebSocket root without
a /json/version endpoint.

Priority order for bare ws/wss cdpUrl inputs:
  1. /devtools/<kind>/<id> URL → direct handshake, no discovery (unchanged)
  2. bare ws/wss root → try HTTP discovery first; if discovery returns a
     webSocketDebuggerUrl use it; otherwise fall back to the original URL
     as a direct WS endpoint
  3. HTTP/HTTPS URL → HTTP discovery only, no fallback (unchanged)

Affected call sites: isChromeReachable, getChromeWebSocketUrl,
createTargetViaCdp.
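
The priority order above can be summarized as a pure classification step. The sketch below is illustrative only: the `CdpStrategy` type and `classifyCdpUrl` name are hypothetical, and the actual discovery and WebSocket calls are left out.

```typescript
type CdpStrategy =
  | "direct" // case 1: handshake immediately, no discovery
  | "discover-then-direct-fallback" // case 2: try /json/version, else use URL as-is
  | "discover-only"; // case 3: HTTP discovery, no fallback

// Hypothetical helper mapping a cdpUrl to the strategy used to reach it.
function classifyCdpUrl(cdpUrl: string): CdpStrategy {
  const { protocol, pathname } = new URL(cdpUrl);
  const isWebSocket = protocol === "ws:" || protocol === "wss:";
  if (isWebSocket && /^\/devtools\/[^/]+\/[^/]+/.test(pathname)) {
    return "direct";
  }
  if (isWebSocket) {
    return "discover-then-direct-fallback";
  }
  if (protocol === "http:" || protocol === "https:") {
    return "discover-only";
  }
  // Assumption: schemes outside ws/wss/http/https are rejected.
  throw new Error(`Unsupported cdpUrl scheme: ${protocol}`);
}
```

Under this sketch, a provider URL that exposes a WebSocket root without a `/json/version` endpoint lands in case 2 and still connects through the fallback.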

Also renames a misleading test ('still enforces SSRF policy for direct
WebSocket URLs') to accurately describe what it tests: SSRF enforcement
on the navigation target URL, not on the CDP endpoint.

New tests added for all three fallback paths. Coverage remains 100% on
all three touched files (238 tests).

* fix: browser attachOnly bare ws CDP follow-ups (#68715) (thanks @visionik)
2026-04-19 05:43:39 -04:00
Mason Huang
7eecfa411d fix(browser): unblock loopback CDP readiness under strict SSRF defaults (#66354)
Merged via squash.

Prepared head SHA: d9030ff2f0
Co-authored-by: hxy91819 <8814856+hxy91819@users.noreply.github.com>
Reviewed-by: @hxy91819
2026-04-14 16:30:43 +08:00
Agustin Rivera
905f19230a Align external marker span mapping (#63885)
* fix(markers): align external marker spans

* fix(browser): ssrfPolicy defaults fail-closed for unconfigured installs (GHSA-53vx-pmqw-863c)

* fix(browser): enforce strict default SSRF policy

* chore(changelog): add browser SSRF default + marker alignment entry

---------

Co-authored-by: Devin Robison <drobison@nvidia.com>
2026-04-10 12:35:20 -06:00
Josh Avant
f096fc4406 Browser: unify /act route action execution and contract errors (#63977)
* Browser: unify agent act route execution and contracts

* Browser tests: lock act error codes and dedupe harness dispatch

* Browser tests: slim act harness dispatch map

* Browser act: enforce top-level targetId match

* Browser tests: cover missing act error codes

* Browser act: restore wait cap and reject zero resize dims

* Docs: document /act error contract

* Browser act: lock selector precedence and positive resize validation

* Browser act: restore interaction cap and harden contract tests

* docs: note browser act contract consolidation (#63977) (thanks @joshavant)
2026-04-09 22:54:33 -05:00
Peter Steinberger
8efb0801a0 docs: refresh browser existing-session limit refs
2026-04-04 20:26:30 +01:00
Peter Steinberger
022618e887 docs: refresh browser auth refs
2026-04-04 14:04:24 +01:00
Peter Steinberger
29f062770d docs: refresh browser stop cleanup refs
2026-04-04 12:02:10 +01:00
Peter Steinberger
0bc9f0b5ba docs: refresh browser screenshot route refs
2026-04-04 11:58:46 +01:00
Peter Steinberger
53d3fbcef6 docs: refresh browser existing session docs
2026-04-04 11:51:07 +01:00
Vincent Koc
74830c7bac docs: add Related sections to 6 major tool pages
Add cross-linking Related sections to tool pages that were dead ends:
- exec, exec-approvals, browser, pdf, skills, lobster

Each page now links to 2-4 related topics for navigation continuity.
2026-03-31 14:34:56 +09:00