openclaw/docs/cli/browser.md
clawsweeper[bot] 24e729fc4e feat(browser): extend --labels overlay to full-page and element captures (#92834)
Summary:
- The replacement PR extends Browser plugin labeled screenshots to honor Playwright full-page/ref/element scope, returns annotation bounding boxes, and updates docs, tests, and skill guidance.
- PR surface: Source +415, Tests +550, Docs +24. Total +989 across 12 files.
- Reproducibility: yes. Current main source shows the labeled Playwright helper ignores fullPage/ref/element and omits annotations, and the source PR supplies live before/after commands for the Browser plugin path.

Automerge notes:
- PR branch already contained follow-up commit before automerge: docs(browser): correct raw-CDP labels caveat in automation skill
- PR branch already contained follow-up commit before automerge: fix(browser): preserve labelsSkipped semantics for off-viewport refs
- PR branch already contained follow-up commit before automerge: docs(browser): scope labels docs by driver
- PR branch already contained follow-up commit before automerge: docs(browser): fix labels annotation indent and document scope fix
- PR branch already contained follow-up commit before automerge: docs(browser): indent annotations box schema under --labels bullet
- PR branch already contained follow-up commit before automerge: docs(browser): indent labels annotation schema

Validation:
- ClawSweeper review passed for head 70aca6c506.
- Required merge gates passed before the squash merge.

Prepared head SHA: 70aca6c506
Review: https://github.com/openclaw/openclaw/pull/92834#issuecomment-4700431344

Co-authored-by: FMLS <kfliuyang@gmail.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Mason Huang <masonxhuang@tencent.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: hxy91819
Co-authored-by: hxy91819 <8814856+hxy91819@users.noreply.github.com>
2026-06-14 02:21:23 +00:00


summary: CLI reference for `openclaw browser` (lifecycle, profiles, tabs, actions, state, and debugging)
read_when:
  • You use `openclaw browser` and want examples for common tasks
  • You want to control a browser running on another machine via a node host
  • You want to attach to your local signed-in Chrome via Chrome MCP
title: Browser

openclaw browser

Manage OpenClaw's browser control surface and run browser actions (lifecycle, profiles, tabs, snapshots, screenshots, navigation, input, state emulation, and debugging).

Related:

Common flags

  • --url <gatewayWsUrl>: Gateway WebSocket URL (defaults to config).
  • --token <token>: Gateway token (if required).
  • --timeout <ms>: request timeout (ms).
  • --expect-final: wait for a final Gateway response.
  • --browser-profile <name>: choose a browser profile (default from config).
  • --json: machine-readable output (where supported).
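
The shared flags go between `openclaw browser` and the subcommand, as in the examples on this page. A minimal wrapper sketch that adds them from environment variables is shown below. The function name `oc_browser` and the variables `OC_GATEWAY_URL`, `OC_GATEWAY_TOKEN`, and `OC_PROFILE` are hypothetical names chosen for illustration; OpenClaw does not read them itself.

```shell
# Hypothetical wrapper: OC_GATEWAY_URL, OC_GATEWAY_TOKEN, and OC_PROFILE are
# illustrative variable names, not settings that OpenClaw reads on its own.
oc_browser() {
  # Prepend each shared flag only when its variable is set, so the flags
  # always land before the subcommand.
  if [ -n "${OC_PROFILE:-}" ]; then
    set -- --browser-profile "$OC_PROFILE" "$@"
  fi
  if [ -n "${OC_GATEWAY_TOKEN:-}" ]; then
    set -- --token "$OC_GATEWAY_TOKEN" "$@"
  fi
  if [ -n "${OC_GATEWAY_URL:-}" ]; then
    set -- --url "$OC_GATEWAY_URL" "$@"
  fi
  openclaw browser "$@"
}
```

With `OC_PROFILE=work` exported, `oc_browser --json tabs` runs `openclaw browser --browser-profile work --json tabs`.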

Quick start (local)

openclaw browser profiles
openclaw browser --browser-profile openclaw start
openclaw browser --browser-profile openclaw open https://example.com
openclaw browser --browser-profile openclaw snapshot

Agents can run the doctor readiness check with browser({ action: "doctor" }).

Quick troubleshooting

If start fails with "not reachable after start", troubleshoot CDP readiness first. If start and tabs succeed but open or navigate fails, the browser control plane is healthy and the failure is usually caused by the navigation SSRF policy.

Minimal sequence:

openclaw browser --browser-profile openclaw doctor
openclaw browser --browser-profile openclaw start
openclaw browser --browser-profile openclaw tabs
openclaw browser --browser-profile openclaw open https://example.com

Detailed guidance: Browser troubleshooting
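
The triage rule above can be sketched as a small shell function that stops at the first failing stage. This is an illustration, not an OpenClaw feature: it assumes each command exits non-zero on failure, and the function name `triage_browser` and the messages it prints are made up for this example.

```shell
# Sketch only: assumes the CLI exits non-zero when a stage fails.
# The messages printed here are illustrative, not OpenClaw output.
triage_browser() {
  profile="${1:-openclaw}"
  url="${2:-https://example.com}"

  if ! openclaw browser --browser-profile "$profile" start >/dev/null 2>&1; then
    echo "start failed: troubleshoot CDP readiness first (run doctor)"
    return 1
  fi
  if ! openclaw browser --browser-profile "$profile" tabs >/dev/null 2>&1; then
    echo "tabs failed: see Browser troubleshooting"
    return 1
  fi
  if ! openclaw browser --browser-profile "$profile" open "$url" >/dev/null 2>&1; then
    # start and tabs succeeded, so the control plane is healthy.
    echo "open failed: check the navigation SSRF policy"
    return 2
  fi
  echo "ok"
}
```

Call it as `triage_browser openclaw https://example.com`; exit status 2 marks the case where only navigation failed.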

Lifecycle

openclaw browser status
openclaw browser doctor
openclaw browser doctor --deep
openclaw browser start
openclaw browser start --headless
openclaw browser stop
openclaw browser --browser-profile openclaw reset-profile

Notes:

  • doctor --deep adds a live snapshot probe. It is useful when basic CDP readiness is green but you want proof that the current tab can be inspected.
  • For attachOnly and remote CDP profiles, openclaw browser stop closes the active control session and clears temporary emulation overrides even when OpenClaw did not launch the browser process itself.
  • For local managed profiles, openclaw browser stop stops the spawned browser process.
  • openclaw browser start --headless applies only to that start request and only when OpenClaw launches a local managed browser. It does not rewrite browser.headless or profile config, and it is a no-op for an already-running browser.
  • On Linux hosts without DISPLAY or WAYLAND_DISPLAY, local managed profiles run headless automatically unless OPENCLAW_BROWSER_HEADLESS=0, browser.headless=false, or browser.profiles.<name>.headless=false explicitly requests a visible browser.
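
To predict the automatic headless fallback before starting a browser, the environment part of that rule can be checked from the shell. The sketch below is deliberately partial: it only looks at environment variables and cannot see browser.headless or browser.profiles.<name>.headless, either of which can also request a visible browser. The function name `would_default_headless` is hypothetical.

```shell
# Environment-only sketch of the Linux headless fallback described above.
# It cannot see browser.headless or browser.profiles.<name>.headless.
would_default_headless() {
  # Optional first argument overrides the detected OS (useful for testing).
  os="${1:-$(uname -s)}"
  [ "$os" = "Linux" ] || return 1
  # Any display server means the automatic fallback does not apply.
  if [ -n "${DISPLAY:-}" ] || [ -n "${WAYLAND_DISPLAY:-}" ]; then
    return 1
  fi
  # OPENCLAW_BROWSER_HEADLESS=0 explicitly requests a visible browser.
  [ "${OPENCLAW_BROWSER_HEADLESS:-}" != "0" ]
}
```

For example, `would_default_headless && echo "expect a headless browser"` prints the message on a Linux host with no display and no override in the environment.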

If the command is missing

If openclaw browser is an unknown command, check plugins.allow in ~/.openclaw/openclaw.json.

When plugins.allow is present, list the bundled browser plugin explicitly unless the config already has a root browser block:

{
  plugins: {
    allow: ["telegram", "browser"],
  },
}

An explicit root browser block, for example browser.enabled=true or browser.profiles.<name>, also activates the bundled browser plugin under a restrictive plugin allowlist.

Related: Browser tool

Profiles

Profiles are named browser routing configs. In practice:

  • openclaw: launches or attaches to a dedicated OpenClaw-managed Chrome instance (isolated user data dir).
  • user: controls your existing signed-in Chrome session via Chrome DevTools MCP.
  • custom CDP profiles: point at a local or remote CDP endpoint.

openclaw browser profiles
openclaw browser create-profile --name work --color "#FF5A36"
openclaw browser create-profile --name chrome-live --driver existing-session
openclaw browser create-profile --name remote --cdp-url https://browser-host.example.com
openclaw browser delete-profile --name work

Use a specific profile:

openclaw browser --browser-profile work tabs

Tabs

openclaw browser tabs
openclaw browser tab new --label docs
openclaw browser tab label t1 docs
openclaw browser tab select 2
openclaw browser tab close 2
openclaw browser open https://docs.openclaw.ai --label docs
openclaw browser focus docs
openclaw browser close t1

tabs returns suggestedTargetId first, then the stable tabId such as t1, the optional label, and the raw targetId. Agents should pass suggestedTargetId back into focus, close, snapshots, and actions. You can assign a label with open --label, tab new --label, or tab label; labels, tab ids, raw target ids, and unique target-id prefixes are all accepted. The request field is still named targetId for compatibility, but it accepts any of these tab references.

Treat raw target ids as diagnostic handles, not durable agent memory: they are volatile, so prefer suggestedTargetId. When Chromium replaces the underlying raw target during a navigation or form submit, OpenClaw keeps the stable tabId/label attached to the replacement tab when it can prove the match.
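
As an illustration of addressing a tab by label instead of a raw target id, the sketch below opens a labeled tab, focuses it, takes a snapshot, and closes it again. The function name `with_labeled_tab` is hypothetical, and the sketch assumes that snapshot acts on the tab that was just focused.

```shell
# Sketch: open a tab under a label, then refer to it by that label only.
# Assumes snapshot acts on the tab that was just focused.
with_labeled_tab() {
  label="$1"
  url="$2"
  openclaw browser open "$url" --label "$label" || return 1
  # Labels are accepted wherever a tab reference is expected.
  openclaw browser focus "$label" || return 1
  openclaw browser snapshot
  rc=$?
  # Close by label even if the snapshot failed, then report its status.
  openclaw browser close "$label"
  return "$rc"
}
```

Usage: `with_labeled_tab docs https://docs.openclaw.ai`.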

Snapshot / screenshot / actions

Snapshot:

openclaw browser snapshot
openclaw browser snapshot --urls

Screenshot:

openclaw browser screenshot
openclaw browser screenshot --full-page
openclaw browser screenshot --ref e12
openclaw browser screenshot --labels

Notes:

  • --full-page is for page captures only; it cannot be combined with --ref or --element.
  • existing-session / user profiles support page screenshots and --ref screenshots from snapshot output, but not CSS --element screenshots.
  • --labels overlays current snapshot refs on the screenshot. Behavior depends on the driver:
      • Playwright-backed profiles: --labels works with --full-page (full-page label overlay), --ref (element-clip label overlay by ARIA ref), and --element (element-clip label overlay by CSS selector). In element-clip modes, labels are projected relative to the element.
      • Playwright-backed profiles also return an annotations array with each ref's bounding box. Each item has ref, number, role, optional name, and box: {x, y, width, height}. Coordinates are in the captured image's space (viewport, full-page, or element-relative). The field is omitted when empty.
      • existing-session profiles render a chrome-mcp overlay on page screenshots but do not use the Playwright projection helper and do not include annotations. CSS --element screenshots are unsupported there.
      • Without Playwright or chrome-mcp, labeled screenshots are not available.
      • Prior releases ignored --full-page, --ref, and --element on labeled Playwright screenshots and always returned a viewport capture. Labeled screenshots now honor those scopes.
  • snapshot --urls appends discovered link destinations to AI snapshots so agents can choose direct navigation targets instead of guessing from link text alone.
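
The scope rule in the notes above (--full-page cannot be combined with --ref or --element) can be checked in a script before the screenshot command is sent. This is a client-side sketch with a hypothetical function name, `check_screenshot_flags`; it only inspects flag names and does not replace validation by OpenClaw.

```shell
# Sketch: reject flag sets that mix --full-page with an element scope.
check_screenshot_flags() {
  full_page=0
  scoped=0
  for arg in "$@"; do
    case "$arg" in
      --full-page) full_page=1 ;;
      --ref|--element) scoped=1 ;;
    esac
  done
  if [ "$full_page" -eq 1 ] && [ "$scoped" -eq 1 ]; then
    echo "--full-page cannot be combined with --ref or --element" >&2
    return 1
  fi
  return 0
}
```

Usage: `check_screenshot_flags --labels --ref e12 && openclaw browser screenshot --labels --ref e12`.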

Navigate/click/type (ref-based UI automation):

openclaw browser navigate https://example.com
openclaw browser click <ref>
openclaw browser click-coords 120 340
openclaw browser type <ref> "hello"
openclaw browser press Enter
openclaw browser hover <ref>
openclaw browser scrollintoview <ref>
openclaw browser drag <startRef> <endRef>
openclaw browser select <ref> OptionA OptionB
openclaw browser fill --fields '[{"ref":"1","value":"Ada"}]'
openclaw browser wait --text "Done"
openclaw browser evaluate --fn '(el) => el.textContent' --ref <ref>
openclaw browser evaluate --fn 'const title = document.title; return title;'
openclaw browser evaluate --timeout-ms 30000 --fn 'async () => { await window.ready; return true; }'

evaluate --fn accepts a function source, an expression, or a statement body. Statement bodies are wrapped as async functions, so use return for the value you want back. Use evaluate --timeout-ms <ms> when the page-side function may need longer than the default evaluate timeout.
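
As a side-by-side illustration, the sketch below reads the page title once in each of the three accepted forms. The function name `page_title_three_ways` is hypothetical, and the snippets are examples rather than recommended usage.

```shell
# Sketch: the three source forms that --fn accepts, one call each.
page_title_three_ways() {
  # 1. Function source.
  openclaw browser evaluate --fn '() => document.title'
  # 2. Expression.
  openclaw browser evaluate --fn 'document.title'
  # 3. Statement body: wrapped as an async function, so it needs an
  #    explicit return for the value to come back.
  openclaw browser evaluate --fn 'const t = document.title; return t;'
}
```

The statement-body form is the only one where leaving out return loses the value.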

When an action triggers a page replacement and OpenClaw can prove which tab replaced the original, the action response returns the current raw targetId. Scripts should still store and pass suggestedTargetId or labels for long-lived workflows.

File + dialog helpers:

openclaw browser upload /tmp/openclaw/uploads/file.pdf --ref <ref>
openclaw browser upload media://inbound/file.pdf --ref <ref>
openclaw browser waitfordownload
openclaw browser download <ref> report.pdf
openclaw browser dialog --accept
openclaw browser dialog --dismiss --dialog-id d1

Managed Chrome profiles save ordinary click-triggered downloads into the OpenClaw downloads directory (/tmp/openclaw/downloads by default, or the configured temp root). Use waitfordownload or download when the agent needs to wait for a specific file and return its path; those explicit waiters own the next download.

Uploads accept files from the OpenClaw temp uploads root and OpenClaw-managed inbound media, including media://inbound/<id> and sandbox-relative media/inbound/<id> references. Nested media refs, traversal, and arbitrary local paths remain rejected.

When an action opens a modal dialog, the action response returns blockedByDialog with browserState.dialogs.pending; pass --dialog-id to answer it directly. Dialogs handled outside OpenClaw appear under browserState.dialogs.recent.
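
The upload path rules above can be approximated with a client-side pre-check, so a script fails early instead of waiting for a rejected request. This is a sketch under assumptions: OpenClaw enforces the real rules, the function name `upload_path_looks_allowed` is hypothetical, and the default `UPLOADS_ROOT` is taken from the example path on this page and may differ when a temp root is configured.

```shell
# Client-side sketch of the documented upload rules; OpenClaw enforces the
# real checks. UPLOADS_ROOT is an assumption based on the example path.
UPLOADS_ROOT="${UPLOADS_ROOT:-/tmp/openclaw/uploads}"

upload_path_looks_allowed() {
  path="$1"
  case "$path" in
    ..|../*|*/..|*/../*) return 1 ;;                    # traversal
    media://inbound/*/*|media/inbound/*/*) return 1 ;;  # nested media refs
    media://inbound/?*|media/inbound/?*) return 0 ;;    # managed inbound media
    "$UPLOADS_ROOT"/?*) return 0 ;;                     # temp uploads root
    *) return 1 ;;                                      # arbitrary local paths
  esac
}
```

Usage: `upload_path_looks_allowed "$file" && openclaw browser upload "$file" --ref <ref>`.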

State and storage

Viewport + emulation:

openclaw browser resize 1280 720
openclaw browser set viewport 1280 720
openclaw browser set offline on
openclaw browser set media dark
openclaw browser set timezone Europe/London
openclaw browser set locale en-GB
openclaw browser set geo 51.5074 -0.1278 --accuracy 25
openclaw browser set device "iPhone 14"
openclaw browser set headers '{"x-test":"1"}'
openclaw browser set credentials myuser mypass

Cookies + storage:

openclaw browser cookies
openclaw browser cookies set session abc123 --url https://example.com
openclaw browser cookies clear
openclaw browser storage local get
openclaw browser storage local set token abc123
openclaw browser storage session clear

Debugging

openclaw browser console --level error
openclaw browser pdf
openclaw browser responsebody "**/api"
openclaw browser highlight <ref>
openclaw browser errors --clear
openclaw browser requests --filter api
openclaw browser trace start
openclaw browser trace stop --out trace.zip

Existing Chrome via MCP

Use the built-in user profile, or create your own existing-session profile:

openclaw browser --browser-profile user tabs
openclaw browser create-profile --name chrome-live --driver existing-session
openclaw browser create-profile --name brave-live --driver existing-session --user-data-dir "~/Library/Application Support/BraveSoftware/Brave-Browser"
openclaw browser create-profile --name chrome-port --driver existing-session --cdp-url http://127.0.0.1:9222
openclaw browser --browser-profile chrome-live tabs

The default existing-session path is host-only Chrome MCP auto-connect. If the browser is already running with a DevTools endpoint, pass --cdp-url so Chrome MCP attaches to that endpoint instead. For Docker, Browserless, or other remote setups where Chrome MCP semantics are not needed, use a CDP profile.
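
The choice between the three setups can be summarized as a small lookup that prints the matching create-profile command without running it. The scenario names (`local-chrome`, `devtools-endpoint`, `remote-cdp`) and the function name `profile_command_for` are labels invented for this sketch; the printed commands use only the flags shown on this page.

```shell
# Sketch: print (not run) the create-profile command for each setup.
# The scenario names are illustrative labels, not OpenClaw options.
profile_command_for() {
  scenario="$1"
  name="$2"
  endpoint="${3:-}"
  case "$scenario" in
    local-chrome)       # host Chrome, Chrome MCP auto-connect
      echo "openclaw browser create-profile --name $name --driver existing-session" ;;
    devtools-endpoint)  # browser already exposes a DevTools endpoint
      echo "openclaw browser create-profile --name $name --driver existing-session --cdp-url $endpoint" ;;
    remote-cdp)         # Docker, Browserless, or other remote setups
      echo "openclaw browser create-profile --name $name --cdp-url $endpoint" ;;
    *)
      echo "unknown scenario: $scenario" >&2
      return 1 ;;
  esac
}
```

Usage: `profile_command_for devtools-endpoint chrome-port http://127.0.0.1:9222` prints the command to review before running it.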

Current existing-session limits:

  • snapshot-driven actions use refs, not CSS selectors
  • browser.actionTimeoutMs defaults supported act requests to 60000 ms when callers omit timeoutMs; per-call timeoutMs still wins.
  • click is left-click only
  • type does not support slowly=true
  • press does not support delayMs
  • hover, scrollintoview, drag, select, fill, and evaluate reject per-call timeout overrides
  • select supports one value only
  • wait --load networkidle is not supported
  • file uploads require --ref / --input-ref, do not support CSS --element, and currently support one file at a time
  • dialog hooks do not support --timeout
  • screenshots support page captures and --ref, but not CSS --element
  • responsebody, download interception, PDF export, and batch actions still require a managed browser or raw CDP profile

Remote browser control (node host proxy)

If the Gateway runs on a different machine than the browser, run a node host on the machine that has Chrome/Brave/Edge/Chromium. The Gateway will proxy browser actions to that node (no separate browser control server required).

Use gateway.nodes.browser.mode to control auto-routing and gateway.nodes.browser.node to pin a specific node if multiple are connected.

Security + remote setup: Browser tool, Remote access, Tailscale, Security