diff --git a/CHANGELOG.md b/CHANGELOG.md index ec683bc8a3a..007e484cdad 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -103,6 +103,8 @@ Docs: https://docs.openclaw.ai - WhatsApp/groups+direct: setting `systemPrompt: ""` on a specific `groups.` or `direct.` entry now suppresses the wildcard system prompt instead of falling through to it, so users can silence the global prompt for a specific group or peer. (#70381) Thanks @Bluetegu. - Browser/tool: tell agents not to pass per-call `timeoutMs` on existing-session type, evaluate, and other Chrome MCP actions that reject timeout overrides. Thanks @steipete. - Browser/tool: use Playwright's current AI aria snapshot API for `refs="aria"` and fall back to role refs when a node browser cannot provide aria refs, so agents can still inspect and click controls such as Google Meet admission buttons. Thanks @steipete. +- Browser/tool: expose stable `tabId` handles such as `t1` plus optional tab labels, and accept those handles anywhere a browser tab target is needed. Thanks @steipete. +- Browser/tool: bundle a `browser-automation` skill with the multi-step snapshot, stable-tab, stale-ref, and manual-blocker loop for agent-controlled pages. Thanks @steipete. - Plugins/Google Meet: use browser automation to classify and clear Meet's microphone-choice interstitial during browser meeting creation, and reuse in-progress create tabs on retry instead of opening duplicates. Thanks @steipete. - Codex/GPT-5.4: harden fallback, auth-profile, tool-schema, and replay edge cases across native and embedded runtime paths. (#70743) Thanks @100yenadmin. - Models/fallback: resolve bare fallback model provider ids before model switching, so configured fallback chains keep working when a fallback is named without an explicit provider prefix. Thanks @steipete. diff --git a/docs/cli/browser.md b/docs/cli/browser.md index 3ca9ea0e164..2ae7052f12b 100644 --- a/docs/cli/browser.md +++ b/docs/cli/browser.md @@ -111,14 +111,20 @@ openclaw browser --browser-profile work tabs ```bash openclaw browser tabs -openclaw browser tab new +openclaw browser tab new --label docs +openclaw browser tab label t1 docs openclaw browser tab select 2 openclaw browser tab close 2 -openclaw browser open https://docs.openclaw.ai -openclaw browser focus -openclaw browser close +openclaw browser open https://docs.openclaw.ai --label docs +openclaw browser focus docs +openclaw browser close t1 ``` +`tabs` returns the raw `targetId` plus a stable `tabId` such as `t1`. You can +also assign a label with `open --label`, `tab new --label`, or `tab label`. +`focus`, `close`, snapshots, and actions accept the raw `targetId`, `tabId`, +label, or a unique target-id prefix. + ## Snapshot / screenshot / actions Snapshot: diff --git a/docs/concepts/system-prompt.md b/docs/concepts/system-prompt.md index 7ef9e72923f..e6c261713da 100644 --- a/docs/concepts/system-prompt.md +++ b/docs/concepts/system-prompt.md @@ -171,6 +171,10 @@ Eligibility includes skill metadata gates, runtime environment/config checks, and the effective agent skill allowlist when `agents.defaults.skills` or `agents.list[].skills` is configured. +Plugin-bundled skills are eligible only when their owning plugin is enabled. +This lets tool plugins expose deeper operating guides without embedding all of +that guidance directly in every tool description. + ``` diff --git a/docs/tools/browser.md b/docs/tools/browser.md index 29b737d5160..3c6522ced06 100644 --- a/docs/tools/browser.md +++ b/docs/tools/browser.md @@ -23,6 +23,9 @@ Beginner view: - A separate browser profile named **openclaw** (orange accent by default). - Deterministic tab control (list/open/focus/close). - Agent actions (click/type/drag/select), snapshots, screenshots, PDFs. +- A bundled `browser-automation` skill that teaches agents the snapshot, + stable-tab, stale-ref, and manual-blocker recovery loop when the browser + plugin is enabled. - Optional multi-profile support (`openclaw`, `work`, `remote`, ...). This browser is **not** your daily driver. It is a safe, isolated surface for @@ -63,6 +66,22 @@ Defaults need both `plugins.entries.browser.enabled` **and** `browser.enabled=tr Browser config changes require a Gateway restart so the plugin can re-register its service. +## Agent guidance + +The browser plugin ships two levels of agent guidance: + +- The `browser` tool description carries the compact always-on contract: pick + the right profile, keep refs on the same tab, use `tabId`/labels for tab + targeting, and load the browser skill for multi-step work. +- The bundled `browser-automation` skill carries the longer operating loop: + check status/tabs first, label task tabs, snapshot before acting, resnapshot + after UI changes, recover stale refs once, and report login/2FA/captcha or + camera/microphone blockers as manual action instead of guessing. + +Plugin-bundled skills are listed in the agent's available skills when the +plugin is enabled. The full skill instructions are loaded on demand, so routine +turns do not pay the full token cost. + ## Missing browser command or tool If `openclaw browser` is unknown after an upgrade, `browser.request` is missing, or the agent reports the browser tool as unavailable, the usual cause is a `plugins.allow` list that omits `browser`. Add it: @@ -494,7 +513,8 @@ Compared to the managed `openclaw` profile, existing-session drivers are more co - **Dedicated user data dir**: never touches your personal browser profile. - **Dedicated ports**: avoids `9222` to prevent collisions with dev workflows. -- **Deterministic tab control**: target tabs by `targetId`, not “last tab”. +- **Deterministic tab control**: target tabs by raw `targetId`, stable `tabId` + handles such as `t1`, or labels you assign with `open --label` / `tab label`. ## Browser selection diff --git a/docs/tools/skills.md b/docs/tools/skills.md index 9df8495d08c..867a833831b 100644 --- a/docs/tools/skills.md +++ b/docs/tools/skills.md @@ -81,9 +81,13 @@ slash-command discovery, sandbox sync, and skill snapshots. Plugins can ship their own skills by listing `skills` directories in `openclaw.plugin.json` (paths relative to the plugin root). Plugin skills load -when the plugin is enabled. Today those directories are merged into the same -low-precedence path as `skills.load.extraDirs`, so a same-named bundled, -managed, agent, or workspace skill overrides them. +when the plugin is enabled. This is the right place for tool-specific operating +guides that are too long for the tool description but should be available +whenever the plugin is installed; for example, the browser plugin ships a +`browser-automation` skill for multi-step browser control. Today those +directories are merged into the same low-precedence path as +`skills.load.extraDirs`, so a same-named bundled, managed, agent, or workspace +skill overrides them. You can gate them via `metadata.openclaw.requires.config` on the plugin’s config entry. See [Plugins](/tools/plugin) for discovery/config and [Tools](/tools) for the tool surface those skills teach.