openclaw/docs/tools/tool-search.md at fb1dfd486bb9aca05055d88c51efe4fbc279a9fc

mirror of https://github.com/openclaw/openclaw.git synced 2026-05-28 22:06:49 +00:00


Peter Steinberger bb46b79d3c refactor: internalize OpenClaw agent runtime (#85341)


2026-05-27 19:24:04 +01:00

8.2 KiB


summary: Tool Search: compact large OpenClaw tool catalogs behind search, describe, and call

title: Tool Search

read_when:

  You want OpenClaw agents to use a large tool catalog without adding every tool schema to the prompt

  You want OpenClaw tools, MCP tools, and client tools exposed through one compact runtime surface

  You are implementing or debugging tool discovery for OpenClaw runs

Tool Search is an experimental OpenClaw agent runtime feature. It gives agents one compact way to discover and call tools from large catalogs. It is useful when a run has many available tools but the model is likely to need only a few of them.

This page documents OpenClaw Tool Search. It is not the Codex-native tool search or dynamic-tools surface. Codex-native code mode, tool search, deferred dynamic tools, and nested tool calls are stable Codex harness surfaces and do not depend on tools.toolSearch.

When enabled for OpenClaw runs, the model receives one tool_search_code tool by default. That tool runs a short JavaScript body in an isolated Node subprocess with an openclaw.tools bridge:

const hits = await openclaw.tools.search("create a GitHub issue");
const tool = await openclaw.tools.describe(hits[0].id);
return await openclaw.tools.call(tool.id, {
  title: "Crash on startup",
  body: "Steps to reproduce...",
});

The catalog can include OpenClaw tools, plugin tools, MCP tools, and client-provided tools. The model does not see every full schema up front. Instead, it searches compact descriptors, describes one selected tool when it needs the exact schema, and calls that tool through OpenClaw.

Codex harness runs do not receive these experimental OpenClaw Tool Search controls. OpenClaw passes product capabilities to Codex as dynamic tools, and Codex owns the stable native code mode, native tool search, deferred dynamic tools, and nested tool calls.

How a turn runs

At planning time the OpenClaw embedded runner builds the effective catalog for the run:

Resolve the active tool policy for the agent, profile, sandbox, and session.
List eligible OpenClaw and plugin tools.
List eligible MCP tools through the session MCP runtime.
Add eligible client tools supplied for the current run.
Index compact descriptors for search.
Expose either the OpenClaw code bridge or the structured fallback tools to the model.
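
The planning steps above can be pictured as a filter-and-merge pass over the tool sources. The sketch below is illustrative only: `buildEffectiveCatalog`, the `isAllowed` policy callback, the `source:name` id format, and the descriptor fields are hypothetical names chosen for the example, not OpenClaw internals.

```javascript
// Hypothetical sketch: merge tool sources, apply policy, keep compact descriptors.
function buildEffectiveCatalog({ openclawTools, mcpTools, clientTools, isAllowed }) {
  const sources = [
    ["openclaw", openclawTools],
    ["mcp", mcpTools],
    ["client", clientTools],
  ];
  const catalog = new Map();
  for (const [source, tools] of sources) {
    for (const tool of tools) {
      // Policy filtering runs before indexing, so blocked tools are never searchable.
      if (!isAllowed(tool, source)) continue;
      const id = `${source}:${tool.name}`; // assumed id format for this example
      // Compact descriptor: name and summary only; the full schema stays out of the index.
      catalog.set(id, { id, source, name: tool.name, summary: tool.description ?? "" });
    }
  }
  return catalog;
}

const catalog = buildEffectiveCatalog({
  openclawTools: [{ name: "read", description: "Read a file" }],
  mcpTools: [{ name: "create_event", description: "Create a calendar event" }],
  clientTools: [{ name: "open_panel", description: "Open an app panel" }],
  isAllowed: (tool) => tool.name !== "open_panel", // assumed policy for the example
});
console.log([...catalog.keys()]);
```

In this toy run the client tool is filtered out by the assumed policy, so only the two remaining descriptors are indexed.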

At execution time every real tool call returns to OpenClaw. The isolated Node runtime does not hold plugin implementations, MCP client objects, or secrets. openclaw.tools.call(...) crosses the bridge back into the Gateway, where the normal policy, approval, hook, logging, and result handling still apply.

Modes

tools.toolSearch has two model-facing modes:

code: exposes tool_search_code, the default compact JavaScript bridge.
tools: exposes tool_search, tool_describe, and tool_call as plain structured tools for providers that should not receive code.

Both modes use the same catalog and execution path. The only difference is the shape the model sees. If the current runtime cannot launch the isolated Node child process that code mode needs, the default code mode falls back to tools mode before catalog compaction.
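
As a rough illustration of the mode choice and its fallback, the sketch below maps a mode to the tool names the model would see. The function name `modelFacingTools` and the `canLaunchIsolatedNode` flag are hypothetical; only the tool names come from this page.

```javascript
// Hypothetical sketch: decide which model-facing tools to expose for a run.
function modelFacingTools(mode, canLaunchIsolatedNode) {
  // Code mode needs the isolated Node child process; without it, fall back to tools mode.
  const effective = mode === "code" && !canLaunchIsolatedNode ? "tools" : mode;
  return effective === "code"
    ? { mode: "code", tools: ["tool_search_code"] }
    : { mode: "tools", tools: ["tool_search", "tool_describe", "tool_call"] };
}

console.log(modelFacingTools("code", true));
console.log(modelFacingTools("code", false));
```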

Both modes are experimental. Prefer direct tool exposure for small OpenClaw tool catalogs, and prefer the Codex-native stable surfaces for Codex harness runs.

There is no separate source-selection config. When Tool Search is enabled, the catalog includes eligible OpenClaw, MCP, and client tools after normal policy filtering.

Why this exists

Large catalogs are useful but expensive. Sending every tool schema to the model makes the request larger, slows planning, and makes it more likely that the model selects a tool by mistake.

Tool Search changes the shape:

direct tools: the model sees every selected schema before the first token
Tool Search code mode: the model sees one compact code tool and a short API contract
Tool Search tools mode: the model sees three compact structured fallback tools
during the turn: the model loads only the tool schemas it actually needs

Direct tool exposure is still the right default for small catalogs. Tool Search is best when one run can see many tools, especially from MCP servers or client-provided app tools.

API

openclaw.tools.search(query, options?)

Searches the effective catalog for the current run. Results are compact and safe to put back into prompt context.

const hits = await openclaw.tools.search("calendar event", { limit: 5 });
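
This page does not specify how search ranks results or how `limit` interacts with the configured limits. As a minimal sketch under those assumptions, the example below scores descriptors by keyword overlap and clamps the requested limit. `searchCatalog`, the scoring rule, and the clamping rule are hypothetical; the limit values are borrowed from the config example on this page.

```javascript
// Hypothetical sketch: keyword search over compact descriptors with limit clamping.
function searchCatalog(
  descriptors,
  query,
  options = {},
  limits = { searchDefaultLimit: 8, maxSearchLimit: 20 },
) {
  const requested = options.limit ?? limits.searchDefaultLimit;
  const limit = Math.max(1, Math.min(requested, limits.maxSearchLimit));
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  return descriptors
    .map((d) => {
      const haystack = `${d.id} ${d.summary}`.toLowerCase();
      // Score = number of query terms that appear in the descriptor text.
      const score = terms.filter((t) => haystack.includes(t)).length;
      return { id: d.id, summary: d.summary, score };
    })
    .filter((hit) => hit.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

const descriptors = [
  { id: "mcp:calendar:create_event", summary: "Create a calendar event" },
  { id: "mcp:calendar:list_events", summary: "List calendar events" },
  { id: "openclaw:read", summary: "Read a file" },
];
const hits = searchCatalog(descriptors, "calendar event", { limit: 5 });
console.log(hits.map((h) => h.id));
```

Each hit carries only an id and a summary, which matches the idea that results are compact enough to put back into prompt context.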

openclaw.tools.describe(id)

Loads full metadata for one search result, including the exact input schema.

const calendarCreate = await openclaw.tools.describe("mcp:calendar:create_event");

openclaw.tools.call(id, args)

Calls a selected tool through OpenClaw.

await openclaw.tools.call(calendarCreate.id, {
  summary: "Planning",
  start: "2026-05-09T14:00:00Z",
});
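
Because describe returns the exact input schema, bridge code can check its arguments before calling. The sketch below assumes the described tool carries a JSON Schema under a field named `inputSchema`; that field name and the `missingRequired` helper are hypothetical, and the real descriptor shape may differ.

```javascript
// Hypothetical helper: list required properties that are absent from args.
function missingRequired(inputSchema, args) {
  const required = Array.isArray(inputSchema?.required) ? inputSchema.required : [];
  return required.filter((key) => !(key in args));
}

// Stand-in for what describe(...) might return; the real shape may differ.
const calendarCreate = {
  id: "mcp:calendar:create_event",
  inputSchema: {
    type: "object",
    properties: {
      summary: { type: "string" },
      start: { type: "string" },
      end: { type: "string" },
    },
    required: ["summary", "start"],
  },
};

const incomplete = missingRequired(calendarCreate.inputSchema, { summary: "Planning" });
const complete = missingRequired(calendarCreate.inputSchema, {
  summary: "Planning",
  start: "2026-05-09T14:00:00Z",
});
console.log({ incomplete, complete });
```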

The structured fallback mode exposes the same operations as tools:

tool_search
tool_describe
tool_call
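
One way to picture "the same operations as tools" is a thin adapter that routes each structured tool onto a shared catalog API. This is a sketch, not the actual implementation: `makeStructuredTools`, the stub `api`, and the argument names (`query`, `limit`, `id`, `args`) are assumptions.

```javascript
// Hypothetical sketch: route the three structured tools onto one shared catalog API.
function makeStructuredTools(api) {
  return {
    tool_search: ({ query, limit }) => api.search(query, { limit }),
    tool_describe: ({ id }) => api.describe(id),
    tool_call: ({ id, args }) => api.call(id, args ?? {}),
  };
}

// Stub API that records the order of operations for the example.
const calls = [];
const api = {
  search: (query, options) => {
    calls.push(["search", query, options]);
    return [{ id: "mcp:calendar:create_event" }];
  },
  describe: (id) => {
    calls.push(["describe", id]);
    return { id, inputSchema: {} };
  },
  call: (id, args) => {
    calls.push(["call", id, args]);
    return { ok: true };
  },
};

const tools = makeStructuredTools(api);
const [hit] = tools.tool_search({ query: "calendar event", limit: 5 });
const described = tools.tool_describe({ id: hit.id });
tools.tool_call({ id: described.id, args: { summary: "Planning" } });
console.log(calls.map((c) => c[0]));
```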

Runtime boundary

The code bridge runs in a short-lived Node subprocess. The subprocess starts with Node permission mode enabled, an empty environment, no filesystem or network grants, and no child-process or worker grants. OpenClaw enforces a parent-process wall-clock timeout and kills the subprocess on timeout, including after async continuations.
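
The sketch below shows the general technique with Node's own `child_process.spawnSync`: an empty environment plus a parent-enforced timeout that kills the child. It is a simplified stand-in, not OpenClaw's runner. `runIsolated` is a hypothetical name, the bridge transport is omitted, and the permission flag is passed in as a parameter because its name depends on the Node version (`--permission` in recent releases, `--experimental-permission` in older ones).

```javascript
const { spawnSync } = require("node:child_process");

// Hypothetical sketch: run a snippet in a short-lived Node child with an empty
// environment and a wall-clock timeout enforced by the parent process.
function runIsolated(source, { timeoutMs = 10000, nodeFlags = ["--permission"] } = {}) {
  const result = spawnSync(process.execPath, [...nodeFlags, "-e", source], {
    env: {}, // empty environment: nothing is inherited from the parent
    timeout: timeoutMs, // the parent kills the child when this elapses
    killSignal: "SIGKILL",
    encoding: "utf8",
  });
  const timedOut = result.error?.code === "ETIMEDOUT";
  return { timedOut, status: result.status, stdout: result.stdout, stderr: result.stderr };
}

// Example: a snippet that never exits is killed by the parent after 300 ms.
// nodeFlags is emptied here only so the example does not depend on the Node version.
const slow = runIsolated("setInterval(() => {}, 1000)", { timeoutMs: 300, nodeFlags: [] });
console.log(slow.timedOut);
```

A real bridge has to be asynchronous, since `openclaw.tools.call(...)` round-trips to the parent while the child is still running; the synchronous call here only keeps the timeout behavior easy to see.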

The runtime exposes only:

console.log, console.warn, and console.error
openclaw.tools.search
openclaw.tools.describe
openclaw.tools.call

Normal OpenClaw behavior still applies to final calls:

tool allow and deny policies
per-agent and per-sandbox tool restrictions
channel/runtime tool policy
approval hooks
plugin before_tool_call hooks
session identity, logs, and telemetry
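
To illustrate the idea that a bridged call passes the same gates as a direct call, here is a minimal sketch. The gate names mirror the list above, but `executeBridgedCall`, the gate interface, and the gate order are assumptions, and real hooks are likely asynchronous.

```javascript
// Hypothetical sketch: a bridged call passes the same gates as a direct call.
function executeBridgedCall(call, gates, runTool) {
  for (const gate of gates) {
    const verdict = gate.check(call);
    if (!verdict.ok) {
      // Report the block as the call result instead of bypassing it.
      return { ok: false, blockedBy: gate.name, reason: verdict.reason };
    }
  }
  return { ok: true, result: runTool(call) };
}

const gates = [
  {
    name: "policy",
    check: (call) =>
      call.toolId === "shell:exec" // assumed deny rule for the example
        ? { ok: false, reason: "denied by tool policy" }
        : { ok: true },
  },
  { name: "approval", check: () => ({ ok: true }) },
  { name: "before_tool_call", check: () => ({ ok: true }) },
];
const run = (call) => `ran ${call.toolId}`;

const allowed = executeBridgedCall({ toolId: "mcp:calendar:create_event", args: {} }, gates, run);
const blocked = executeBridgedCall({ toolId: "shell:exec", args: {} }, gates, run);
console.log(allowed, blocked);
```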

Config

Enable Tool Search for OpenClaw runs with the default code bridge:

openclaw config set tools.toolSearch true

Equivalent JSON:

{
  tools: {
    toolSearch: true,
  },
}

Use the structured fallback tools instead for OpenClaw runs:

{
  tools: {
    toolSearch: {
      mode: "tools",
    },
  },
}

Tune code-mode timeout and search result limits:

{
  tools: {
    toolSearch: {
      mode: "code",
      codeTimeoutMs: 10000,
      searchDefaultLimit: 8,
      maxSearchLimit: 20,
    },
  },
}

Disable it:

{
  tools: {
    toolSearch: false,
  },
}
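
Since `tools.toolSearch` accepts either a boolean or an object, a consumer could normalize it into one shape before use. The sketch below is an assumption-laden illustration: `normalizeToolSearch` is a hypothetical helper, and treating an unset value as disabled and an object value as enabled are guesses based on the examples above, not documented behavior.

```javascript
// Hypothetical sketch: normalize tools.toolSearch (boolean or object) into one shape.
function normalizeToolSearch(value) {
  // Assumption: an unset value behaves like false.
  if (value === undefined || value === false) return { enabled: false };
  if (value === true) return { enabled: true, mode: "code" };
  if (typeof value !== "object" || value === null) {
    throw new Error("tools.toolSearch must be a boolean or an object");
  }
  const { mode = "code", ...rest } = value;
  if (mode !== "code" && mode !== "tools") {
    throw new Error(`Unsupported tools.toolSearch.mode: ${mode}`);
  }
  // Assumption: supplying an object enables the feature.
  return { enabled: true, mode, ...rest };
}

console.log(normalizeToolSearch(true));
console.log(normalizeToolSearch({ mode: "tools" }));
console.log(normalizeToolSearch({ mode: "code", codeTimeoutMs: 10000 }));
console.log(normalizeToolSearch(false));
```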

Prompt and telemetry

Tool Search records enough telemetry to compare it with direct tool exposure:

total serialized tool and prompt bytes sent to the harness
catalog size and source breakdown
search, describe, and call counts
final tool calls executed through OpenClaw
selected tool ids and sources

Session logs should make it possible to answer:

how many tool schemas the model saw up front
how many search and describe operations it performed
which final tool was called
whether the result came from OpenClaw, MCP, or a client tool
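
As an illustration of answering those questions from a log, the sketch below folds a list of events into a summary. The event shapes (`type`, `schemaCount`, `toolId`, `source`) and the `summarizeToolSearch` helper are hypothetical; the actual session log format is not described on this page.

```javascript
// Hypothetical sketch: summarize bridge events from a session log.
function summarizeToolSearch(events) {
  const summary = { upfrontSchemas: 0, search: 0, describe: 0, call: 0, finalCalls: [] };
  for (const event of events) {
    if (event.type === "exposed") {
      summary.upfrontSchemas = event.schemaCount;
    } else if (event.type === "search" || event.type === "describe") {
      summary[event.type] += 1;
    } else if (event.type === "call") {
      summary.call += 1;
      summary.finalCalls.push({ id: event.toolId, source: event.source });
    }
  }
  return summary;
}

const summary = summarizeToolSearch([
  { type: "exposed", schemaCount: 1 },
  { type: "search" },
  { type: "search" },
  { type: "describe" },
  { type: "call", toolId: "mcp:calendar:create_event", source: "mcp" },
]);
console.log(summary);
```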

E2E validation

The gateway E2E runner proves both paths with the OpenClaw runtime:

node --import tsx scripts/tool-search-gateway-e2e.ts

It creates a temporary fake plugin with a large tool catalog, starts the mock OpenAI provider, starts a Gateway once in direct mode and once with Tool Search enabled, then compares provider request payloads and session logs.

The regression proves:

Direct mode can call the fake plugin tool.
Tool Search can call the same fake plugin tool.
Direct mode exposes the fake plugin tool schemas directly to the provider.
Tool Search exposes only the compact bridge.
The Tool Search request payload is smaller for the large fake catalog.
Session logs show the expected tool-call counts and bridged call telemetry.
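
The payload checks in that list can be sketched as a comparison of two captured provider requests. This is not the E2E script itself: `comparePayloads`, the `fake_plugin_` name prefix, and the request shape (`tools` array with `name` fields) are assumptions made for the example.

```javascript
// Hypothetical sketch: compare two captured provider request payloads.
function comparePayloads(directRequest, toolSearchRequest) {
  const names = (req) => req.tools.map((t) => t.name);
  const bytes = (req) => Buffer.byteLength(JSON.stringify(req), "utf8");
  return {
    directExposesPluginTools: names(directRequest).some((n) => n.startsWith("fake_plugin_")),
    toolSearchExposesOnlyBridge:
      toolSearchRequest.tools.length > 0 &&
      names(toolSearchRequest).every((n) => n === "tool_search_code"),
    toolSearchIsSmaller: bytes(toolSearchRequest) < bytes(directRequest),
  };
}

// Fake captures: 100 plugin tool schemas versus a single bridge tool.
const directRequest = {
  tools: Array.from({ length: 100 }, (_, i) => ({
    name: `fake_plugin_${i}`,
    parameters: { type: "object", properties: { value: { type: "string" } } },
  })),
};
const toolSearchRequest = {
  tools: [{ name: "tool_search_code", parameters: { type: "object", properties: {} } }],
};

const comparison = comparePayloads(directRequest, toolSearchRequest);
console.log(comparison);
```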

Failure behavior

Tool Search should fail closed:

if a tool is not in the effective policy, search should not return it
if a selected tool becomes unavailable, tool_call should fail
if policy or approval blocks execution, the call result should report that block instead of bypassing it
if the code bridge cannot create an isolated runtime, use mode: "tools" or disable Tool Search for that deployment
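
A minimal sketch of the fail-closed lookup, assuming the effective catalog is a map keyed by tool id: a tool that is missing from the catalog, whether filtered by policy or removed after it was selected, produces an error result and never executes. `callFromCatalog` and the result shape are hypothetical.

```javascript
// Hypothetical sketch: fail-closed lookup against the effective catalog.
function callFromCatalog(catalog, id, args, runTool) {
  const entry = catalog.get(id);
  if (!entry) {
    // Unknown, filtered, or since-removed tools never execute.
    return { ok: false, error: `Tool not available: ${id}` };
  }
  return { ok: true, result: runTool(entry, args) };
}

const effectiveCatalog = new Map([
  ["mcp:calendar:create_event", { id: "mcp:calendar:create_event" }],
]);
const runTool = (entry, args) => ({ tool: entry.id, args });

const first = callFromCatalog(effectiveCatalog, "mcp:calendar:create_event", { summary: "Planning" }, runTool);

// The tool becomes unavailable between selection and the call.
effectiveCatalog.delete("mcp:calendar:create_event");
const second = callFromCatalog(effectiveCatalog, "mcp:calendar:create_event", { summary: "Planning" }, runTool);
console.log(first.ok, second);
```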
