
---
summary: Firecrawl search, scrape, and web_fetch fallback
read_when:
  - You want Firecrawl-backed web extraction
  - You need a Firecrawl API key
  - You want Firecrawl as a web_search provider
  - You want anti-bot extraction for web_fetch
title: Firecrawl
---

# Firecrawl

OpenClaw can use Firecrawl in three ways:

- as the `web_search` provider
- as explicit plugin tools: `firecrawl_search` and `firecrawl_scrape`
- as a fallback extractor for `web_fetch`

Firecrawl is a hosted extraction/search service that supports bot circumvention and caching, which helps with JS-heavy sites or pages that block plain HTTP fetches.

## Get an API key

1. Create a Firecrawl account and generate an API key.
2. Store it in config or set `FIRECRAWL_API_KEY` in the gateway environment.
## Use Firecrawl as the `web_search` provider

```json5
{
  tools: {
    web: {
      search: {
        provider: "firecrawl",
      },
    },
  },
  plugins: {
    entries: {
      firecrawl: {
        enabled: true,
        config: {
          webSearch: {
            apiKey: "FIRECRAWL_API_KEY_HERE",
            baseUrl: "https://api.firecrawl.dev",
          },
        },
      },
    },
  },
}
```

Notes:

- Choosing Firecrawl in onboarding or `openclaw configure --section web` enables the bundled Firecrawl plugin automatically.
- `web_search` with Firecrawl supports `query` and `count`.
- For Firecrawl-specific controls like sources, categories, or result scraping, use `firecrawl_search`.

## Configure Firecrawl scrape + `web_fetch` fallback

```json5
{
  plugins: {
    entries: {
      firecrawl: {
        enabled: true,
      },
    },
  },
  tools: {
    web: {
      fetch: {
        firecrawl: {
          apiKey: "FIRECRAWL_API_KEY_HERE",
          baseUrl: "https://api.firecrawl.dev",
          onlyMainContent: true,
          maxAgeMs: 172800000,
          timeoutSeconds: 60,
        },
      },
    },
  },
}
```

Notes:

- `firecrawl.enabled` defaults to `true` unless explicitly set to `false`.
- Firecrawl fallback attempts run only when an API key is available (`tools.web.fetch.firecrawl.apiKey` or `FIRECRAWL_API_KEY`).
- `maxAgeMs` controls how old cached results can be, in milliseconds. The default is 2 days (172800000 ms).

`firecrawl_scrape` reuses the same `tools.web.fetch.firecrawl.*` settings and environment variables.

## Firecrawl plugin tools

### `firecrawl_search`

Use this when you want Firecrawl-specific search controls instead of the generic `web_search`.

Core parameters:

- `query`
- `count`
- `sources`
- `categories`
- `scrapeResults`
- `timeoutSeconds`
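As a rough illustration of how those parameters compose, here is a hypothetical helper that assembles a `firecrawl_search` tool call. The parameter names mirror the list above; the helper itself and its defaults are illustrative, not OpenClaw code:

```python
def build_search_args(query, count=5, sources=None, categories=None,
                      scrape_results=False, timeout_seconds=30):
    # Required fields always present; optional controls added only when set.
    args = {"query": query, "count": count, "timeoutSeconds": timeout_seconds}
    if sources:
        args["sources"] = sources
    if categories:
        args["categories"] = categories
    if scrape_results:
        args["scrapeResults"] = True
    return args
```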

### `firecrawl_scrape`

Use this for JS-heavy or bot-protected pages where plain `web_fetch` falls short.

Core parameters:

- `url`
- `extractMode`
- `maxChars`
- `onlyMainContent`
- `maxAgeMs`
- `proxy`
- `storeInCache`
- `timeoutSeconds`
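A similar hypothetical sketch for scrape arguments, with defaults echoing the `web_fetch` fallback config above. `build_scrape_args` is illustrative only and not a real API:

```python
def build_scrape_args(url, only_main_content=True, max_age_ms=172800000,
                      proxy="auto", store_in_cache=True, timeout_seconds=60,
                      extract_mode=None, max_chars=None):
    # Defaults mirror the fallback config shown earlier; extractMode and
    # maxChars are included only when explicitly requested.
    args = {
        "url": url,
        "onlyMainContent": only_main_content,
        "maxAgeMs": max_age_ms,
        "proxy": proxy,
        "storeInCache": store_in_cache,
        "timeoutSeconds": timeout_seconds,
    }
    if extract_mode is not None:
        args["extractMode"] = extract_mode
    if max_chars is not None:
        args["maxChars"] = max_chars
    return args
```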

## Stealth / bot circumvention

Firecrawl exposes a `proxy` parameter for bot circumvention (`basic`, `stealth`, or `auto`). OpenClaw always sends `proxy: "auto"` together with `storeInCache: true` on Firecrawl requests; if `proxy` is omitted, Firecrawl itself also defaults to `auto`. In `auto` mode, Firecrawl retries with stealth proxies when a basic attempt fails, which can consume more credits than basic-only scraping.
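The retry behavior of `auto` can be sketched with stub logic — nothing below is real OpenClaw or Firecrawl code, and `scrape` is a stand-in callable:

```python
def scrape_with_auto_proxy(scrape, url):
    # Sketch of "auto" proxy mode: try basic proxies first, then retry
    # with stealth proxies if the basic attempt fails (the second attempt
    # is what may cost extra credits).
    try:
        return scrape(url, proxy="basic")
    except Exception:
        return scrape(url, proxy="stealth")
```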

## How `web_fetch` uses Firecrawl

`web_fetch` extraction order:

1. Readability (local)
2. Firecrawl (if configured)
3. Basic HTML cleanup (last fallback)
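The ordering above can be sketched as a simple fallback chain, assuming each extractor raises on failure. `fetch_with_fallback` is a stub, not a real OpenClaw function:

```python
def fetch_with_fallback(url, extractors):
    # Try each (name, extractor) pair in order; fall through to the next
    # when one raises, and give up only after all have failed.
    errors = []
    for name, extract in extractors:
        try:
            return name, extract(url)
        except Exception as exc:
            errors.append((name, exc))
    raise RuntimeError(f"all extractors failed for {url}: {errors}")
```

With the order above, the chain would be roughly `[("readability", …), ("firecrawl", …), ("basic_cleanup", …)]`, where the Firecrawl entry is skipped when no API key is configured.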