From 46cb292c2a8f2aa7dd683ff370c060cbf0401ab5 Mon Sep 17 00:00:00 2001
From: Peter Steinberger
Date: Sat, 4 Apr 2026 20:21:16 +0100
Subject: [PATCH] docs: refresh Firecrawl and web_fetch config refs

---
 docs/gateway/configuration-reference.md |  4 ++++
 docs/help/faq.md                        | 25 ++++++++++++++-----------
 docs/reference/api-usage-costs.md       |  7 ++++---
 docs/tools/firecrawl.md                 | 10 ++++++++--
 docs/tools/web-fetch.md                 | 21 +++++++++++++++++++++
 docs/tools/web.md                       |  8 ++++++++
 6 files changed, 59 insertions(+), 16 deletions(-)

diff --git a/docs/gateway/configuration-reference.md b/docs/gateway/configuration-reference.md
index 4ee94c9f8c7..20b921331e4 100644
--- a/docs/gateway/configuration-reference.md
+++ b/docs/gateway/configuration-reference.md
@@ -2037,10 +2037,14 @@ Settings can be defined globally in `tools.loopDetection` and overridden per-age
       },
       fetch: {
         enabled: true,
+        provider: "firecrawl", // optional; omit for auto-detect
         maxChars: 50000,
         maxCharsCap: 50000,
+        maxResponseBytes: 2000000,
         timeoutSeconds: 30,
         cacheTtlMinutes: 15,
+        maxRedirects: 3,
+        readability: true,
         userAgent: "custom-ua",
       },
     },
diff --git a/docs/help/faq.md b/docs/help/faq.md
index 339b96ae605..336c352f5c0 100644
--- a/docs/help/faq.md
+++ b/docs/help/faq.md
@@ -1549,29 +1549,32 @@ for usage/billing and raise limits as needed.
             },
           },
         },
-    },
-    tools: {
-      web: {
-        search: {
-          enabled: true,
-          provider: "brave",
-          maxResults: 5,
-        },
-        fetch: {
-          enabled: true,
+      },
+      tools: {
+        web: {
+          search: {
+            enabled: true,
+            provider: "brave",
+            maxResults: 5,
+          },
+          fetch: {
+            enabled: true,
+            provider: "firecrawl", // optional; omit for auto-detect
+          },
         },
       },
-    },
     }
     ```

     Provider-specific web-search config now lives under `plugins.entries..config.webSearch.*`.
     Legacy `tools.web.search.*` provider paths still load temporarily for compatibility, but they should not be used for new configs.
+    Firecrawl web-fetch fallback config lives under `plugins.entries.firecrawl.config.webFetch.*`.

     Notes:

     - If you use allowlists, add `web_search`/`web_fetch`/`x_search` or `group:web`.
     - `web_fetch` is enabled by default (unless explicitly disabled).
+    - If `tools.web.fetch.provider` is omitted, OpenClaw auto-detects the first ready fetch fallback provider from available credentials. Today the bundled provider is Firecrawl.
     - Daemons read env vars from `~/.openclaw/.env` (or the service environment).

     Docs: [Web tools](/tools/web).
diff --git a/docs/reference/api-usage-costs.md b/docs/reference/api-usage-costs.md
index a77f89760e1..e2a16412186 100644
--- a/docs/reference/api-usage-costs.md
+++ b/docs/reference/api-usage-costs.md
@@ -61,8 +61,9 @@ OpenClaw can pick up credentials from:

 - **Auth profiles** (per-agent, stored in `auth-profiles.json`).
 - **Environment variables** (e.g. `OPENAI_API_KEY`, `BRAVE_API_KEY`, `FIRECRAWL_API_KEY`).
-- **Config** (`models.providers.*.apiKey`, `tools.web.search.*`, `tools.web.fetch.firecrawl.*`,
-  `memorySearch.*`, `talk.providers.*.apiKey`).
+- **Config** (`models.providers.*.apiKey`, `plugins.entries.*.config.webSearch.apiKey`,
+  `plugins.entries.firecrawl.config.webFetch.apiKey`, `memorySearch.*`,
+  `talk.providers.*.apiKey`).
 - **Skills** (`skills.entries..apiKey`) which may export keys to the skill process env.

 ## Features that can spend keys
@@ -149,7 +150,7 @@ See [Web tools](/tools/web).

 `web_fetch` can call **Firecrawl** when an API key is present:

-- `FIRECRAWL_API_KEY` or `tools.web.fetch.firecrawl.apiKey`
+- `FIRECRAWL_API_KEY` or `plugins.entries.firecrawl.config.webFetch.apiKey`

 If Firecrawl isn’t configured, the tool falls back to direct fetch + readability (no paid API).

diff --git a/docs/tools/firecrawl.md b/docs/tools/firecrawl.md
index aebbe25edb7..040600c1f70 100644
--- a/docs/tools/firecrawl.md
+++ b/docs/tools/firecrawl.md
@@ -56,6 +56,8 @@ Notes:
 - Choosing Firecrawl in onboarding or `openclaw configure --section web` enables the bundled Firecrawl plugin automatically.
 - `web_search` with Firecrawl supports `query` and `count`.
 - For Firecrawl-specific controls like `sources`, `categories`, or result scraping, use `firecrawl_search`.
+- `baseUrl` overrides must stay on `https://api.firecrawl.dev`.
+- `FIRECRAWL_BASE_URL` is the shared env fallback for Firecrawl search and scrape base URLs.

 ## Configure Firecrawl scrape + web_fetch fallback

@@ -82,10 +84,10 @@ Notes:

 Notes:

-- `firecrawl.enabled` defaults to `true` unless explicitly set to `false`.
 - Firecrawl fallback attempts run only when an API key is available (`plugins.entries.firecrawl.config.webFetch.apiKey` or `FIRECRAWL_API_KEY`).
 - `maxAgeMs` controls how old cached results can be (ms). Default is 2 days.
 - Legacy `tools.web.fetch.firecrawl.*` config is auto-migrated by `openclaw doctor --fix`.
+- Firecrawl scrape/base URL overrides are restricted to `https://api.firecrawl.dev`.

 `firecrawl_scrape` reuses the same `plugins.entries.firecrawl.config.webFetch.*` settings and env vars.

@@ -131,9 +133,13 @@ than basic-only scraping.
 `web_fetch` extraction order:

 1. Readability (local)
-2. Firecrawl (if configured)
+2. Firecrawl (if selected or auto-detected as the active web-fetch fallback)
 3. Basic HTML cleanup (last fallback)

+The selection knob is `tools.web.fetch.provider`. If you omit it, OpenClaw
+auto-detects the first ready web-fetch provider from available credentials.
+Today the bundled provider is Firecrawl.
+
 ## Related

 - [Web Search overview](/tools/web) -- all providers and auto-detection
diff --git a/docs/tools/web-fetch.md b/docs/tools/web-fetch.md
index 012880f5409..abb4a943e41 100644
--- a/docs/tools/web-fetch.md
+++ b/docs/tools/web-fetch.md
@@ -61,6 +61,7 @@ await web_fetch({ url: "https://example.com/article" });
     web: {
       fetch: {
         enabled: true, // default: true
+        provider: "firecrawl", // optional; omit for auto-detect
         maxChars: 50000, // max output chars
         maxCharsCap: 50000, // hard cap for maxChars param
         maxResponseBytes: 2000000, // max download size before truncation
@@ -82,6 +83,13 @@ If Readability extraction fails, `web_fetch` can fall back to

 ```json5
 {
+  tools: {
+    web: {
+      fetch: {
+        provider: "firecrawl", // optional; omit for auto-detect from available credentials
+      },
+    },
+  },
   plugins: {
     entries: {
       firecrawl: {
@@ -109,6 +117,19 @@ Legacy `tools.web.fetch.firecrawl.*` config is auto-migrated by `openclaw doctor
   `FIRECRAWL_API_KEY` env fallback, gateway startup fails fast.


+
+  Firecrawl `baseUrl` overrides are locked down: they must use `https://` and
+  the official Firecrawl host (`api.firecrawl.dev`).
+
+
+Current runtime behavior:
+
+- `tools.web.fetch.provider` selects the fetch fallback provider explicitly.
+- If `provider` is omitted, OpenClaw auto-detects the first ready web-fetch
+  provider from available credentials. Today the bundled provider is Firecrawl.
+- If Readability is disabled, `web_fetch` skips straight to the selected
+  provider fallback. If no provider is available, it fails closed.
+
 ## Limits and safety

 - `maxChars` is clamped to `tools.web.fetch.maxCharsCap`
diff --git a/docs/tools/web.md b/docs/tools/web.md
index b1bc2a6c6b9..36c8f88bf59 100644
--- a/docs/tools/web.md
+++ b/docs/tools/web.md
@@ -208,6 +208,14 @@ Provider-specific config (API keys, base URLs, modes) lives under
 `plugins.entries..config.webSearch.*`. See the provider pages for
 examples.

+`web_fetch` fallback provider selection is separate:
+
+- choose it with `tools.web.fetch.provider`
+- or omit that field and let OpenClaw auto-detect the first ready web-fetch
+  provider from available credentials
+- today the bundled web-fetch provider is Firecrawl, configured under
+  `plugins.entries.firecrawl.config.webFetch.*`
+
 When you choose **Kimi** during `openclaw onboard` or
 `openclaw configure --section web`, OpenClaw can also ask for: