docs: refresh Firecrawl and web_fetch config refs

Peter Steinberger
2026-04-04 20:21:16 +01:00
parent 33f8ca6cb0
commit 46cb292c2a
6 changed files with 59 additions and 16 deletions

@@ -2037,10 +2037,14 @@ Settings can be defined globally in `tools.loopDetection` and overridden per-age
},
fetch: {
enabled: true,
provider: "firecrawl", // optional; omit for auto-detect
maxChars: 50000,
maxCharsCap: 50000,
maxResponseBytes: 2000000,
timeoutSeconds: 30,
cacheTtlMinutes: 15,
maxRedirects: 3,
readability: true,
userAgent: "custom-ua",
},
},

@@ -1549,29 +1549,32 @@ for usage/billing and raise limits as needed.
},
},
},
},
tools: {
web: {
search: {
enabled: true,
provider: "brave",
maxResults: 5,
},
fetch: {
enabled: true,
},
tools: {
web: {
search: {
enabled: true,
provider: "brave",
maxResults: 5,
},
fetch: {
enabled: true,
provider: "firecrawl", // optional; omit for auto-detect
},
},
},
},
}
```
Provider-specific web-search config now lives under `plugins.entries.<plugin>.config.webSearch.*`.
Legacy `tools.web.search.*` provider paths still load temporarily for compatibility, but they should not be used for new configs.
Firecrawl web-fetch fallback config lives under `plugins.entries.firecrawl.config.webFetch.*`.
Notes:
- If you use allowlists, add `web_search`/`web_fetch`/`x_search` or `group:web`.
- `web_fetch` is enabled by default (unless explicitly disabled).
- If `tools.web.fetch.provider` is omitted, OpenClaw auto-detects the first ready fetch fallback provider from available credentials. Today the bundled provider is Firecrawl.
- Daemons read env vars from `~/.openclaw/.env` (or the service environment).
Docs: [Web tools](/tools/web).
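
The auto-detect note above can be read as "explicit setting first, otherwise the first provider that reports ready". A minimal TypeScript sketch of that rule follows. The `FetchProvider` interface, `resolveFetchProvider`, and the readiness check are hypothetical names for illustration, not OpenClaw's internal API, and readiness is simplified here to "the env var is set".

```typescript
// Hypothetical shape for a web-fetch fallback provider (illustration only).
interface FetchProvider {
  id: string;
  // True when the provider has the credentials it needs.
  isReady(env: Record<string, string | undefined>): boolean;
}

function resolveFetchProvider(
  configured: string | undefined,
  providers: FetchProvider[],
  env: Record<string, string | undefined>,
): string | undefined {
  // An explicit `tools.web.fetch.provider` value wins.
  if (configured !== undefined) return configured;
  // Otherwise pick the first provider that reports ready.
  return providers.find((p) => p.isReady(env))?.id;
}

// Assumption: readiness is reduced to "FIRECRAWL_API_KEY is present".
const firecrawl: FetchProvider = {
  id: "firecrawl",
  isReady: (env) => Boolean(env["FIRECRAWL_API_KEY"]),
};
```

With only one bundled provider the list has a single entry, so omitting `provider` and setting it to `"firecrawl"` resolve to the same result whenever a key is available.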

@@ -61,8 +61,9 @@ OpenClaw can pick up credentials from:
- **Auth profiles** (per-agent, stored in `auth-profiles.json`).
- **Environment variables** (e.g. `OPENAI_API_KEY`, `BRAVE_API_KEY`, `FIRECRAWL_API_KEY`).
- **Config** (`models.providers.*.apiKey`, `tools.web.search.*`, `tools.web.fetch.firecrawl.*`,
`memorySearch.*`, `talk.providers.*.apiKey`).
- **Config** (`models.providers.*.apiKey`, `plugins.entries.*.config.webSearch.apiKey`,
`plugins.entries.firecrawl.config.webFetch.apiKey`, `memorySearch.*`,
`talk.providers.*.apiKey`).
- **Skills** (`skills.entries.<name>.apiKey`) which may export keys to the skill process env.
## Features that can spend keys
@@ -149,7 +150,7 @@ See [Web tools](/tools/web).
`web_fetch` can call **Firecrawl** when an API key is present:
- `FIRECRAWL_API_KEY` or `tools.web.fetch.firecrawl.apiKey`
- `FIRECRAWL_API_KEY` or `plugins.entries.firecrawl.config.webFetch.apiKey`
If Firecrawl isn't configured, the tool falls back to direct fetch + readability (no paid API).
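
As a rough illustration of the two key sources listed above, the sketch below looks up the config path first and then the environment variable. The `ConfigSketch` type and `resolveFirecrawlApiKey` are hypothetical, and the precedence (config before env) is an assumption based on the env var being described elsewhere as a fallback.

```typescript
// Hypothetical, partial view of the config tree (only the path used here).
interface ConfigSketch {
  plugins?: {
    entries?: {
      firecrawl?: { config?: { webFetch?: { apiKey?: string } } };
    };
  };
}

function resolveFirecrawlApiKey(
  config: ConfigSketch,
  env: Record<string, string | undefined>,
): string | undefined {
  const fromConfig = config.plugins?.entries?.firecrawl?.config?.webFetch?.apiKey;
  // Assumption: the config value takes precedence; the env var is the fallback.
  return fromConfig || env["FIRECRAWL_API_KEY"] || undefined;
}
```

When this returns `undefined`, the behavior described above applies: direct fetch plus readability, with no paid API call.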

@@ -56,6 +56,8 @@ Notes:
- Choosing Firecrawl in onboarding or `openclaw configure --section web` enables the bundled Firecrawl plugin automatically.
- `web_search` with Firecrawl supports `query` and `count`.
- For Firecrawl-specific controls like `sources`, `categories`, or result scraping, use `firecrawl_search`.
- `baseUrl` overrides must stay on `https://api.firecrawl.dev`.
- `FIRECRAWL_BASE_URL` is the shared env fallback for Firecrawl search and scrape base URLs.
## Configure Firecrawl scrape + web_fetch fallback
@@ -82,10 +84,10 @@ Notes:
Notes:
- `firecrawl.enabled` defaults to `true` unless explicitly set to `false`.
- Firecrawl fallback attempts run only when an API key is available (`plugins.entries.firecrawl.config.webFetch.apiKey` or `FIRECRAWL_API_KEY`).
- `maxAgeMs` controls how old cached results can be (ms). Default is 2 days.
- Legacy `tools.web.fetch.firecrawl.*` config is auto-migrated by `openclaw doctor --fix`.
- Firecrawl scrape/base URL overrides are restricted to `https://api.firecrawl.dev`.
`firecrawl_scrape` reuses the same `plugins.entries.firecrawl.config.webFetch.*` settings and env vars.
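
The base URL restriction in the notes above amounts to a scheme and host check. A minimal sketch, assuming only those two properties are validated (`isAllowedFirecrawlBaseUrl` is a hypothetical helper, not the actual OpenClaw validator, and it does not inspect ports or paths):

```typescript
function isAllowedFirecrawlBaseUrl(baseUrl: string): boolean {
  let parsed: URL;
  try {
    parsed = new URL(baseUrl);
  } catch {
    // Not a parseable absolute URL.
    return false;
  }
  // Require https and the official Firecrawl host.
  return parsed.protocol === "https:" && parsed.hostname === "api.firecrawl.dev";
}
```

Comparing the parsed `hostname` rather than using a string prefix check matters: a prefix test would accept look-alike hosts such as `api.firecrawl.dev.example.com`.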
@@ -131,9 +133,13 @@ than basic-only scraping.
`web_fetch` extraction order:
1. Readability (local)
2. Firecrawl (if configured)
2. Firecrawl (if selected or auto-detected as the active web-fetch fallback)
3. Basic HTML cleanup (last fallback)
The selection knob is `tools.web.fetch.provider`. If you omit it, OpenClaw
auto-detects the first ready web-fetch provider from available credentials.
Today the bundled provider is Firecrawl.
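
The extraction order above is a first-success chain. The sketch below models it with synchronous stub steps so it stays self-contained; the real extractors would be asynchronous, and `ExtractStep` and `runExtraction` are hypothetical names used only for illustration.

```typescript
// Each step returns extracted text, or null when it cannot produce any.
type ExtractStep = { name: string; run: () => string | null };

function runExtraction(steps: ExtractStep[]): { via: string; text: string } | null {
  for (const step of steps) {
    const text = step.run();
    // Stop at the first step that yields non-empty text.
    if (text !== null && text.trim() !== "") return { via: step.name, text };
  }
  return null;
}

// Stub steps standing in for the real extractors, in the documented order.
const steps: ExtractStep[] = [
  { name: "readability", run: () => null }, // pretend local extraction failed
  { name: "firecrawl", run: () => "# Article text" }, // pretend the fallback succeeded
  { name: "basic-html-cleanup", run: () => "Article text" },
];
const result = runExtraction(steps);
```

In this stubbed run the Firecrawl step supplies the result because the Readability stub returns `null`; the basic cleanup step is never reached.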
## Related
- [Web Search overview](/tools/web) -- all providers and auto-detection

@@ -61,6 +61,7 @@ await web_fetch({ url: "https://example.com/article" });
web: {
fetch: {
enabled: true, // default: true
provider: "firecrawl", // optional; omit for auto-detect
maxChars: 50000, // max output chars
maxCharsCap: 50000, // hard cap for maxChars param
maxResponseBytes: 2000000, // max download size before truncation
@@ -82,6 +83,13 @@ If Readability extraction fails, `web_fetch` can fall back to
```json5
{
tools: {
web: {
fetch: {
provider: "firecrawl", // optional; omit for auto-detect from available credentials
},
},
},
plugins: {
entries: {
firecrawl: {
@@ -109,6 +117,19 @@ Legacy `tools.web.fetch.firecrawl.*` config is auto-migrated by `openclaw doctor
`FIRECRAWL_API_KEY` env fallback, gateway startup fails fast.
</Note>
<Note>
Firecrawl `baseUrl` overrides are locked down: they must use `https://` and
the official Firecrawl host (`api.firecrawl.dev`).
</Note>
Current runtime behavior:
- `tools.web.fetch.provider` selects the fetch fallback provider explicitly.
- If `provider` is omitted, OpenClaw auto-detects the first ready web-fetch
provider from available credentials. Today the bundled provider is Firecrawl.
- If Readability is disabled, `web_fetch` skips straight to the selected
provider fallback. If no provider is available, it fails closed.
## Limits and safety
- `maxChars` is clamped to `tools.web.fetch.maxCharsCap`
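
A minimal sketch of the clamp described above, assuming a per-call `maxChars` falls back to the configured default when omitted (`clampMaxChars` is a hypothetical helper name):

```typescript
function clampMaxChars(
  requested: number | undefined,
  configuredDefault: number,
  cap: number,
): number {
  // Use the per-call value when given, otherwise the configured default.
  const wanted = requested ?? configuredDefault;
  // Never exceed `tools.web.fetch.maxCharsCap`.
  return Math.min(wanted, cap);
}
```

With the values from the config example (`maxChars: 50000`, `maxCharsCap: 50000`), a call that requests 200000 characters is limited to 50000.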

@@ -208,6 +208,14 @@ Provider-specific config (API keys, base URLs, modes) lives under
`plugins.entries.<plugin>.config.webSearch.*`. See the provider pages for
examples.
`web_fetch` fallback provider selection is separate:
- choose it with `tools.web.fetch.provider`
- or omit that field and let OpenClaw auto-detect the first ready web-fetch
provider from available credentials
- today the bundled web-fetch provider is Firecrawl, configured under
`plugins.entries.firecrawl.config.webFetch.*`
When you choose **Kimi** during `openclaw onboard` or
`openclaw configure --section web`, OpenClaw can also ask for: