fix(web-search): support self-hosted Firecrawl

Peter Steinberger
2026-05-02 06:34:14 +01:00
parent de0d484236
commit b66459e3c2
6 changed files with 158 additions and 26 deletions

@@ -54,7 +54,7 @@ Notes:
- Choosing Firecrawl in onboarding or `openclaw configure --section web` enables the bundled Firecrawl plugin automatically.
- `web_search` with Firecrawl supports `query` and `count`.
- For Firecrawl-specific controls like `sources`, `categories`, or result scraping, use `firecrawl_search`.
-- `baseUrl` overrides must stay on `https://api.firecrawl.dev`.
+- `baseUrl` defaults to hosted Firecrawl at `https://api.firecrawl.dev`. Self-hosted overrides are allowed only for private/internal endpoints; HTTP is accepted only for those private targets.
- `FIRECRAWL_BASE_URL` is the shared env fallback for Firecrawl search and scrape base URLs.
## Configure Firecrawl scrape + web_fetch fallback
@@ -85,10 +85,19 @@ Notes:
- Firecrawl fallback attempts run only when an API key is available (`plugins.entries.firecrawl.config.webFetch.apiKey` or `FIRECRAWL_API_KEY`).
- `maxAgeMs` controls how old cached results can be (ms). Default is 2 days.
- Legacy `tools.web.fetch.firecrawl.*` config is auto-migrated by `openclaw doctor --fix`.
-- Firecrawl scrape/base URL overrides are restricted to `https://api.firecrawl.dev`.
+- Firecrawl scrape/base URL overrides follow the same hosted/private rule as search: public hosted traffic uses `https://api.firecrawl.dev`; self-hosted overrides must resolve to private/internal endpoints.
`firecrawl_scrape` reuses the same `plugins.entries.firecrawl.config.webFetch.*` settings and env vars.
+### Self-hosted Firecrawl
+Set `plugins.entries.firecrawl.config.webSearch.baseUrl`,
+`plugins.entries.firecrawl.config.webFetch.baseUrl`, or `FIRECRAWL_BASE_URL`
+when you run Firecrawl yourself. OpenClaw accepts `http://` only for loopback,
+private-network, `.local`, `.internal`, or `.localhost` targets. Public custom
+hosts are rejected so Firecrawl API keys are not sent to arbitrary endpoints by
+accident.
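
The self-hosted rule above can be illustrated with a minimal TypeScript sketch. It is not OpenClaw's actual implementation. The names `resolveBaseUrl`, `isPrivateHost`, and `validateFirecrawlBaseUrl` are hypothetical, as is the example port. The sketch checks the URL string only, whereas the real check may also resolve DNS. It covers loopback and RFC 1918 IPv4 ranges, and `::1` is the only IPv6 address it accepts.

```typescript
// Hypothetical sketch of the hosted/private base URL rule; not OpenClaw's code.
const HOSTED_FIRECRAWL_ORIGIN = "https://api.firecrawl.dev";

// Assumed precedence: explicit config value, then the FIRECRAWL_BASE_URL
// env fallback, then the hosted default.
function resolveBaseUrl(configured?: string, envValue?: string): string {
  return configured?.trim() || envValue?.trim() || HOSTED_FIRECRAWL_ORIGIN;
}

// String-level check for loopback, private-network, and internal-style names.
function isPrivateHost(hostname: string): boolean {
  // URL.hostname keeps brackets around IPv6 literals; strip them first.
  const host = hostname.toLowerCase().replace(/^\[|\]$/g, "");
  if (host === "localhost" || host === "::1") return true;
  if (/\.(local|internal|localhost)$/.test(host)) return true;

  const m = host.match(/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 127 || // loopback
    a === 10 || // 10.0.0.0/8
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) // 192.168.0.0/16
  );
}

// Returns the parsed URL when it is allowed; throws otherwise.
function validateFirecrawlBaseUrl(raw: string): URL {
  const url = new URL(raw); // throws TypeError on malformed input
  // Hosted traffic must be exactly https://api.firecrawl.dev.
  if (url.origin === HOSTED_FIRECRAWL_ORIGIN) return url;
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    throw new Error(`unsupported protocol: ${url.protocol}`);
  }
  // Any other host must be private, so API keys never reach a public custom host.
  if (!isPrivateHost(url.hostname)) {
    throw new Error(`public custom host rejected: ${url.hostname}`);
  }
  return url;
}
```

Under these assumptions, `validateFirecrawlBaseUrl(resolveBaseUrl(undefined, "http://localhost:3002"))` is accepted, while `https://example.com` and plain `http://api.firecrawl.dev` both throw.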
## Firecrawl plugin tools
### `firecrawl_search`

@@ -126,8 +126,9 @@ Legacy `tools.web.fetch.firecrawl.*` config is auto-migrated by `openclaw doctor
</Note>
<Note>
-Firecrawl `baseUrl` overrides are locked down: they must use `https://` and
-the official Firecrawl host (`api.firecrawl.dev`).
+Firecrawl `baseUrl` overrides are locked down: hosted traffic uses
+`https://api.firecrawl.dev`; self-hosted overrides must target private or
+internal endpoints, and `http://` is accepted only for those private targets.
</Note>
Current runtime behavior: