---
summary: "web_fetch tool -- HTTP fetch with readable content extraction"
read_when:
  - You want to fetch a URL and extract readable content
  - You need to configure web_fetch or its Firecrawl fallback
  - You want to understand web_fetch limits and caching
title: "Web Fetch"
sidebarTitle: "Web Fetch"
---

# Web Fetch

The `web_fetch` tool performs a plain HTTP GET and extracts readable content (HTML to markdown or text). It does **not** execute JavaScript. For JS-heavy sites or login-protected pages, use the [Web Browser](/tools/browser) instead.

## Quick start

`web_fetch` is **enabled by default** -- no configuration needed. The agent can call it immediately:

```javascript
await web_fetch({ url: "https://example.com/article" });
```

## Tool parameters

| Parameter     | Type     | Description                              |
| ------------- | -------- | ---------------------------------------- |
| `url`         | `string` | URL to fetch (required, http/https only) |
| `extractMode` | `string` | `"markdown"` (default) or `"text"`       |
| `maxChars`    | `number` | Truncate output to this many chars       |

## How it works

1. Sends an HTTP GET with a Chrome-like User-Agent and `Accept-Language` header.
2. Blocks private/internal hostnames and re-checks every redirect hop.
3. Runs Readability (main-content extraction) on the HTML response.
4. If Readability fails and Firecrawl is configured, retries through the Firecrawl API with bot-circumvention mode.

Results are cached for 15 minutes (configurable) to reduce repeated fetches of the same URL.
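The caching step above can be sketched as a small TTL map. The names here (`FetchCache`, the injectable clock) are illustrative only, not the gateway's actual internals:

```javascript
// Illustrative sketch of the 15-minute result cache; names are hypothetical.
class FetchCache {
  constructor(ttlMinutes = 15, now = Date.now) {
    this.ttlMs = ttlMinutes * 60_000;
    this.now = now; // injectable clock, which makes the TTL testable
    this.entries = new Map(); // url -> { savedAt, result }
  }
  get(url) {
    const hit = this.entries.get(url);
    if (!hit) return undefined;
    if (this.now() - hit.savedAt > this.ttlMs) {
      this.entries.delete(url); // entry older than the TTL: drop it
      return undefined;
    }
    return hit.result;
  }
  set(url, result) {
    this.entries.set(url, { savedAt: this.now(), result });
  }
}
```

A fetch would consult this cache first; only a miss (or an expired entry) triggers a new HTTP request.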
## Config

```json5
{
  tools: {
    web: {
      fetch: {
        enabled: true, // default: true
        maxChars: 50000, // max output chars
        maxCharsCap: 50000, // hard cap for maxChars param
        maxResponseBytes: 2000000, // max download size before truncation
        timeoutSeconds: 30,
        cacheTtlMinutes: 15,
        maxRedirects: 3,
        readability: true, // use Readability extraction
        userAgent: "Mozilla/5.0 ...", // override User-Agent
      },
    },
  },
}
```

## Firecrawl fallback

If Readability extraction fails, `web_fetch` can fall back to [Firecrawl](/tools/firecrawl) for bot circumvention and better extraction:

```json5
{
  tools: {
    web: {
      fetch: {
        firecrawl: {
          enabled: true,
          apiKey: "fc-...", // optional if FIRECRAWL_API_KEY is set
          baseUrl: "https://api.firecrawl.dev",
          onlyMainContent: true,
          maxAgeMs: 86400000, // cache duration (1 day)
          timeoutSeconds: 60,
        },
      },
    },
  },
}
```

`tools.web.fetch.firecrawl.apiKey` supports SecretRef objects. If Firecrawl is enabled and its SecretRef is unresolved with no `FIRECRAWL_API_KEY` env fallback, gateway startup fails fast.

## Limits and safety

- `maxChars` is clamped to `tools.web.fetch.maxCharsCap`.
- The response body is capped at `maxResponseBytes` before parsing; oversized responses are truncated with a warning.
- Private/internal hostnames are blocked.
- Redirects are checked and limited by `maxRedirects`.
- `web_fetch` is best-effort -- some sites need the [Web Browser](/tools/browser).

## Tool profiles

If you use tool profiles or allowlists, add `web_fetch` or `group:web`:

```json5
{
  tools: {
    allow: ["web_fetch"],
    // or: allow: ["group:web"] (includes both web_fetch and web_search)
  },
}
```

## Related

- [Web Search](/tools/web) -- search the web with multiple providers
- [Web Browser](/tools/browser) -- full browser automation for JS-heavy sites
- [Firecrawl](/tools/firecrawl) -- Firecrawl search and scrape tools
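The private-hostname guard mentioned under "Limits and safety" can be sketched as follows. This is a simplified, hypothetical check over hostname strings; the real guard also resolves DNS and re-validates the target of each redirect hop:

```javascript
// Hypothetical sketch of the private/internal hostname block; not the
// actual implementation, which also checks resolved IPs and redirects.
function isBlockedHostname(hostname) {
  const h = hostname.toLowerCase();
  // Obvious internal names
  if (h === "localhost" || h.endsWith(".local") || h.endsWith(".internal")) {
    return true;
  }
  // Loopback, RFC 1918 private ranges, and link-local IPv4 literals
  if (/^127\./.test(h) || /^10\./.test(h) || /^192\.168\./.test(h)) return true;
  if (/^172\.(1[6-9]|2\d|3[01])\./.test(h)) return true; // 172.16.0.0/12
  if (/^169\.254\./.test(h)) return true;
  // IPv6 loopback and the unspecified address
  if (h === "::1" || h === "0.0.0.0") return true;
  return false;
}
```

A request whose URL (or any redirect target) fails this check would be rejected before any bytes are fetched.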