From e7076617f994319d97f8ed915495377a61666b22 Mon Sep 17 00:00:00 2001
From: Vincent Koc
Date: Sun, 12 Apr 2026 11:20:43 +0100
Subject: [PATCH] docs(providers): improve sglang, fal, groq, bedrock-mantle,
 vllm with Mintlify components

---
 docs/providers/bedrock-mantle.md | 169 ++++++++++++++++++++---------
 docs/providers/fal.md            | 157 ++++++++++++++++-----------
 docs/providers/groq.md           | 133 ++++++++++++++---------
 docs/providers/sglang.md         | 113 ++++++++++++-------
 docs/providers/vllm.md           | 179 ++++++++++++++++++++++++-------
 5 files changed, 512 insertions(+), 239 deletions(-)

diff --git a/docs/providers/bedrock-mantle.md b/docs/providers/bedrock-mantle.md
index f48bdc9026c..45d767411a1 100644
--- a/docs/providers/bedrock-mantle.md
+++ b/docs/providers/bedrock-mantle.md
@@ -13,55 +13,95 @@
 the Mantle OpenAI-compatible endpoint. Mantle hosts open-source and
 third-party models (GPT-OSS, Qwen, Kimi, GLM, and similar) through a standard
 `/v1/chat/completions` surface backed by Bedrock infrastructure.
 
-## What OpenClaw supports
+| Property       | Value                                                                               |
+| -------------- | ----------------------------------------------------------------------------------- |
+| Provider ID    | `amazon-bedrock-mantle`                                                             |
+| API            | `openai-completions` (OpenAI-compatible)                                            |
+| Auth           | Explicit `AWS_BEARER_TOKEN_BEDROCK` or IAM credential-chain bearer-token generation |
+| Default region | `us-east-1` (override with `AWS_REGION` or `AWS_DEFAULT_REGION`)                    |
 
-- Provider: `amazon-bedrock-mantle`
-- API: `openai-completions` (OpenAI-compatible)
-- Auth: explicit `AWS_BEARER_TOKEN_BEDROCK` or IAM credential-chain bearer-token generation
-- Region: `AWS_REGION` or `AWS_DEFAULT_REGION` (default: `us-east-1`)
+## Getting started
+
+Choose your preferred auth method and follow the setup steps.
+
+<Tabs>
+  <Tab title="Explicit bearer token">
+    **Best for:** environments where you already have a Mantle bearer token.
+
+    <Steps>
+      <Step title="Set the token">
+        ```bash
+        export AWS_BEARER_TOKEN_BEDROCK="..."
+        ```
+
+        Optionally set a region (defaults to `us-east-1`):
+
+        ```bash
+        export AWS_REGION="us-west-2"
+        ```
+      </Step>
+      <Step title="Verify models are discovered">
+        ```bash
+        openclaw models list
+        ```
+
+        Discovered models appear under the `amazon-bedrock-mantle` provider. No
+        additional config is required unless you want to override defaults.
+      </Step>
+    </Steps>
+  </Tab>
+  <Tab title="IAM credentials">
+    **Best for:** using AWS SDK-compatible credentials (shared config, SSO, web identity, instance or task roles).
+
+    <Steps>
+      <Step title="Configure AWS credentials">
+        Any AWS SDK-compatible auth source works:
+
+        ```bash
+        export AWS_PROFILE="default"
+        export AWS_REGION="us-west-2"
+        ```
+      </Step>
+      <Step title="Verify models are discovered">
+        ```bash
+        openclaw models list
+        ```
+
+        OpenClaw generates a Mantle bearer token from the credential chain automatically.
+      </Step>
+    </Steps>
+
+    <Note>
+      When `AWS_BEARER_TOKEN_BEDROCK` is not set, OpenClaw mints the bearer
+      token for you from the AWS default credential chain, including shared
+      credentials/config profiles, SSO, web identity, and instance or task
+      roles.
+    </Note>
+  </Tab>
+</Tabs>
 
 ## Automatic model discovery
 
 When `AWS_BEARER_TOKEN_BEDROCK` is set, OpenClaw uses it directly. Otherwise,
 OpenClaw attempts to generate a Mantle bearer token from the AWS default
-credential chain, including shared credentials/config profiles, SSO, web
-identity, and instance or task roles. It then discovers available Mantle
-models by querying the region's `/v1/models` endpoint. Discovery results are
-cached for 1 hour, and IAM-derived bearer tokens are refreshed hourly.
+credential chain. It then discovers available Mantle models by querying the
+region's `/v1/models` endpoint.
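+
+To spot-check what discovery will see, you can query the same endpoint by
+hand. A minimal sketch, assuming `$MANTLE_BASE_URL` stands in for your
+region's Mantle base URL (the exact host depends on your environment):
+
+```bash
+# Lists the model IDs OpenClaw would discover for this region.
+curl -s "$MANTLE_BASE_URL/v1/models" \
+  -H "Authorization: Bearer $AWS_BEARER_TOKEN_BEDROCK"
+```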
 
-Supported regions: `us-east-1`, `us-east-2`, `us-west-2`, `ap-northeast-1`,
+| Behavior          | Detail                    |
+| ----------------- | ------------------------- |
+| Discovery cache   | Results cached for 1 hour |
+| IAM token refresh | Hourly                    |
+
+<Note>
+The bearer token is the same `AWS_BEARER_TOKEN_BEDROCK` used by the standard
+[Amazon Bedrock](/providers/bedrock) provider.
+</Note>
+
+### Supported regions
+
+`us-east-1`, `us-east-2`, `us-west-2`, `ap-northeast-1`,
 `ap-south-1`, `ap-southeast-3`, `eu-central-1`, `eu-west-1`, `eu-west-2`,
 `eu-south-1`, `eu-north-1`, `sa-east-1`.
 
-## Onboarding
-
-1. Choose one auth path on the **gateway host**:
-
-Explicit bearer token:
-
-```bash
-export AWS_BEARER_TOKEN_BEDROCK="..."
-# Optional (defaults to us-east-1):
-export AWS_REGION="us-west-2"
-```
-
-IAM credentials:
-
-```bash
-# Any AWS SDK-compatible auth source works here, for example:
-export AWS_PROFILE="default"
-export AWS_REGION="us-west-2"
-```
-
-2. Verify models are discovered:
-
-```bash
-openclaw models list
-```
-
-Discovered models appear under the `amazon-bedrock-mantle` provider. No
-additional config is required unless you want to override defaults.
-
 ## Manual configuration
 
 If you prefer explicit config instead of auto-discovery:
 
@@ -92,13 +132,46 @@ If you prefer explicit config instead of auto-discovery:
 }
 ```
 
-## Notes
+## Advanced notes
 
-- OpenClaw can mint the Mantle bearer token for you from AWS SDK-compatible
-  IAM credentials when `AWS_BEARER_TOKEN_BEDROCK` is not set.
-- The bearer token is the same `AWS_BEARER_TOKEN_BEDROCK` used by the standard
-  [Amazon Bedrock](/providers/bedrock) provider.
-- Reasoning support is inferred from model IDs containing patterns like
-  `thinking`, `reasoner`, or `gpt-oss-120b`.
-- If the Mantle endpoint is unavailable or returns no models, the provider is
-  silently skipped.
+<AccordionGroup>
+  <Accordion title="Reasoning detection">
+    Reasoning support is inferred from model IDs containing patterns like
+    `thinking`, `reasoner`, or `gpt-oss-120b`. OpenClaw sets `reasoning: true`
+    automatically for matching models during discovery.
+  </Accordion>
+  <Accordion title="Discovery failure behavior">
+    If the Mantle endpoint is unavailable or returns no models, the provider is
+    silently skipped. OpenClaw does not error; other configured providers
+    continue to work normally.
+  </Accordion>
+  <Accordion title="Relationship to the standard Bedrock provider">
+    Bedrock Mantle is a separate provider from the standard
+    [Amazon Bedrock](/providers/bedrock) provider. Mantle uses an
+    OpenAI-compatible `/v1` surface, while the standard Bedrock provider uses
+    the native Bedrock API.
+
+    Both providers share the same `AWS_BEARER_TOKEN_BEDROCK` credential when
+    present.
+  </Accordion>
+</AccordionGroup>
+
+## Related
+
+<CardGroup cols={2}>
+  <Card title="Amazon Bedrock" href="/providers/bedrock">
+    Native Bedrock provider for Anthropic Claude, Titan, and other models.
+  </Card>
+  <Card title="Model Providers">
+    Choosing providers, model refs, and failover behavior.
+  </Card>
+  <Card title="Authentication">
+    Auth details and credential reuse rules.
+  </Card>
+  <Card title="Troubleshooting" href="/help/troubleshooting">
+    Common issues and how to resolve them.
+  </Card>
+</CardGroup>
diff --git a/docs/providers/fal.md b/docs/providers/fal.md
index 1ae888cce2f..b726508a71d 100644
--- a/docs/providers/fal.md
+++ b/docs/providers/fal.md
@@ -11,42 +11,51 @@ read_when:
 
 OpenClaw ships a bundled `fal` provider for hosted image and video generation.
 
-- Provider: `fal`
-- Auth: `FAL_KEY` (canonical; `FAL_API_KEY` also works as a fallback)
-- API: fal model endpoints
+| Property | Value                                                         |
+| -------- | ------------------------------------------------------------- |
+| Provider | `fal`                                                         |
+| Auth     | `FAL_KEY` (canonical; `FAL_API_KEY` also works as a fallback) |
+| API      | fal model endpoints                                           |
 
-## Quick start
+## Getting started
 
-1. Set the API key:
-
-```bash
-openclaw onboard --auth-choice fal-api-key
-```
-
-2. Set a default image model:
-
-```json5
-{
-  agents: {
-    defaults: {
-      imageGenerationModel: {
-        primary: "fal/fal-ai/flux/dev",
-      },
-    },
-  },
-}
-```
+<Steps>
+  <Step title="Set the API key">
+    ```bash
+    openclaw onboard --auth-choice fal-api-key
+    ```
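+
+    If you already have a key from the fal dashboard, exporting it directly
+    works too. A minimal sketch (`FAL_KEY` is canonical; `FAL_API_KEY` is the
+    documented fallback):
+
+    ```bash
+    # The bundled fal provider reads FAL_KEY, with FAL_API_KEY as a fallback.
+    export FAL_KEY="..."
+    ```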
+  </Step>
+  <Step title="Set a default image model">
+    ```json5
+    {
+      agents: {
+        defaults: {
+          imageGenerationModel: {
+            primary: "fal/fal-ai/flux/dev",
+          },
+        },
+      },
+    }
+    ```
+  </Step>
+</Steps>
 
 ## Image generation
 
 The bundled `fal` image-generation provider defaults to `fal/fal-ai/flux/dev`.
 
-- Generate: up to 4 images per request
-- Edit mode: enabled, 1 reference image
-- Supports `size`, `aspectRatio`, and `resolution`
-- Current edit caveat: the fal image edit endpoint does **not** support
-  `aspectRatio` overrides
+| Capability     | Value                      |
+| -------------- | -------------------------- |
+| Max images     | 4 per request              |
+| Edit mode      | Enabled, 1 reference image |
+| Size overrides | Supported                  |
+| Aspect ratio   | Supported                  |
+| Resolution     | Supported                  |
+
+<Warning>
+The fal image edit endpoint does **not** support `aspectRatio` overrides.
+</Warning>
 
 To use fal as the default image provider:
 
@@ -67,46 +76,70 @@ The bundled `fal` video-generation provider defaults to
 `fal/fal-ai/minimax/video-01-live`.
 
-- Modes: text-to-video and single-image reference flows
-- Runtime: queue-backed submit/status/result flow for long-running jobs
-- HeyGen video-agent model ref:
-  - `fal/fal-ai/heygen/v2/video-agent`
-- Seedance 2.0 model refs:
-  - `fal/bytedance/seedance-2.0/fast/text-to-video`
-  - `fal/bytedance/seedance-2.0/fast/image-to-video`
-  - `fal/bytedance/seedance-2.0/text-to-video`
-  - `fal/bytedance/seedance-2.0/image-to-video`
+| Capability | Value                                                         |
+| ---------- | ------------------------------------------------------------- |
+| Modes      | Text-to-video, single-image reference                         |
+| Runtime    | Queue-backed submit/status/result flow for long-running jobs  |
 
-To use Seedance 2.0 as the default video model:
+<Accordion title="Video model refs">
+  **HeyGen video-agent:**
 
-```json5
-{
-  agents: {
-    defaults: {
-      videoGenerationModel: {
-        primary: "fal/bytedance/seedance-2.0/fast/text-to-video",
-      },
-    },
-  },
-}
-```
+  - `fal/fal-ai/heygen/v2/video-agent`
+
+  **Seedance 2.0:**
+
+  - `fal/bytedance/seedance-2.0/fast/text-to-video`
+  - `fal/bytedance/seedance-2.0/fast/image-to-video`
+  - `fal/bytedance/seedance-2.0/text-to-video`
+  - `fal/bytedance/seedance-2.0/image-to-video`
+</Accordion>
 
-To use HeyGen video-agent as the default video model:
-
-```json5
-{
-  agents: {
-    defaults: {
-      videoGenerationModel: {
-        primary: "fal/fal-ai/heygen/v2/video-agent",
-      },
-    },
-  },
-}
-```
+<Tabs>
+  <Tab title="Seedance 2.0 as default">
+    ```json5
+    {
+      agents: {
+        defaults: {
+          videoGenerationModel: {
+            primary: "fal/bytedance/seedance-2.0/fast/text-to-video",
+          },
+        },
+      },
+    }
+    ```
+  </Tab>
+  <Tab title="HeyGen video-agent as default">
+    ```json5
+    {
+      agents: {
+        defaults: {
+          videoGenerationModel: {
+            primary: "fal/fal-ai/heygen/v2/video-agent",
+          },
+        },
+      },
+    }
+    ```
+  </Tab>
+</Tabs>
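+
+OpenClaw drives the queue-backed submit/status/result flow for you, but you
+can exercise it by hand to see the shape of a long-running job. A minimal
+sketch against fal's queue API (endpoint shape and payload are illustrative,
+not OpenClaw-specific):
+
+```bash
+# Submit a job for the default video model; the response includes a request ID.
+curl -s -X POST "https://queue.fal.run/fal-ai/minimax/video-01-live" \
+  -H "Authorization: Key $FAL_KEY" \
+  -H "Content-Type: application/json" \
+  -d '{"prompt": "a red fox running through snow"}'
+# Poll the returned request ID for status, then fetch the result.
+```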
+
+<Tip>
+Use `openclaw models list --provider fal` to see the full list of available fal
+models, including any recently added entries.
+</Tip>
 
 ## Related
 
-- [Image Generation](/tools/image-generation)
-- [Video Generation](/tools/video-generation)
-- [Configuration Reference](/gateway/configuration-reference#agent-defaults)
+<CardGroup cols={2}>
+  <Card title="Image Generation" href="/tools/image-generation">
+    Shared image tool parameters and provider selection.
+  </Card>
+  <Card title="Video Generation" href="/tools/video-generation">
+    Shared video tool parameters and provider selection.
+  </Card>
+  <Card title="Configuration Reference" href="/gateway/configuration-reference#agent-defaults">
+    Agent defaults including image and video model selection.
+  </Card>
+</CardGroup>
diff --git a/docs/providers/groq.md b/docs/providers/groq.md
index ae2e34cd784..5f5b449d99c 100644
--- a/docs/providers/groq.md
+++ b/docs/providers/groq.md
@@ -12,33 +12,37 @@
 (Llama, Gemma, Mistral, and more) using custom LPU hardware. OpenClaw connects
 to Groq through its OpenAI-compatible API.
 
-- Provider: `groq`
-- Auth: `GROQ_API_KEY`
-- API: OpenAI-compatible
+| Property | Value             |
+| -------- | ----------------- |
+| Provider | `groq`            |
+| Auth     | `GROQ_API_KEY`    |
+| API      | OpenAI-compatible |
 
-## Quick start
+## Getting started
 
-1. Get an API key from [console.groq.com/keys](https://console.groq.com/keys).
+<Steps>
+  <Step title="Get an API key">
+    Create an API key at [console.groq.com/keys](https://console.groq.com/keys).
+  </Step>
+  <Step title="Set the API key">
+    ```bash
+    export GROQ_API_KEY="gsk_..."
+    ```
+  </Step>
+  <Step title="Set a default model">
+    ```json5
+    {
+      agents: {
+        defaults: {
+          model: { primary: "groq/llama-3.3-70b-versatile" },
+        },
+      },
+    }
+    ```
+  </Step>
+</Steps>
 
-2. Set the API key:
-
-```bash
-export GROQ_API_KEY="gsk_..."
-```
-
-3. Set a default model:
-
-```json5
-{
-  agents: {
-    defaults: {
-      model: { primary: "groq/llama-3.3-70b-versatile" },
-    },
-  },
-}
-```
-
-## Config file example
+### Config file example
 
 ```json5
 {
@@ -51,6 +55,24 @@ export GROQ_API_KEY="gsk_..."
   }
 }
 ```
 
+## Available models
+
+Groq's model catalog changes frequently. Run `openclaw models list | grep groq`
+to see currently available models, or check
+[console.groq.com/docs/models](https://console.groq.com/docs/models).
+
+| Model                       | Notes                              |
+| --------------------------- | ---------------------------------- |
+| **Llama 3.3 70B Versatile** | General-purpose, large context     |
+| **Llama 3.1 8B Instant**    | Fast, lightweight                  |
+| **Gemma 2 9B**              | Compact, efficient                 |
+| **Mixtral 8x7B**            | MoE architecture, strong reasoning |
+
+<Tip>
+Use `openclaw models list --provider groq` for the most up-to-date list of
+models available on your account.
+</Tip>
+
 ## Audio transcription
 
 Groq also provides fast Whisper-based audio transcription. When configured as a
@@ -70,36 +92,43 @@ surface.
 
 ```json5
 {
   tools: {
     media: {
       audio: {
         models: [{ provider: "groq", model: "whisper-large-v3-turbo" }],
       },
     },
   },
 }
 ```
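+
+To sanity-check the endpoint outside OpenClaw, you can call it directly. A
+minimal sketch using the documented defaults (the audio file name is a
+placeholder):
+
+```bash
+# OpenAI-compatible multipart transcription request against Groq.
+curl -s https://api.groq.com/openai/v1/audio/transcriptions \
+  -H "Authorization: Bearer $GROQ_API_KEY" \
+  -F model="whisper-large-v3-turbo" \
+  -F file="@sample.wav"
+```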
 
-## Environment note
-
-If the Gateway runs as a daemon (launchd/systemd), make sure `GROQ_API_KEY` is
-available to that process (for example, in `~/.openclaw/.env` or via
-`env.shellEnv`).
-
-## Audio notes
-
-- Shared config path: `tools.media.audio`
-- Default Groq audio base URL: `https://api.groq.com/openai/v1`
-- Default Groq audio model: `whisper-large-v3-turbo`
-- Groq audio transcription uses the OpenAI-compatible `/audio/transcriptions`
-  path
-
-## Available models
-
-Groq's model catalog changes frequently. Run `openclaw models list | grep groq`
-to see currently available models, or check
-[console.groq.com/docs/models](https://console.groq.com/docs/models).
-
-Popular choices include:
-
-- **Llama 3.3 70B Versatile** - general-purpose, large context
-- **Llama 3.1 8B Instant** - fast, lightweight
-- **Gemma 2 9B** - compact, efficient
-- **Mixtral 8x7B** - MoE architecture, strong reasoning
-
-## Links
-
-- [Groq Console](https://console.groq.com)
-- [API Documentation](https://console.groq.com/docs)
-- [Model List](https://console.groq.com/docs/models)
-- [Pricing](https://groq.com/pricing)
+<AccordionGroup>
+  <Accordion title="Audio notes">
+    | Property           | Value                                     |
+    | ------------------ | ----------------------------------------- |
+    | Shared config path | `tools.media.audio`                       |
+    | Default base URL   | `https://api.groq.com/openai/v1`          |
+    | Default model      | `whisper-large-v3-turbo`                  |
+    | API endpoint       | OpenAI-compatible `/audio/transcriptions` |
+  </Accordion>
+  <Accordion title="Environment note">
+    If the Gateway runs as a daemon (launchd/systemd), make sure `GROQ_API_KEY` is
+    available to that process (for example, in `~/.openclaw/.env` or via
+    `env.shellEnv`).
+
+    Keys set only in your interactive shell are not visible to daemon-managed
+    gateway processes. Use `~/.openclaw/.env` or `env.shellEnv` config for
+    persistent availability.
+  </Accordion>
+</AccordionGroup>
+
+## Related
+
+<CardGroup cols={2}>
+  <Card title="Model Providers">
+    Choosing providers, model refs, and failover behavior.
+  </Card>
+  <Card title="Configuration Reference" href="/gateway/configuration-reference">
+    Full config schema including provider and audio settings.
+  </Card>
+  <Card title="Groq Console" href="https://console.groq.com">
+    Groq dashboard, API docs, and pricing.
+  </Card>
+  <Card title="Groq Model List" href="https://console.groq.com/docs/models">
+    Official Groq model catalog.
+  </Card>
+</CardGroup>
diff --git a/docs/providers/sglang.md b/docs/providers/sglang.md
index ff2a9ded9e4..25c88ee74d0 100644
--- a/docs/providers/sglang.md
+++ b/docs/providers/sglang.md
@@ -15,36 +15,44 @@
 OpenClaw can also **auto-discover** available models from SGLang when you opt
 in with `SGLANG_API_KEY` (any value works if your server does not enforce auth)
 and you do not define an explicit `models.providers.sglang` entry.
 
-## Quick start
+## Getting started
 
-1. Start SGLang with an OpenAI-compatible server.
-
-Your base URL should expose `/v1` endpoints (for example `/v1/models`,
-`/v1/chat/completions`). SGLang commonly runs on:
-
-- `http://127.0.0.1:30000/v1`
+<Steps>
+  <Step title="Start the SGLang server">
+    Launch SGLang with an OpenAI-compatible server. Your base URL should expose
+    `/v1` endpoints (for example `/v1/models`, `/v1/chat/completions`). SGLang
+    commonly runs on:
+
+    - `http://127.0.0.1:30000/v1`
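+
+    A minimal launch sketch (assumes the `sglang` Python package is installed;
+    the model path is a placeholder):
+
+    ```bash
+    # Serves an OpenAI-compatible API at http://127.0.0.1:30000/v1
+    python -m sglang.launch_server \
+      --model-path Qwen/Qwen2.5-7B-Instruct \
+      --port 30000
+    ```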
+ Verify the server is running and responding: -- native OpenAI-only request shaping does not apply here -- no `service_tier`, no Responses `store`, no prompt-cache hints, and no - OpenAI reasoning-compat payload shaping -- hidden OpenClaw attribution headers (`originator`, `version`, `User-Agent`) - are not injected on custom SGLang base URLs + ```bash + curl http://127.0.0.1:30000/v1/models + ``` + + **Auth errors** + + If requests fail with auth errors, set a real `SGLANG_API_KEY` that matches + your server configuration, or configure the provider explicitly under + `models.providers.sglang`. + + + If you run SGLang without authentication, any non-empty value for + `SGLANG_API_KEY` is sufficient to opt in to model discovery. + + + + + +## Related + + + + Choosing providers, model refs, and failover behavior. + + + Full config schema including provider entries. + + diff --git a/docs/providers/vllm.md b/docs/providers/vllm.md index 99cbadb511c..d31b53f8c28 100644 --- a/docs/providers/vllm.md +++ b/docs/providers/vllm.md @@ -8,53 +8,78 @@ title: "vLLM" # vLLM -vLLM can serve open-source (and some custom) models via an **OpenAI-compatible** HTTP API. OpenClaw can connect to vLLM using the `openai-completions` API. +vLLM can serve open-source (and some custom) models via an **OpenAI-compatible** HTTP API. OpenClaw connects to vLLM using the `openai-completions` API. -OpenClaw can also **auto-discover** available models from vLLM when you opt in with `VLLM_API_KEY` (any value works if your server doesn’t enforce auth) and you do not define an explicit `models.providers.vllm` entry. +OpenClaw can also **auto-discover** available models from vLLM when you opt in with `VLLM_API_KEY` (any value works if your server does not enforce auth) and you do not define an explicit `models.providers.vllm` entry. -## Quick start +| Property | Value | +| ---------------- | ---------------------------------------- | +| Provider ID | `vllm` | +| API | `openai-completions` (OpenAI-compatible) | +| Auth | `VLLM_API_KEY` environment variable | +| Default base URL | `http://127.0.0.1:8000/v1` | -1. Start vLLM with an OpenAI-compatible server. +## Getting started -Your base URL should expose `/v1` endpoints (e.g. `/v1/models`, `/v1/chat/completions`). vLLM commonly runs on: + + + Your base URL should expose `/v1` endpoints (e.g. `/v1/models`, `/v1/chat/completions`). vLLM commonly runs on: -- `http://127.0.0.1:8000/v1` + ``` + http://127.0.0.1:8000/v1 + ``` -2. Opt in (any value works if no auth is configured): + + + Any value works if your server does not enforce auth: -```bash -export VLLM_API_KEY="vllm-local" -``` + ```bash + export VLLM_API_KEY="vllm-local" + ``` -3. Select a model (replace with one of your vLLM model IDs): + + + Replace with one of your vLLM model IDs: -```json5 -{ - agents: { - defaults: { - model: { primary: "vllm/your-model-id" }, - }, - }, -} -``` + ```json5 + { + agents: { + defaults: { + model: { primary: "vllm/your-model-id" }, + }, + }, + } + ``` + + + + ```bash + openclaw models list --provider vllm + ``` + + ## Model discovery (implicit provider) -When `VLLM_API_KEY` is set (or an auth profile exists) and you **do not** define `models.providers.vllm`, OpenClaw will query: +When `VLLM_API_KEY` is set (or an auth profile exists) and you **do not** define `models.providers.vllm`, OpenClaw queries: -- `GET http://127.0.0.1:8000/v1/models` +``` +GET http://127.0.0.1:8000/v1/models +``` -…and convert the returned IDs into model entries. +and converts the returned IDs into model entries. 
+  </Step>
+  <Step title="Opt in">
+    Any value works if your server does not enforce auth:
+
+    ```bash
+    export VLLM_API_KEY="vllm-local"
+    ```
+  </Step>
+  <Step title="Select a model">
+    Replace with one of your vLLM model IDs:
+
+    ```json5
+    {
+      agents: {
+        defaults: {
+          model: { primary: "vllm/your-model-id" },
+        },
+      },
+    }
+    ```
+  </Step>
+  <Step title="Verify discovery">
+    ```bash
+    openclaw models list --provider vllm
+    ```
+  </Step>
+</Steps>
 
-2. Opt in (any value works if no auth is configured):
-
-```bash
-export VLLM_API_KEY="vllm-local"
-```
-
-3. Select a model (replace with one of your vLLM model IDs):
-
-```json5
-{
-  agents: {
-    defaults: {
-      model: { primary: "vllm/your-model-id" },
-    },
-  },
-}
-```
 
 ## Model discovery (implicit provider)
 
-When `VLLM_API_KEY` is set (or an auth profile exists) and you **do not** define `models.providers.vllm`, OpenClaw will query:
+When `VLLM_API_KEY` is set (or an auth profile exists) and you **do not** define `models.providers.vllm`, OpenClaw queries:
 
-- `GET http://127.0.0.1:8000/v1/models`
+```
+GET http://127.0.0.1:8000/v1/models
+```
 
-…and convert the returned IDs into model entries.
+and converts the returned IDs into model entries.
 
+<Note>
 If you set `models.providers.vllm` explicitly, auto-discovery is skipped and
 you must define models manually.
+</Note>
 
 ## Explicit configuration (manual models)
 
 Use explicit config when:
 
-- vLLM runs on a different host/port.
-- You want to pin `contextWindow`/`maxTokens` values.
-- Your server requires a real API key (or you want to control headers).
+- vLLM runs on a different host or port
+- You want to pin `contextWindow` or `maxTokens` values
+- Your server requires a real API key (or you want to control headers)
 
 ```json5
 {
@@ -81,23 +106,99 @@ Use explicit config when:
 }
 ```
 
+## Advanced notes
+
+<AccordionGroup>
+  <Accordion title="Proxy-style behavior">
+    vLLM is treated as a proxy-style OpenAI-compatible `/v1` backend, not a
+    native OpenAI endpoint. This means:
+
+    | Behavior                                | Applied?                         |
+    | --------------------------------------- | -------------------------------- |
+    | Native OpenAI request shaping           | No                               |
+    | `service_tier`                          | Not sent                         |
+    | Responses `store`                       | Not sent                         |
+    | Prompt-cache hints                      | Not sent                         |
+    | OpenAI reasoning-compat payload shaping | Not applied                      |
+    | Hidden OpenClaw attribution headers     | Not injected on custom base URLs |
+  </Accordion>
+  <Accordion title="Custom host or port">
+    If your vLLM server runs on a non-default host or port, set `baseUrl` in the explicit provider config:
+
+    ```json5
+    {
+      models: {
+        providers: {
+          vllm: {
+            baseUrl: "http://192.168.1.50:9000/v1",
+            apiKey: "${VLLM_API_KEY}",
+            api: "openai-completions",
+            models: [
+              {
+                id: "my-custom-model",
+                name: "Remote vLLM Model",
+                reasoning: false,
+                input: ["text"],
+                contextWindow: 64000,
+                maxTokens: 4096,
+              },
+            ],
+          },
+        },
+      },
+    }
+    ```
+  </Accordion>
+</AccordionGroup>
+
 ## Troubleshooting
 
-- Check the server is reachable:
+<AccordionGroup>
+  <Accordion title="Server not reachable">
+    Check that the vLLM server is running and accessible:
 
-```bash
-curl http://127.0.0.1:8000/v1/models
-```
+    ```bash
+    curl http://127.0.0.1:8000/v1/models
+    ```
 
-- If requests fail with auth errors, set a real `VLLM_API_KEY` that matches your server configuration, or configure the provider explicitly under `models.providers.vllm`.
+    If you see a connection error, verify the host, port, and that vLLM started with the OpenAI-compatible server mode.
+  </Accordion>
+  <Accordion title="Auth errors">
+    If requests fail with auth errors, set a real `VLLM_API_KEY` that matches your server configuration, or configure the provider explicitly under `models.providers.vllm`.
 
-## Proxy-style behavior
+    <Tip>
+      If your vLLM server does not enforce auth, any non-empty value for `VLLM_API_KEY` works as an opt-in signal for OpenClaw.
+    </Tip>
+  </Accordion>
+  <Accordion title="Models not discovered">
+    Auto-discovery requires `VLLM_API_KEY` to be set **and** no explicit `models.providers.vllm` config entry. If you have defined the provider manually, OpenClaw skips discovery and uses only your declared models.
+  </Accordion>
+</AccordionGroup>
 
-vLLM is treated as a proxy-style OpenAI-compatible `/v1` backend, not a native
-OpenAI endpoint.
-
-- native OpenAI-only request shaping does not apply here
-- no `service_tier`, no Responses `store`, no prompt-cache hints, and no
-  OpenAI reasoning-compat payload shaping
-- hidden OpenClaw attribution headers (`originator`, `version`, `User-Agent`)
-  are not injected on custom vLLM base URLs
+More help: [Troubleshooting](/help/troubleshooting) and [FAQ](/help/faq).
+
+## Related
+
+<CardGroup cols={2}>
+  <Card title="Model Providers">
+    Choosing providers, model refs, and failover behavior.
+  </Card>
+  <Card title="OpenAI" href="/providers/openai">
+    Native OpenAI provider and OpenAI-compatible route behavior.
+  </Card>
+  <Card title="Authentication">
+    Auth details and credential reuse rules.
+  </Card>
+  <Card title="Troubleshooting" href="/help/troubleshooting">
+    Common issues and how to resolve them.
+  </Card>
+</CardGroup>