diff --git a/docs/providers/cerebras.md b/docs/providers/cerebras.md
index cdaa4578742..3998dd20166 100644
--- a/docs/providers/cerebras.md
+++ b/docs/providers/cerebras.md
@@ -6,34 +6,56 @@ read_when:
   - You need the Cerebras API key env var or CLI auth choice
 ---
 
-[Cerebras](https://www.cerebras.ai) provides high-speed OpenAI-compatible inference.
+[Cerebras](https://www.cerebras.ai) provides high-speed OpenAI-compatible inference on custom inference hardware. OpenClaw includes a bundled Cerebras provider plugin with a static four-model catalog.
 
-| Property | Value                        |
-| -------- | ---------------------------- |
-| Provider | `cerebras`                   |
-| Auth     | `CEREBRAS_API_KEY`           |
-| API      | OpenAI-compatible            |
-| Base URL | `https://api.cerebras.ai/v1` |
+| Property        | Value                                    |
+| --------------- | ---------------------------------------- |
+| Provider id     | `cerebras`                               |
+| Plugin          | bundled, `enabledByDefault: true`        |
+| Auth env var    | `CEREBRAS_API_KEY`                       |
+| Onboarding flag | `--auth-choice cerebras-api-key`         |
+| Direct CLI flag | `--cerebras-api-key <key>`               |
+| API             | OpenAI-compatible (`openai-completions`) |
+| Base URL        | `https://api.cerebras.ai/v1`             |
+| Default model   | `cerebras/zai-glm-4.7`                   |
 
-## Getting Started
+## Getting started
 
 Create an API key in the [Cerebras Cloud Console](https://cloud.cerebras.ai).
 
-  ```bash
-  openclaw onboard --auth-choice cerebras-api-key
-  ```
+
+
+```bash Onboarding
+openclaw onboard --auth-choice cerebras-api-key
+```
+
+```bash Direct flag
+openclaw onboard --non-interactive \
+  --auth-choice cerebras-api-key \
+  --cerebras-api-key "$CEREBRAS_API_KEY"
+```
+
+```bash Env only
+export CEREBRAS_API_KEY=csk-...
+```
+
+
 
 ```bash
 openclaw models list --provider cerebras
 ```
 
+
+The list should include all four bundled models. If `CEREBRAS_API_KEY` is unresolved, `openclaw models status --json` reports the missing credential under `auth.unusableProfiles`.
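That check can be scripted with `jq`. The payload below is a hypothetical stand-in for the real `openclaw models status --json` output; only the `auth.unusableProfiles` path comes from the docs above, and the per-profile object shape is an assumption:

```shell
# Hypothetical status payload; in practice, pipe `openclaw models status --json` in instead.
status='{"auth":{"unusableProfiles":[{"provider":"cerebras"}]}}'

# Print the providers whose credentials failed to resolve.
echo "$status" | jq -r '.auth.unusableProfiles[].provider'
```

An empty result means every configured profile resolved a credential.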
+
 
-### Non-Interactive Setup
+## Non-interactive setup
 
 ```bash
 openclaw onboard --non-interactive \
@@ -42,29 +64,28 @@ openclaw onboard --non-interactive \
   --cerebras-api-key "$CEREBRAS_API_KEY"
 ```
 
-## Built-In Catalog
+## Built-in catalog
 
-OpenClaw ships a static Cerebras catalog for the public OpenAI-compatible endpoint:
+OpenClaw ships a static Cerebras catalog that mirrors the public OpenAI-compatible endpoint. All four models share a 128k context and 8,192 max-output tokens.
 
-| Model ref | Name | Notes |
-| ----------------------------------------- | -------------------- | -------------------------------------- |
-| `cerebras/zai-glm-4.7` | Z.ai GLM 4.7 | Default model; preview reasoning model |
-| `cerebras/gpt-oss-120b` | GPT OSS 120B | Production reasoning model |
-| `cerebras/qwen-3-235b-a22b-instruct-2507` | Qwen 3 235B Instruct | Preview non-reasoning model |
-| `cerebras/llama3.1-8b` | Llama 3.1 8B | Production speed-focused model |
+| Model ref | Name | Reasoning | Notes |
+| ----------------------------------------- | -------------------- | --------- | -------------------------------------- |
+| `cerebras/zai-glm-4.7` | Z.ai GLM 4.7 | yes | Default model; preview reasoning model |
+| `cerebras/gpt-oss-120b` | GPT OSS 120B | yes | Production reasoning model |
+| `cerebras/qwen-3-235b-a22b-instruct-2507` | Qwen 3 235B Instruct | no | Preview non-reasoning model |
+| `cerebras/llama3.1-8b` | Llama 3.1 8B | no | Production speed-focused model |
 
-Cerebras marks `zai-glm-4.7` and `qwen-3-235b-a22b-instruct-2507` as preview models, and `llama3.1-8b` / `qwen-3-235b-a22b-instruct-2507` are documented for deprecation on May 27, 2026. Check Cerebras' supported-models page before relying on them for production.
+  Cerebras marks `zai-glm-4.7` and `qwen-3-235b-a22b-instruct-2507` as preview models, and `llama3.1-8b` plus `qwen-3-235b-a22b-instruct-2507` are documented for deprecation on May 27, 2026.
+  Check Cerebras' supported-models page before relying on them for production workloads.
 
-## Manual Config
+## Manual config
 
-The bundled plugin usually means you only need the API key. Use explicit
-`models.providers.cerebras` config when you want to override model metadata:
+The bundled plugin usually means you only need the API key. Use explicit `models.providers.cerebras` config when you want to override model metadata or run in `mode: "merge"` against the static catalog:
 
 ```json5
 {
-  env: { CEREBRAS_API_KEY: "sk-..." },
+  env: { CEREBRAS_API_KEY: "csk-..." },
   agents: {
     defaults: {
       model: { primary: "cerebras/zai-glm-4.7" },
@@ -88,7 +109,22 @@ The bundled plugin usually means you only need the API key. Use explicit
 ```
 
-If the Gateway runs as a daemon (launchd/systemd), make sure `CEREBRAS_API_KEY`
-is available to that process, for example in `~/.openclaw/.env` or through
-`env.shellEnv`.
+  If the Gateway runs as a daemon (launchd, systemd, Docker), make sure `CEREBRAS_API_KEY` is available to that process — for example in `~/.openclaw/.env` or through `env.shellEnv`. A key sitting only in `~/.profile` will not help a managed service unless the env is imported separately.
+
+## Related
+
+
+
+  Choosing providers, model refs, and failover behavior.
+
+
+  Reasoning effort levels for the two reasoning-capable Cerebras models.
+
+
+  Agent defaults and model configuration.
+
+
+  Auth profiles, switching models, and resolving "no profile" errors.
+
+
diff --git a/docs/providers/groq.md b/docs/providers/groq.md
index d9d21030de3..7c21d076556 100644
--- a/docs/providers/groq.md
+++ b/docs/providers/groq.md
@@ -1,20 +1,24 @@
 ---
-summary: "Groq setup (auth + model selection)"
+summary: "Groq setup (auth + model selection + Whisper transcription)"
 title: "Groq"
 read_when:
   - You want to use Groq with OpenClaw
   - You need the API key env var or CLI auth choice
+  - You are configuring Whisper audio transcription on Groq
 ---
 
-[Groq](https://groq.com) provides ultra-fast inference on open-source models
-(Llama, Gemma, Mistral, and more) using custom LPU hardware. OpenClaw connects
-to Groq through its OpenAI-compatible API.
+[Groq](https://groq.com) provides ultra-fast inference on open-weight models (Llama, Gemma, Kimi, Qwen, GPT OSS, and more) using custom LPU hardware. OpenClaw includes a bundled Groq plugin that registers both an OpenAI-compatible chat provider and an audio media-understanding provider.
 
-| Property | Value             |
-| -------- | ----------------- |
-| Provider | `groq`            |
-| Auth     | `GROQ_API_KEY`    |
-| API      | OpenAI-compatible |
+| Property               | Value                                    |
+| ---------------------- | ---------------------------------------- |
+| Provider id            | `groq`                                   |
+| Plugin                 | bundled, `enabledByDefault: true`        |
+| Auth env var           | `GROQ_API_KEY`                           |
+| Onboarding flag        | `--auth-choice groq-api-key`             |
+| API                    | OpenAI-compatible (`openai-completions`) |
+| Base URL               | `https://api.groq.com/openai/v1`         |
+| Audio transcription    | `whisper-large-v3-turbo` (default)       |
+| Suggested chat default | `groq/llama-3.3-70b-versatile`           |
 
 ## Getting started
 
@@ -23,9 +27,18 @@ to Groq through its OpenAI-compatible API.
 
 Create an API key at [console.groq.com/keys](https://console.groq.com/keys).
 
-  ```bash
-  export GROQ_API_KEY="gsk_..."
-  ```
+
+
+```bash Onboarding
+openclaw onboard --auth-choice groq-api-key
+```
+
+```bash Env only
+export GROQ_API_KEY=gsk_...
+```
+
+
 
 ```json5
@@ -38,6 +51,11 @@ to Groq through its OpenAI-compatible API.
 }
 ```
 
+
+  ```bash
+  openclaw models list --provider groq
+  ```
+
 
 ### Config file example
 
@@ -55,37 +73,56 @@ to Groq through its OpenAI-compatible API.
 ## Built-in catalog
 
-OpenClaw ships a manifest-backed Groq catalog for fast provider-filtered model
-listing. Run `openclaw models list --all --provider groq` to see the bundled
-rows, or check
-[console.groq.com/docs/models](https://console.groq.com/docs/models).
+OpenClaw ships a manifest-backed Groq catalog with both reasoning and non-reasoning entries. Run `openclaw models list --provider groq` to see the bundled rows for your installed version, or check [console.groq.com/docs/models](https://console.groq.com/docs/models) for Groq's authoritative list.
 
-| Model | Notes |
-| --------------------------- | ---------------------------------- |
-| **Llama 3.3 70B Versatile** | General-purpose, large context |
-| **Llama 3.1 8B Instant** | Fast, lightweight |
-| **Gemma 2 9B** | Compact, efficient |
-| **Mixtral 8x7B** | MoE architecture, strong reasoning |
+| Model ref | Name | Reasoning | Input | Context |
+| ---------------------------------------------------- | ----------------------------- | --------- | ------------ | ------- |
+| `groq/llama-3.3-70b-versatile` | Llama 3.3 70B Versatile | no | text | 131,072 |
+| `groq/llama-3.1-8b-instant` | Llama 3.1 8B Instant | no | text | 131,072 |
+| `groq/meta-llama/llama-4-maverick-17b-128e-instruct` | Llama 4 Maverick 17B | no | text + image | 131,072 |
+| `groq/meta-llama/llama-4-scout-17b-16e-instruct` | Llama 4 Scout 17B | no | text + image | 131,072 |
+| `groq/llama3-70b-8192` | Llama 3 70B | no | text | 8,192 |
+| `groq/llama3-8b-8192` | Llama 3 8B | no | text | 8,192 |
+| `groq/gemma2-9b-it` | Gemma 2 9B | no | text | 8,192 |
+| `groq/mistral-saba-24b` | Mistral Saba 24B | no | text | 32,768 |
+| `groq/moonshotai/kimi-k2-instruct` | Kimi K2 Instruct | no | text | 131,072 |
+| `groq/moonshotai/kimi-k2-instruct-0905` | Kimi K2 Instruct 0905 | no | text | 262,144 |
+| `groq/openai/gpt-oss-120b` | GPT OSS 120B | yes | text | 131,072 |
+| `groq/openai/gpt-oss-20b` | GPT OSS 20B | yes | text | 131,072 |
+| `groq/openai/gpt-oss-safeguard-20b` | Safety GPT OSS 20B | yes | text | 131,072 |
+| `groq/qwen-qwq-32b` | Qwen QwQ 32B | yes | text | 131,072 |
+| `groq/qwen/qwen3-32b` | Qwen3 32B | yes | text | 131,072 |
+| `groq/deepseek-r1-distill-llama-70b` | DeepSeek R1 Distill Llama 70B | yes | text | 131,072 |
+| `groq/groq/compound` | Compound | yes | text | 131,072 |
+| `groq/groq/compound-mini` | Compound Mini | yes | text | 131,072 |
 
-Use `openclaw models list --all --provider groq` for the manifest-backed Groq
-rows known to this OpenClaw version.
+  The catalog evolves with each OpenClaw release. `openclaw models list --provider groq` shows the rows known to your installed version; cross-check with [console.groq.com/docs/models](https://console.groq.com/docs/models) for newly-added or deprecated models.
 
 ## Reasoning models
 
-OpenClaw maps its shared `/think` levels to Groq's model-specific
-`reasoning_effort` values. For `qwen/qwen3-32b`, disabled thinking sends
-`none` and enabled thinking sends `default`. For Groq GPT-OSS reasoning models,
-OpenClaw sends `low`, `medium`, or `high`; disabled thinking omits
-`reasoning_effort` because those models do not support a disabled value.
+OpenClaw maps its shared `/think` levels to Groq's model-specific `reasoning_effort` values:
+
+- For `qwen/qwen3-32b`, disabled thinking sends `none` and enabled thinking sends `default`.
+- For Groq GPT OSS reasoning models (`openai/gpt-oss-*`), OpenClaw sends `low`, `medium`, or `high` based on `/think` level. Disabled thinking omits `reasoning_effort` because those models do not support a disabled value.
+- DeepSeek R1 Distill, Qwen QwQ, and Compound use Groq's native reasoning surface; `/think` controls visibility but the model always reasons.
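As an illustration of the mapping above (a sketch, not captured traffic): a `/think medium` turn routed to a GPT OSS model would add `reasoning_effort` to the OpenAI-compatible request body, roughly:

```json5
{
  // Groq-side model id (the `groq/` prefix is OpenClaw's provider prefix, not part of the API id).
  model: "openai/gpt-oss-120b",
  messages: [{ role: "user", content: "..." }],
  // Derived from the /think level; omitted entirely when thinking is disabled.
  reasoning_effort: "medium",
}
```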
+
+See [Thinking modes](/tools/thinking) for the shared `/think` levels and how OpenClaw translates them per provider.
 
 ## Audio transcription
 
-Groq also provides fast Whisper-based audio transcription. When configured as a
-media-understanding provider, OpenClaw uses Groq's `whisper-large-v3-turbo`
-model to transcribe voice messages through the shared `tools.media.audio`
-surface.
+Groq's bundled plugin also registers an **audio media-understanding provider** so voice messages can be transcribed through the shared `tools.media.audio` surface.
+
+| Property           | Value                                     |
+| ------------------ | ----------------------------------------- |
+| Shared config path | `tools.media.audio`                       |
+| Default base URL   | `https://api.groq.com/openai/v1`          |
+| Default model      | `whisper-large-v3-turbo`                  |
+| Auto priority      | 20                                        |
+| API endpoint       | OpenAI-compatible `/audio/transcriptions` |
+
+To make Groq the default audio backend:
 
 ```json5
 {
@@ -100,42 +137,44 @@ surface.
 ```
 
-
-  | Property | Value |
-  |----------|-------|
-  | Shared config path | `tools.media.audio` |
-  | Default base URL | `https://api.groq.com/openai/v1` |
-  | Default model | `whisper-large-v3-turbo` |
-  | API endpoint | OpenAI-compatible `/audio/transcriptions` |
-
-
-
-  If the Gateway runs as a daemon (launchd/systemd), make sure `GROQ_API_KEY` is
-  available to that process (for example, in `~/.openclaw/.env` or via
-  `env.shellEnv`).
+
+  If the Gateway runs as a managed service (launchd, systemd, Docker), `GROQ_API_KEY` must be visible to that process — not just to your interactive shell.
 
-  Keys set only in your interactive shell are not visible to daemon-managed
-  gateway processes. Use `~/.openclaw/.env` or `env.shellEnv` config for
-  persistent availability.
+  A key sitting only in `~/.profile` will not help a launchd or systemd daemon unless that environment is imported there too. Set the key in `~/.openclaw/.env` or via `env.shellEnv` to make it readable from the gateway process.
+
+
+  OpenClaw accepts any Groq model id at runtime. Use the exact id shown by Groq and prefix it with `groq/`. The bundled catalog covers the common cases; uncatalogued ids fall through to the default OpenAI-compatible template.
+
+  ```json5
+  {
+    agents: {
+      defaults: {
+        model: { primary: "groq/<model-id>" },
+      },
+    },
+  }
+  ```
+
 
 ## Related
 
-
+  Choosing providers, model refs, and failover behavior.
+
+
+  Reasoning effort levels and provider-policy interaction.
+
   Full config schema including provider and audio settings.
 
   Groq dashboard, API docs, and pricing.
-
-  Official Groq model catalog.
-
diff --git a/docs/providers/sglang.md b/docs/providers/sglang.md
index 8080873b68f..a0fcb28bce6 100644
--- a/docs/providers/sglang.md
+++ b/docs/providers/sglang.md
@@ -6,16 +6,21 @@ read_when:
 title: "SGLang"
 ---
 
-SGLang can serve open-source models via an **OpenAI-compatible** HTTP API.
-OpenClaw can connect to SGLang using the `openai-completions` API.
+SGLang serves open-weight models via an OpenAI-compatible HTTP API. OpenClaw connects to SGLang using the `openai-completions` provider family with auto-discovery of available models.
 
-OpenClaw can also **auto-discover** available models from SGLang when you opt
-in with `SGLANG_API_KEY` (any value works if your server does not enforce auth)
-and you do not define an explicit `models.providers.sglang` entry.
+| Property                  | Value                                                        |
+| ------------------------- | ------------------------------------------------------------ |
+| Provider id               | `sglang`                                                     |
+| Plugin                    | bundled, `enabledByDefault: true`                            |
+| Auth env var              | `SGLANG_API_KEY` (any non-empty value if server has no auth) |
+| Onboarding flag           | `--auth-choice sglang`                                       |
+| API                       | OpenAI-compatible (`openai-completions`)                     |
+| Default base URL          | `http://127.0.0.1:30000/v1`                                  |
+| Default model placeholder | `sglang/Qwen/Qwen3-8B`                                       |
+| Streaming usage           | Yes (`supportsStreamingUsage: true`)                         |
+| Pricing                   | Marked external-free (`modelPricing.external: false`)        |
 
-OpenClaw treats `sglang` as a local OpenAI-compatible provider that supports
-streamed usage accounting, so status/context token counts can update from
-`stream_options.include_usage` responses.
+
+OpenClaw also **auto-discovers** available models from SGLang when you opt in with `SGLANG_API_KEY` and you do not define an explicit `models.providers.sglang` entry — see [Model discovery (implicit provider)](#model-discovery-implicit-provider) below.
 
 ## Getting started
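For a server on a non-default host or port, an explicit provider entry can be sketched as below. The field names are assumptions borrowed from the `models.providers` shape shown on the other provider pages, and note that defining this entry disables model auto-discovery, as described above:

```json5
{
  // Any non-empty value works when the SGLang server does not enforce auth.
  env: { SGLANG_API_KEY: "local" },
  models: {
    providers: {
      sglang: {
        // Defaults from the table above; change host/port to match your server.
        baseUrl: "http://127.0.0.1:30000/v1",
        api: "openai-completions",
      },
    },
  },
  agents: {
    defaults: {
      // Placeholder model ref from the table above.
      model: { primary: "sglang/Qwen/Qwen3-8B" },
    },
  },
}
```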