

---
summary: Run OpenClaw with SGLang (OpenAI-compatible self-hosted server)
read_when:
  - You want to run OpenClaw against a local SGLang server
  - You want OpenAI-compatible /v1 endpoints with your own models
title: SGLang
---

SGLang serves open-weight models via an OpenAI-compatible HTTP API. OpenClaw connects to SGLang using the openai-completions provider family with auto-discovery of available models.

| Property | Value |
|----------|-------|
| Provider id | `sglang` |
| Plugin | Bundled, `enabledByDefault: true` |
| Auth env var | `SGLANG_API_KEY` (any non-empty value if the server has no auth) |
| Onboarding flag | `--auth-choice sglang` |
| API | OpenAI-compatible (`openai-completions`) |
| Default base URL | `http://127.0.0.1:30000/v1` |
| Default model placeholder | `sglang/Qwen/Qwen3-8B` |
| Streaming usage | Yes (`supportsStreamingUsage: true`) |
| Pricing | Marked external-free (`modelPricing.external: false`) |

OpenClaw also auto-discovers available models from SGLang when you opt in with `SGLANG_API_KEY` and do not define an explicit `models.providers.sglang` entry; see [Model discovery (implicit provider)](#model-discovery-implicit-provider) below.

## Getting started

1. Launch SGLang with an OpenAI-compatible server. Your base URL should expose `/v1` endpoints (for example `/v1/models`, `/v1/chat/completions`). SGLang commonly runs on `http://127.0.0.1:30000/v1`.

2. Set `SGLANG_API_KEY`. Any non-empty value works if no auth is configured on your server:

   ```bash
   export SGLANG_API_KEY="sglang-local"
   ```

3. Run onboarding and choose SGLang:

   ```bash
   openclaw onboard --auth-choice sglang
   ```

Or configure the model manually:

```json5
{
  agents: {
    defaults: {
      model: { primary: "sglang/your-model-id" },
    },
  },
}
```

## Model discovery (implicit provider)

When `SGLANG_API_KEY` is set (or an auth profile exists) and you do not define `models.providers.sglang`, OpenClaw will query:

- `GET http://127.0.0.1:30000/v1/models`

and convert the returned IDs into model entries.

If you set `models.providers.sglang` explicitly, auto-discovery is skipped and you must define models manually.
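To preview the model refs discovery would produce, you can run a `/v1/models` payload through a small filter. This is a sketch: the inline JSON stands in for a live server's response, and `Qwen/Qwen3-8B` is just the bundled placeholder id.

```shell
# Pipe a sample /v1/models response through a filter that prints the
# sglang/<id> model refs OpenClaw would derive from it. Against a live
# server, replace the printf with: curl -s http://127.0.0.1:30000/v1/models
printf '%s' '{"object":"list","data":[{"id":"Qwen/Qwen3-8B","object":"model"}]}' |
python3 -c '
import json, sys
for model in json.load(sys.stdin)["data"]:
    print("sglang/" + model["id"])
'
# prints: sglang/Qwen/Qwen3-8B
```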

## Explicit configuration (manual models)

Use explicit config when:

- SGLang runs on a different host/port.
- You want to pin `contextWindow`/`maxTokens` values.
- Your server requires a real API key (or you want to control headers).
```json5
{
  models: {
    providers: {
      sglang: {
        baseUrl: "http://127.0.0.1:30000/v1",
        apiKey: "${SGLANG_API_KEY}",
        api: "openai-completions",
        models: [
          {
            id: "your-model-id",
            name: "Local SGLang Model",
            reasoning: false,
            input: ["text"],
            cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
            contextWindow: 128000,
            maxTokens: 8192,
          },
        ],
      },
    },
  },
}
```

## Advanced configuration

SGLang is treated as a proxy-style OpenAI-compatible `/v1` backend, not a native OpenAI endpoint:

| Behavior | SGLang |
|----------|--------|
| OpenAI-only request shaping | Not applied |
| `service_tier`, Responses `store`, prompt-cache hints | Not sent |
| Reasoning-compat payload shaping | Not applied |
| Hidden attribution headers (`originator`, `version`, `User-Agent`) | Not injected on custom SGLang base URLs |

## Troubleshooting

**Server not reachable**

Verify the server is running and responding:

```bash
curl http://127.0.0.1:30000/v1/models
```
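If `/v1/models` responds but chats fail, it can help to POST a minimal chat completion directly. The sketch below builds the request body and validates the JSON locally first, so a quoting mistake does not masquerade as a server error; the model id shown is the bundled `Qwen/Qwen3-8B` placeholder, so substitute whatever your server actually loaded.

```shell
# Minimal chat-completions request body; check it parses as JSON
# before sending it to the server.
body='{"model":"Qwen/Qwen3-8B","messages":[{"role":"user","content":"Say hello"}],"max_tokens":16}'
printf '%s' "$body" | python3 -m json.tool > /dev/null && echo "body ok"

# Then POST it to the live server:
# curl -s http://127.0.0.1:30000/v1/chat/completions \
#   -H "Authorization: Bearer $SGLANG_API_KEY" \
#   -H "Content-Type: application/json" \
#   -d "$body"
```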

**Auth errors**

If requests fail with auth errors, set a real `SGLANG_API_KEY` that matches
your server configuration, or configure the provider explicitly under
`models.providers.sglang`.

<Tip>
If you run SGLang without authentication, any non-empty value for
`SGLANG_API_KEY` is sufficient to opt in to model discovery.
</Tip>
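A quick preflight for the opt-in condition above can be scripted; this is a sketch, and the message strings are illustrative only.

```shell
# Discovery opt-in requires a non-empty SGLANG_API_KEY.
if [ -n "${SGLANG_API_KEY:-}" ]; then
  echo "SGLANG_API_KEY set: model discovery can run"
else
  echo "SGLANG_API_KEY empty: set any non-empty value to opt in"
fi
```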
See also:

- Choosing providers, model refs, and failover behavior
- Full config schema including provider entries