docs(providers): fill undocumented capability gaps (TTS, media understanding, embeddings, xSearch, env vars)

Vincent Koc
2026-04-12 12:06:18 +01:00
parent 3f32aa7582
commit 90fac50987
7 changed files with 225 additions and 0 deletions


@@ -227,6 +227,21 @@ OpenClaw supports Anthropic's prompt caching feature for API-key auth.
</Accordion>
<Accordion title="Media understanding (image and PDF)">
The bundled Anthropic plugin registers image and PDF understanding. OpenClaw
auto-resolves media capabilities from the configured Anthropic auth — no
additional config is needed.

| Property        | Value                 |
| --------------- | --------------------- |
| Default model   | `claude-opus-4-6`     |
| Supported input | Images, PDF documents |

When an image or PDF is attached to a conversation, OpenClaw automatically
routes it through the Anthropic media understanding provider.
</Accordion>
<Accordion title="1M context window (beta)">
Anthropic's 1M context window is beta-gated. Enable it per model:


@@ -90,6 +90,23 @@ openclaw models auth login --provider github-copilot --method device --set-defau
selects the correct transport based on the model ref.
</Accordion>
<Accordion title="Environment variable resolution order">
OpenClaw resolves Copilot auth from environment variables in the following
priority order:

| Priority | Variable               | Notes                              |
| -------- | ---------------------- | ---------------------------------- |
| 1        | `COPILOT_GITHUB_TOKEN` | Highest priority, Copilot-specific |
| 2        | `GH_TOKEN`             | GitHub CLI token (fallback)        |
| 3        | `GITHUB_TOKEN`         | Standard GitHub token (lowest)     |

When multiple variables are set, OpenClaw uses the highest-priority one.
The device-login flow (`openclaw models auth login-github-copilot`) stores
its token in the auth profile store and takes precedence over all environment
variables.
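The fallback order can be sketched in shell terms. This is an illustration of the precedence only, with made-up token values; the actual resolution happens inside OpenClaw.

```shell
# Hypothetical token values, for illustration only.
export GITHUB_TOKEN="ghp_fallback_example"        # priority 3 (lowest)
export GH_TOKEN="gho_cli_example"                 # priority 2
export COPILOT_GITHUB_TOKEN="ghu_copilot_example" # priority 1 (wins)

# Equivalent of OpenClaw's resolution: first set variable, by priority.
TOKEN="${COPILOT_GITHUB_TOKEN:-${GH_TOKEN:-$GITHUB_TOKEN}}"
echo "$TOKEN"   # prints ghu_copilot_example
```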
</Accordion>
<Accordion title="Token storage">
The login stores a GitHub token in the auth profile store and exchanges it
for a Copilot API token when OpenClaw runs. You do not need to manage the


@@ -381,6 +381,30 @@ For the full setup and behavior details, see [Ollama Web Search](/tools/ollama-s
Ollama is free and runs locally, so all model costs are set to $0. This applies to both auto-discovered and manually defined models.
</Accordion>
<Accordion title="Memory embeddings">
The bundled Ollama plugin registers a memory embedding provider for
[memory search](/concepts/memory). It uses the configured Ollama base URL
and API key.

| Property      | Value                                                                    |
| ------------- | ------------------------------------------------------------------------ |
| Default model | `nomic-embed-text`                                                       |
| Auto-pull     | Yes — the embedding model is pulled automatically if not present locally |

To select Ollama as the memory search embedding provider:

```json5
{
  agents: {
    defaults: {
      memorySearch: { provider: "ollama" },
    },
  },
}
```
</Accordion>
<Accordion title="Streaming configuration">
OpenClaw's Ollama integration uses the **native Ollama API** (`/api/chat`) by default, which fully supports streaming and tool calling simultaneously. No special configuration is needed.


@@ -237,6 +237,77 @@ OpenClaw adds a small OpenAI-specific prompt overlay for `openai/*` and `openai-
Values are case-insensitive at runtime, so `"Off"` and `"off"` both disable the overlay.
</Tip>
## Voice and speech
<AccordionGroup>
<Accordion title="Speech synthesis (TTS)">
The bundled `openai` plugin registers speech synthesis for the `messages.tts` surface.

| Setting      | Config path                                    | Default                                 |
| ------------ | ---------------------------------------------- | --------------------------------------- |
| Model        | `messages.tts.providers.openai.model`          | `gpt-4o-mini-tts`                       |
| Voice        | `messages.tts.providers.openai.voice`          | `coral`                                 |
| Speed        | `messages.tts.providers.openai.speed`          | (unset)                                 |
| Instructions | `messages.tts.providers.openai.instructions`   | (unset, `gpt-4o-mini-tts` only)         |
| Format       | `messages.tts.providers.openai.responseFormat` | `opus` for voice notes, `mp3` for files |
| API key      | `messages.tts.providers.openai.apiKey`         | Falls back to `OPENAI_API_KEY`          |
| Base URL     | `messages.tts.providers.openai.baseUrl`        | `https://api.openai.com/v1`             |

Available models: `gpt-4o-mini-tts`, `tts-1`, `tts-1-hd`. Available voices: `alloy`, `ash`, `ballad`, `cedar`, `coral`, `echo`, `fable`, `juniper`, `marin`, `onyx`, `nova`, `sage`, `shimmer`, `verse`.

```json5
{
  messages: {
    tts: {
      providers: {
        openai: { model: "gpt-4o-mini-tts", voice: "coral" },
      },
    },
  },
}
```
<Note>
Set `OPENAI_TTS_BASE_URL` to override the TTS base URL without affecting the chat API endpoint.
</Note>
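For example, to point only TTS at a separate gateway (the URL here is hypothetical):

```shell
# Hypothetical gateway URL, for illustration only; chat API traffic is unaffected.
export OPENAI_TTS_BASE_URL="https://tts-gateway.example.com/v1"
```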
</Accordion>
<Accordion title="Realtime transcription">
The bundled `openai` plugin registers realtime transcription for the Voice Call plugin.

| Setting          | Config path                                                          | Default                        |
| ---------------- | -------------------------------------------------------------------- | ------------------------------ |
| Model            | `plugins.entries.voice-call.config.streaming.providers.openai.model` | `gpt-4o-transcribe`            |
| Silence duration | `...openai.silenceDurationMs`                                        | `800`                          |
| VAD threshold    | `...openai.vadThreshold`                                             | `0.5`                          |
| API key          | `...openai.apiKey`                                                   | Falls back to `OPENAI_API_KEY` |
<Note>
Uses a WebSocket connection to `wss://api.openai.com/v1/realtime` with G.711 u-law audio.
</Note>
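Assembled from the config paths in the table above, a minimal transcription config looks like this (the values shown are the documented defaults):

```json5
{
  plugins: {
    entries: {
      "voice-call": {
        config: {
          streaming: {
            providers: {
              openai: { model: "gpt-4o-transcribe", silenceDurationMs: 800, vadThreshold: 0.5 },
            },
          },
        },
      },
    },
  },
}
```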
</Accordion>
<Accordion title="Realtime voice">
The bundled `openai` plugin registers realtime voice for the Voice Call plugin.

| Setting          | Config path                                                         | Default                        |
| ---------------- | ------------------------------------------------------------------- | ------------------------------ |
| Model            | `plugins.entries.voice-call.config.realtime.providers.openai.model` | `gpt-realtime`                 |
| Voice            | `...openai.voice`                                                   | `alloy`                        |
| Temperature      | `...openai.temperature`                                             | `0.8`                          |
| VAD threshold    | `...openai.vadThreshold`                                            | `0.5`                          |
| Silence duration | `...openai.silenceDurationMs`                                       | `500`                          |
| API key          | `...openai.apiKey`                                                  | Falls back to `OPENAI_API_KEY` |
<Note>
Supports Azure OpenAI via `azureEndpoint` and `azureDeployment` config keys. Supports bidirectional tool calling. Uses G.711 u-law audio format.
</Note>
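Assembled from the config paths in the table above, a minimal realtime voice config looks like this (the values shown are the documented defaults):

```json5
{
  plugins: {
    entries: {
      "voice-call": {
        config: {
          realtime: {
            providers: {
              openai: { model: "gpt-realtime", voice: "alloy", temperature: 0.8 },
            },
          },
        },
      },
    },
  },
}
```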
</Accordion>
</AccordionGroup>
## Advanced configuration
<AccordionGroup>


@@ -198,6 +198,21 @@ See [Video Generation](/tools/video-generation) for shared tool parameters, prov
## Advanced
<AccordionGroup>
<Accordion title="Image and video understanding">
The bundled Qwen plugin registers media understanding for images and video
on the **Standard** DashScope endpoints (not the Coding Plan endpoints).

| Property        | Value                |
| --------------- | -------------------- |
| Model           | `qwen-vl-max-latest` |
| Supported input | Images, video        |

Media understanding is auto-resolved from the configured Qwen auth — no
additional config is needed. Ensure you are using a Standard (pay-as-you-go)
endpoint for media understanding support.
</Accordion>
<Accordion title="Qwen 3.6 Plus availability">
`qwen3.6-plus` is available on the Standard (pay-as-you-go) Model Studio
endpoints:


@@ -132,6 +132,77 @@ Legacy aliases still normalize to the canonical bundled ids:
</Accordion>
<Accordion title="x_search configuration">
The bundled xAI plugin exposes `x_search` as an OpenClaw tool for searching
X (formerly Twitter) content via Grok.

Config path: `plugins.entries.xai.config.xSearch`

| Key               | Type    | Default         | Description                         |
| ----------------- | ------- | --------------- | ----------------------------------- |
| `enabled`         | boolean | —               | Enable or disable x_search          |
| `model`           | string  | `grok-4-1-fast` | Model used for x_search requests    |
| `inlineCitations` | boolean | —               | Include inline citations in results |
| `maxTurns`        | number  | —               | Maximum conversation turns          |
| `timeoutSeconds`  | number  | —               | Request timeout in seconds          |
| `cacheTtlMinutes` | number  | —               | Cache time-to-live in minutes       |

```json5
{
  plugins: {
    entries: {
      xai: {
        config: {
          xSearch: {
            enabled: true,
            model: "grok-4-1-fast",
            inlineCitations: true,
          },
        },
      },
    },
  },
}
```
</Accordion>
<Accordion title="Code execution configuration">
The bundled xAI plugin exposes `code_execution` as an OpenClaw tool for
remote code execution in xAI's sandbox environment.

Config path: `plugins.entries.xai.config.codeExecution`

| Key              | Type    | Default                   | Description                            |
| ---------------- | ------- | ------------------------- | -------------------------------------- |
| `enabled`        | boolean | `true` (if key available) | Enable or disable code execution       |
| `model`          | string  | `grok-4-1-fast`           | Model used for code execution requests |
| `maxTurns`       | number  | —                         | Maximum conversation turns             |
| `timeoutSeconds` | number  | —                         | Request timeout in seconds             |

<Note>
This is remote xAI sandbox execution, not local [`exec`](/tools/exec).
</Note>

```json5
{
  plugins: {
    entries: {
      xai: {
        config: {
          codeExecution: {
            enabled: true,
            model: "grok-4-1-fast",
          },
        },
      },
    },
  },
}
```
</Accordion>
<Accordion title="Known limits">
- Auth is API-key only today. There is no xAI OAuth or device-code flow in
OpenClaw yet.


@@ -134,6 +134,18 @@ GLM models are available as `zai/<model>` (example: `zai/glm-5`). The default bu
</Accordion>
<Accordion title="Image understanding">
The bundled Z.AI plugin registers image understanding.

| Property | Value      |
| -------- | ---------- |
| Model    | `glm-4.6v` |

Image understanding is auto-resolved from the configured Z.AI auth — no
additional config is needed.
</Accordion>
<Accordion title="Auth details">
- Z.AI uses Bearer auth with your API key.
- The `zai-api-key` onboarding choice auto-detects the matching Z.AI endpoint from the key prefix.