docs: refresh voice-call streaming transcription refs

2026-04-10 08:41:13 +00:00 · 2026-04-04 20:23:28 +01:00
parent 46cb292c2a
commit bbdc429dcb
1 changed file with 67 additions and 0 deletions
--- a/docs/plugins/voice-call.md
+++ b/docs/plugins/voice-call.md
@@ -107,7 +107,16 @@ Set config under `plugins.entries.voice-call.config`:

          streaming: {
            enabled: true,
+            provider: "openai", // optional; first registered realtime transcription provider when unset
            streamPath: "/voice/stream",
+            providers: {
+              openai: {
+                apiKey: "sk-...", // optional if OPENAI_API_KEY is set
+                model: "gpt-4o-transcribe",
+                silenceDurationMs: 800,
+                vadThreshold: 0.5,
+              },
+            },
            preStartTimeoutMs: 5000,
            maxPendingConnections: 32,
            maxPendingConnectionsPerIp: 4,
@@ -138,6 +147,64 @@ Notes:
 - `streaming.maxConnections` caps total open media stream sockets (pending + active).
 - Runtime fallback still accepts those old voice-call keys for now, but the rewrite path is `openclaw doctor --fix` and the compat shim is temporary.

+## Streaming transcription
+
+The `streaming` block configures realtime transcription of live call audio and
+selects the provider that performs it.
+
+Current runtime behavior:
+
+- `streaming.provider` is optional. If unset, Voice Call uses the first
+  registered realtime transcription provider.
+- The only bundled provider today is OpenAI, registered by the bundled `openai`
+  plugin.
+- Provider-owned raw config lives under `streaming.providers.<providerId>`.
+- If `streaming.provider` points at an unregistered provider, or no realtime
+  transcription provider is registered at all, Voice Call logs a warning and
+  skips media streaming instead of failing the whole plugin.
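
Review note: the provider selection rules in the bullets above can be sketched as follows. This is a minimal illustration, not the plugin's actual code; `StreamingConfig`, `ResolvedProvider`, `resolveStreamingProvider`, the `registered` list, and the warning texts are all hypothetical names introduced here.

```typescript
// Hypothetical shapes for illustration only; not the plugin's real types.
interface StreamingConfig {
  enabled?: boolean;
  provider?: string;
  providers?: Record<string, Record<string, unknown>>;
}

interface ResolvedProvider {
  id: string;
  rawConfig: Record<string, unknown>;
}

// `registered` is assumed to list provider ids in registration order.
function resolveStreamingProvider(
  streaming: StreamingConfig,
  registered: string[],
  warn: (msg: string) => void = console.warn,
): ResolvedProvider | undefined {
  // An explicit `provider` wins; otherwise use the first registered provider.
  const id = streaming.provider ?? registered[0];
  if (id === undefined) {
    warn("no realtime transcription provider registered; skipping media streaming");
    return undefined;
  }
  if (!registered.includes(id)) {
    warn(`realtime transcription provider "${id}" is not registered; skipping media streaming`);
    return undefined;
  }
  // Provider-owned raw config lives under `providers.<providerId>`.
  const providers = streaming.providers ?? {};
  return { id, rawConfig: providers[id] ?? {} };
}
```

Returning `undefined` after a warning, rather than throwing, mirrors the documented choice to skip media streaming without failing the whole plugin.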
+
+OpenAI streaming transcription settings and defaults:
+
+- API key: `streaming.providers.openai.apiKey`, or the `OPENAI_API_KEY` environment variable
+- model: `gpt-4o-transcribe`
+- `silenceDurationMs`: `800`
+- `vadThreshold`: `0.5`
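
Review note: one way to read the defaults listed above is as a merge of the raw provider config over fixed fallback values. The sketch below assumes the config value takes precedence over `OPENAI_API_KEY`; the diff does not state that order explicitly. `OpenAiStreamingConfig`, `ResolvedOpenAiStreamingConfig`, and `resolveOpenAiStreamingConfig` are hypothetical names, and the environment is passed in as a plain object so the example stays self-contained.

```typescript
// Hypothetical types for illustration; field names follow the documented config keys.
interface OpenAiStreamingConfig {
  apiKey?: string;
  model?: string;
  silenceDurationMs?: number;
  vadThreshold?: number;
}

interface ResolvedOpenAiStreamingConfig {
  apiKey: string | undefined;
  model: string;
  silenceDurationMs: number;
  vadThreshold: number;
}

// Default values as listed in the docs change.
const OPENAI_STREAMING_DEFAULTS = {
  model: "gpt-4o-transcribe",
  silenceDurationMs: 800,
  vadThreshold: 0.5,
};

function resolveOpenAiStreamingConfig(
  raw: OpenAiStreamingConfig,
  env: Record<string, string | undefined>,
): ResolvedOpenAiStreamingConfig {
  return {
    // Assumption: an explicit config key wins over the environment variable.
    apiKey: raw.apiKey ?? env["OPENAI_API_KEY"],
    model: raw.model ?? OPENAI_STREAMING_DEFAULTS.model,
    silenceDurationMs: raw.silenceDurationMs ?? OPENAI_STREAMING_DEFAULTS.silenceDurationMs,
    vadThreshold: raw.vadThreshold ?? OPENAI_STREAMING_DEFAULTS.vadThreshold,
  };
}
```

Using `??` rather than `||` keeps an explicit `0` for `silenceDurationMs` or `vadThreshold` from being replaced by the default.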
+
+Example:
+
+```json5
+{
+  plugins: {
+    entries: {
+      "voice-call": {
+        config: {
+          streaming: {
+            enabled: true,
+            provider: "openai",
+            streamPath: "/voice/stream",
+            providers: {
+              openai: {
+                apiKey: "sk-...", // optional if OPENAI_API_KEY is set
+                model: "gpt-4o-transcribe",
+                silenceDurationMs: 800,
+                vadThreshold: 0.5,
+              },
+            },
+          },
+        },
+      },
+    },
+  },
+}
+```
+
+Legacy keys are still auto-migrated by `openclaw doctor --fix`:
+
+- `streaming.sttProvider` → `streaming.provider`
+- `streaming.openaiApiKey` → `streaming.providers.openai.apiKey`
+- `streaming.sttModel` → `streaming.providers.openai.model`
+- `streaming.silenceDurationMs` → `streaming.providers.openai.silenceDurationMs`
+- `streaming.vadThreshold` → `streaming.providers.openai.vadThreshold`
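
Review note: the key mapping above amounts to moving five flat keys into the new nested layout. The sketch below illustrates that rewrite on a plain object; it is not the implementation behind `openclaw doctor --fix`. `migrateLegacyStreaming` and `LEGACY_TO_OPENAI_KEY` are hypothetical names, and the rule that an already-present new key wins over its legacy counterpart is an assumption.

```typescript
type Dict = Record<string, unknown>;

// Legacy flat key under `streaming` -> key under `streaming.providers.openai`.
const LEGACY_TO_OPENAI_KEY: Record<string, string> = {
  openaiApiKey: "apiKey",
  sttModel: "model",
  silenceDurationMs: "silenceDurationMs",
  vadThreshold: "vadThreshold",
};

// Returns a migrated copy; the input object is not mutated.
function migrateLegacyStreaming(streaming: Dict): Dict {
  const out: Dict = { ...streaming };

  // streaming.sttProvider -> streaming.provider
  if ("sttProvider" in out) {
    // Assumption: keep an existing new-style value if both are present.
    if (out["provider"] === undefined) out["provider"] = out["sttProvider"];
    delete out["sttProvider"];
  }

  const providers: Record<string, Dict> = {
    ...((out["providers"] as Record<string, Dict> | undefined) ?? {}),
  };
  const openai: Dict = { ...(providers["openai"] ?? {}) };
  let moved = false;

  for (const [legacyKey, newKey] of Object.entries(LEGACY_TO_OPENAI_KEY)) {
    if (legacyKey in out) {
      if (openai[newKey] === undefined) openai[newKey] = out[legacyKey];
      delete out[legacyKey];
      moved = true;
    }
  }

  // Only create `providers.openai` when a legacy key was actually moved.
  if (moved) {
    providers["openai"] = openai;
    out["providers"] = providers;
  }
  return out;
}
```

For example, passing `{ enabled: true, sttProvider: "openai", sttModel: "gpt-4o-transcribe" }` yields an object with `provider: "openai"` and `providers.openai.model` set, and with no legacy keys left.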
+
 ## Stale call reaper

 Use `staleCallReaperSeconds` to end calls that never receive a terminal webhook