mirror of
https://github.com/openclaw/openclaw.git
synced 2026-05-06 06:00:43 +00:00
docs: document Talk MLX config
@@ -1326,6 +1326,10 @@ Defaults for Talk mode (macOS/iOS/Android).
         outputFormat: "mp3_44100_128",
         apiKey: "elevenlabs_api_key",
       },
+      mlx: {
+        modelId: "mlx-community/Soprano-80M-bf16",
+      },
+      system: {},
     },
     silenceTimeoutMs: 1500,
     interruptOnSpeech: true,
@@ -1339,6 +1343,8 @@ Defaults for Talk mode (macOS/iOS/Android).
 - `providers.*.apiKey` accepts plaintext strings or SecretRef objects.
 - `ELEVENLABS_API_KEY` fallback applies only when no Talk API key is configured.
 - `providers.*.voiceAliases` lets Talk directives use friendly names.
+- `providers.mlx.modelId` selects the Hugging Face repo used by the macOS local MLX helper. If omitted, macOS uses `mlx-community/Soprano-80M-bf16`.
+- macOS MLX playback runs through the bundled `openclaw-mlx-tts` helper when present, or an executable on `PATH`; `OPENCLAW_MLX_TTS_BIN` overrides the helper path for development.
 - `silenceTimeoutMs` controls how long Talk mode waits after user silence before it sends the transcript. Unset keeps the platform default pause window (`700 ms on macOS and Android, 900 ms on iOS`).
 
 ---

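The helper lookup described in the notes above (bundled `openclaw-mlx-tts`, an executable on `PATH`, and the `OPENCLAW_MLX_TTS_BIN` override) can be sketched roughly as follows. This is an illustration only, not the project's implementation: the function name `resolveMlxHelper`, the `HelperLookup` shape, the injected `isExecutable` check, the assumption that the `PATH` executable is also named `openclaw-mlx-tts`, and the precedence of the override over the bundled helper are all assumptions.

```typescript
// Hypothetical sketch of the MLX helper lookup order; names are illustrative.
interface HelperLookup {
  env: Record<string, string | undefined>; // e.g. a copy of the process environment
  bundledPath?: string;                    // assumed location of the bundled helper
  pathDirs: string[];                      // directories taken from PATH, in order
  isExecutable: (path: string) => boolean; // injected so the sketch stays testable
}

const HELPER_NAME = "openclaw-mlx-tts";

function resolveMlxHelper(lookup: HelperLookup): string | undefined {
  // 1. Development override (assumed to win over everything else).
  const override = lookup.env["OPENCLAW_MLX_TTS_BIN"];
  if (override) return override;

  // 2. Bundled helper, when present.
  if (lookup.bundledPath && lookup.isExecutable(lookup.bundledPath)) {
    return lookup.bundledPath;
  }

  // 3. First matching executable on PATH (assumed to share the helper name).
  for (const dir of lookup.pathDirs) {
    const candidate = `${dir}/${HELPER_NAME}`;
    if (lookup.isExecutable(candidate)) return candidate;
  }

  // Nothing found: the caller decides how to handle a missing helper.
  return undefined;
}
```

Injecting `isExecutable` keeps the lookup free of filesystem access, so the ordering can be exercised without a real helper binary.
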
@@ -1,5 +1,5 @@
 ---
-summary: "Talk mode: continuous speech conversations with ElevenLabs TTS"
+summary: "Talk mode: continuous speech conversations with configured TTS providers"
 read_when:
   - Implementing Talk mode on macOS/iOS/Android
   - Changing voice/TTS/interrupt behavior
@@ -50,10 +50,19 @@ Supported keys:
 ```json5
 {
   talk: {
-    voiceId: "elevenlabs_voice_id",
-    modelId: "eleven_v3",
-    outputFormat: "mp3_44100_128",
-    apiKey: "elevenlabs_api_key",
+    provider: "elevenlabs",
+    providers: {
+      elevenlabs: {
+        voiceId: "elevenlabs_voice_id",
+        modelId: "eleven_v3",
+        outputFormat: "mp3_44100_128",
+        apiKey: "elevenlabs_api_key",
+      },
+      mlx: {
+        modelId: "mlx-community/Soprano-80M-bf16",
+      },
+      system: {},
+    },
     silenceTimeoutMs: 1500,
     interruptOnSpeech: true,
   },
@@ -64,9 +73,11 @@ Defaults:
 
 - `interruptOnSpeech`: true
 - `silenceTimeoutMs`: when unset, Talk keeps the platform default pause window before sending the transcript (`700 ms on macOS and Android, 900 ms on iOS`)
-- `voiceId`: falls back to `ELEVENLABS_VOICE_ID` / `SAG_VOICE_ID` (or first ElevenLabs voice when API key is available)
-- `modelId`: defaults to `eleven_v3` when unset
-- `apiKey`: falls back to `ELEVENLABS_API_KEY` (or gateway shell profile if available)
+- `provider`: selects the active Talk provider. Use `elevenlabs`, `mlx`, or `system` for the macOS-local playback paths.
+- `providers.<provider>.voiceId`: falls back to `ELEVENLABS_VOICE_ID` / `SAG_VOICE_ID` for ElevenLabs (or first ElevenLabs voice when API key is available).
+- `providers.elevenlabs.modelId`: defaults to `eleven_v3` when unset.
+- `providers.mlx.modelId`: defaults to `mlx-community/Soprano-80M-bf16` when unset.
+- `providers.elevenlabs.apiKey`: falls back to `ELEVENLABS_API_KEY` (or gateway shell profile if available).
 - `outputFormat`: defaults to `pcm_44100` on macOS/iOS and `pcm_24000` on Android (set `mp3_*` to force MP3 streaming)
 
 ## macOS UI

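As a rough illustration of the defaults listed in this hunk, the sketch below resolves the per-provider `modelId`, the ElevenLabs `apiKey` fallback, and the platform pause window. It is not the gateway's code: `resolveTalkDefaults`, the `TalkConfig` and `ResolvedTalk` types, and the `env` parameter are hypothetical, `apiKey` is simplified to a plain string (SecretRef objects and the gateway shell profile fallback are left out), and `provider` is required here because the diff does not say what happens when it is unset.

```typescript
// Hypothetical sketch of Talk default resolution; types and names are illustrative.
type TalkProvider = "elevenlabs" | "mlx" | "system";
type Platform = "macos" | "ios" | "android";

interface TalkConfig {
  provider: TalkProvider;
  providers?: {
    elevenlabs?: { modelId?: string; apiKey?: string };
    mlx?: { modelId?: string };
    system?: Record<string, never>;
  };
  silenceTimeoutMs?: number;
  interruptOnSpeech?: boolean;
}

interface ResolvedTalk {
  provider: TalkProvider;
  modelId: string | undefined; // undefined for the system provider
  apiKey: string | undefined;  // only meaningful for ElevenLabs
  silenceTimeoutMs: number;
  interruptOnSpeech: boolean;
}

// Default values taken from the documented defaults.
const DEFAULT_MODEL_ID = {
  elevenlabs: "eleven_v3",
  mlx: "mlx-community/Soprano-80M-bf16",
};
const DEFAULT_PAUSE_MS: Record<Platform, number> = { macos: 700, android: 700, ios: 900 };

function resolveTalkDefaults(
  config: TalkConfig,
  platform: Platform,
  env: Record<string, string | undefined> = {},
): ResolvedTalk {
  const providers = config.providers ?? {};
  let modelId: string | undefined;
  let apiKey: string | undefined;

  if (config.provider === "elevenlabs") {
    modelId = providers.elevenlabs?.modelId ?? DEFAULT_MODEL_ID.elevenlabs;
    // The env fallback applies only when no Talk API key is configured.
    apiKey = providers.elevenlabs?.apiKey ?? env["ELEVENLABS_API_KEY"];
  } else if (config.provider === "mlx") {
    modelId = providers.mlx?.modelId ?? DEFAULT_MODEL_ID.mlx;
  }

  return {
    provider: config.provider,
    modelId,
    apiKey,
    silenceTimeoutMs: config.silenceTimeoutMs ?? DEFAULT_PAUSE_MS[platform],
    interruptOnSpeech: config.interruptOnSpeech ?? true,
  };
}
```

For example, `resolveTalkDefaults({ provider: "mlx" }, "ios")` picks up the MLX default model and the 900 ms iOS pause window, while an explicit `silenceTimeoutMs: 1500` takes precedence on any platform.
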
@@ -85,6 +96,7 @@ Defaults:
 - Requires Speech + Microphone permissions.
 - Uses `chat.send` against session key `main`.
 - The gateway resolves Talk playback through `talk.speak` using the active Talk provider. Android falls back to local system TTS only when that RPC is unavailable.
+- macOS local MLX playback uses the bundled `openclaw-mlx-tts` helper when present, or an executable on `PATH`. Set `OPENCLAW_MLX_TTS_BIN` to point at a custom helper binary during development.
 - `stability` for `eleven_v3` is validated to `0.0`, `0.5`, or `1.0`; other models accept `0..1`.
 - `latency_tier` is validated to `0..4` when set.
 - Android supports `pcm_16000`, `pcm_22050`, `pcm_24000`, and `pcm_44100` output formats for low-latency AudioTrack streaming.

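The validation rules in the context lines above (`stability` for `eleven_v3`, `latency_tier` range, Android PCM formats) might look something like the sketch below. The function names are hypothetical, and treating `latency_tier` as an integer is an assumption; the source only gives the range `0..4`.

```typescript
// Hypothetical validators for the documented Talk directive constraints.

// eleven_v3 accepts only 0.0, 0.5, or 1.0; other models accept the range 0..1.
function isValidStability(modelId: string, stability: number): boolean {
  if (modelId === "eleven_v3") {
    return stability === 0 || stability === 0.5 || stability === 1;
  }
  return stability >= 0 && stability <= 1;
}

// latency_tier is optional; when set it must fall in 0..4.
// Assumption: only whole numbers are accepted.
function isValidLatencyTier(tier?: number): boolean {
  if (tier === undefined) return true;
  return Math.floor(tier) === tier && tier >= 0 && tier <= 4;
}

// PCM formats the notes list as supported for Android AudioTrack streaming.
const ANDROID_PCM_FORMATS = ["pcm_16000", "pcm_22050", "pcm_24000", "pcm_44100"];

function isAndroidPcmFormat(format: string): boolean {
  return ANDROID_PCM_FORMATS.indexOf(format) !== -1;
}
```

Under these assumptions, `isValidStability("eleven_v3", 0.3)` is rejected while the same value passes for any other model ID.
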