docs: document google meet elevenlabs voice setup

This commit is contained in:
Peter Steinberger
2026-05-04 07:16:39 +01:00
parent 02f455fda3
commit 70850d15ee
2 changed files with 60 additions and 10 deletions

View File

@@ -1131,6 +1131,50 @@ Optional overrides:
}
```
ElevenLabs for both agent-mode listening and speaking:
```json5
{
messages: {
tts: {
provider: "elevenlabs",
providers: {
elevenlabs: {
modelId: "eleven_v3",
voiceId: "pMsXgVXv3BLzUgSXRplE",
},
},
},
},
plugins: {
entries: {
"google-meet": {
config: {
realtime: {
transcriptionProvider: "elevenlabs",
providers: {
elevenlabs: {
modelId: "scribe_v2_realtime",
audioFormat: "ulaw_8000",
sampleRate: 8000,
commitStrategy: "vad",
},
},
},
},
},
},
},
}
```
The persistent Meet voice comes from
`messages.tts.providers.elevenlabs.voiceId`. Agent replies can also use
per-reply `[[tts:voiceId=... model=eleven_v3]]` directives when TTS model
overrides are enabled, but config is the deterministic default for meetings.
On join, the logs should show `transcriptionProvider=elevenlabs` and each
spoken reply should log `provider=elevenlabs model=eleven_v3 voice=<voiceId>`.
Twilio-only config:
```json5

View File

@@ -3,18 +3,18 @@ summary: "Use ElevenLabs speech, Scribe STT, and realtime transcription with Ope
read_when:
- You want ElevenLabs text-to-speech in OpenClaw
- You want ElevenLabs Scribe speech-to-text for audio attachments
- You want ElevenLabs realtime transcription for Voice Call
- You want ElevenLabs realtime transcription for Voice Call or Google Meet
title: "ElevenLabs"
---
OpenClaw uses ElevenLabs for text-to-speech, batch speech-to-text with Scribe
v2, and Voice Call streaming STT with Scribe v2 Realtime.
v2, and streaming STT with Scribe v2 Realtime.
| Capability | OpenClaw surface | Default |
| ------------------------ | --------------------------------------------- | ------------------------ |
| Text-to-speech | `messages.tts` / `talk` | `eleven_multilingual_v2` |
| Batch speech-to-text | `tools.media.audio` | `scribe_v2` |
| Streaming speech-to-text | Voice Call `streaming.provider: "elevenlabs"` | `scribe_v2_realtime` |
| Capability | OpenClaw surface | Default |
| ------------------------ | -------------------------------------------------------------------- | ------------------------ |
| Text-to-speech | `messages.tts` / `talk` | `eleven_multilingual_v2` |
| Batch speech-to-text | `tools.media.audio` | `scribe_v2` |
| Streaming speech-to-text | Voice Call streaming or Google Meet `realtime.transcriptionProvider` | `scribe_v2_realtime` |
## Authentication
@@ -66,10 +66,10 @@ Use Scribe v2 for inbound audio attachments and short recorded voice segments:
OpenClaw sends multipart audio to ElevenLabs `/v1/speech-to-text` with
`model_id: "scribe_v2"`. Language hints map to `language_code` when present.
## Voice Call streaming STT
## Streaming STT
The bundled `elevenlabs` plugin registers Scribe v2 Realtime for Voice Call
streaming transcription.
The bundled `elevenlabs` plugin registers Scribe v2 Realtime for Voice Call and
Google Meet agent-mode streaming transcription.
| Setting | Config path | Default |
| --------------- | ------------------------------------------------------------------------- | ------------------------------------------------- |
@@ -111,7 +111,13 @@ provider defaults to `ulaw_8000`, so telephony frames can be forwarded without
transcoding.
</Note>
For Google Meet agent mode, set
`plugins.entries.google-meet.config.realtime.transcriptionProvider` to
`"elevenlabs"` and configure the same provider block under
`plugins.entries.google-meet.config.realtime.providers.elevenlabs`.
## Related
- [Text-to-speech](/tools/tts)
- [Google Meet](/plugins/google-meet)
- [Model selection](/concepts/model-providers)