docs: document google meet elevenlabs voice setup

2026-05-06 05:50:43 +00:00 · 2026-05-04 07:16:39 +01:00
parent 02f455fda3
commit 70850d15ee
2 changed files with 60 additions and 10 deletions
--- a/docs/plugins/google-meet.md
+++ b/docs/plugins/google-meet.md
@@ -1131,6 +1131,50 @@ Optional overrides:
 }
 ```

+ElevenLabs for both agent-mode listening and speaking:
+
+```json5
+{
+  messages: {
+    tts: {
+      provider: "elevenlabs",
+      providers: {
+        elevenlabs: {
+          modelId: "eleven_v3",
+          voiceId: "pMsXgVXv3BLzUgSXRplE",
+        },
+      },
+    },
+  },
+  plugins: {
+    entries: {
+      "google-meet": {
+        config: {
+          realtime: {
+            transcriptionProvider: "elevenlabs",
+            providers: {
+              elevenlabs: {
+                modelId: "scribe_v2_realtime",
+                audioFormat: "ulaw_8000",
+                sampleRate: 8000,
+                commitStrategy: "vad",
+              },
+            },
+          },
+        },
+      },
+    },
+  },
+}
+```
+
+The persistent Meet voice comes from
+`messages.tts.providers.elevenlabs.voiceId`. Agent replies can also use
+per-reply `[[tts:voiceId=... model=eleven_v3]]` directives when TTS model
+overrides are enabled, but config is the deterministic default for meetings.
+On join, the logs should show `transcriptionProvider=elevenlabs` and each
+spoken reply should log `provider=elevenlabs model=eleven_v3 voice=<voiceId>`.
+
 Twilio-only config:

 ```json5
--- a/docs/providers/elevenlabs.md
+++ b/docs/providers/elevenlabs.md
@@ -3,18 +3,18 @@ summary: "Use ElevenLabs speech, Scribe STT, and realtime transcription with Ope
 read_when:
  - You want ElevenLabs text-to-speech in OpenClaw
  - You want ElevenLabs Scribe speech-to-text for audio attachments
-  - You want ElevenLabs realtime transcription for Voice Call
+  - You want ElevenLabs realtime transcription for Voice Call or Google Meet
 title: "ElevenLabs"
 ---

 OpenClaw uses ElevenLabs for text-to-speech, batch speech-to-text with Scribe
-v2, and Voice Call streaming STT with Scribe v2 Realtime.
+v2, and streaming STT with Scribe v2 Realtime.

-| Capability               | OpenClaw surface                              | Default                  |
-| ------------------------ | --------------------------------------------- | ------------------------ |
-| Text-to-speech           | `messages.tts` / `talk`                       | `eleven_multilingual_v2` |
-| Batch speech-to-text     | `tools.media.audio`                           | `scribe_v2`              |
-| Streaming speech-to-text | Voice Call `streaming.provider: "elevenlabs"` | `scribe_v2_realtime`     |
+| Capability               | OpenClaw surface                                                     | Default                  |
+| ------------------------ | -------------------------------------------------------------------- | ------------------------ |
+| Text-to-speech           | `messages.tts` / `talk`                                              | `eleven_multilingual_v2` |
+| Batch speech-to-text     | `tools.media.audio`                                                  | `scribe_v2`              |
+| Streaming speech-to-text | Voice Call streaming or Google Meet `realtime.transcriptionProvider` | `scribe_v2_realtime`     |

 ## Authentication

@@ -66,10 +66,10 @@ Use Scribe v2 for inbound audio attachments and short recorded voice segments:
 OpenClaw sends multipart audio to ElevenLabs `/v1/speech-to-text` with
 `model_id: "scribe_v2"`. Language hints map to `language_code` when present.

-## Voice Call streaming STT
+## Streaming STT

-The bundled `elevenlabs` plugin registers Scribe v2 Realtime for Voice Call
-streaming transcription.
+The bundled `elevenlabs` plugin registers Scribe v2 Realtime for Voice Call and
+Google Meet agent-mode streaming transcription.

 | Setting         | Config path                                                               | Default                                           |
 | --------------- | ------------------------------------------------------------------------- | ------------------------------------------------- |
@@ -111,7 +111,13 @@ provider defaults to `ulaw_8000`, so telephony frames can be forwarded without
 transcoding.
 </Note>

+For Google Meet agent mode, set
+`plugins.entries.google-meet.config.realtime.transcriptionProvider` to
+`"elevenlabs"` and configure the same provider block under
+`plugins.entries.google-meet.config.realtime.providers.elevenlabs`.
+
 ## Related

 - [Text-to-speech](/tools/tts)
+- [Google Meet](/plugins/google-meet)
 - [Model selection](/concepts/model-providers)
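
The Google Meet example added in this change pairs `audioFormat: "ulaw_8000"` with `sampleRate: 8000`. A minimal sketch of a pre-flight check that the two values agree is shown below. It assumes the first numeric group after an underscore in the format name is the sample rate in Hz. That is inferred from the example, not something the docs say OpenClaw enforces. The `RealtimeAudioSettings` interface and `checkRealtimeAudio` function are hypothetical names used for illustration and are not part of OpenClaw or the ElevenLabs SDK.

```typescript
// Hypothetical shape: only the two fields from the example that must agree.
interface RealtimeAudioSettings {
  audioFormat: string; // e.g. "ulaw_8000"
  sampleRate: number; // in Hz
}

// Returns null when the settings agree, otherwise a human-readable problem.
// Assumption: the first numeric group after an underscore is the rate in Hz.
function checkRealtimeAudio(settings: RealtimeAudioSettings): string | null {
  const match = /_(\d+)/.exec(settings.audioFormat);
  if (match === null) {
    return `cannot infer a sample rate from audioFormat "${settings.audioFormat}"`;
  }
  const impliedRate = Number(match[1]);
  if (impliedRate !== settings.sampleRate) {
    return `audioFormat "${settings.audioFormat}" implies ${impliedRate} Hz, but sampleRate is ${settings.sampleRate}`;
  }
  return null;
}
```

Calling `checkRealtimeAudio({ audioFormat: "ulaw_8000", sampleRate: 8000 })` on the values from the example returns `null`. Changing only one of the two fields returns a message describing the mismatch.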