diff --git a/docs/plugins/google-meet.md b/docs/plugins/google-meet.md index a1966b25d41..87a0c4e7985 100644 --- a/docs/plugins/google-meet.md +++ b/docs/plugins/google-meet.md @@ -169,7 +169,7 @@ Set config under `plugins.entries.google-meet.config`: realtime: { provider: "openai", model: "gpt-realtime", - instructions: "You are joining a private Google Meet as Peter's OpenClaw agent. Keep replies brief unless asked.", + instructions: "You are joining a private Google Meet as my OpenClaw agent. Keep replies brief unless asked.", toolPolicy: "safe-read-only", providers: { openai: { @@ -207,6 +207,29 @@ Agents can use the `google_meet` tool: Use `action: "status"` to list active sessions or inspect a session ID. Use `action: "leave"` to mark a session ended. +## Realtime agent consult + +Chrome realtime mode is optimized for a live voice loop. The realtime voice +provider hears the meeting audio and speaks through the configured audio bridge. +When the realtime model needs deeper reasoning, current information, or normal +OpenClaw tools, it can call `openclaw_agent_consult`. + +The consult tool runs the regular OpenClaw agent behind the scenes with recent +meeting transcript context and returns a concise spoken answer to the realtime +voice session. The voice model can then speak that answer back into the meeting. + +`realtime.toolPolicy` controls the consult run: + +- `safe-read-only`: expose the consult tool and limit the regular agent to + `read`, `web_search`, `web_fetch`, `x_search`, `memory_search`, and + `memory_get`. +- `owner`: expose the consult tool and let the regular agent use the normal + agent tool policy. +- `none`: do not expose the consult tool to the realtime voice model. + +The consult session key is scoped per Meet session, so follow-up consult calls +can reuse prior consult context during the same meeting. + ## Notes Google Meet's official media API is receive-oriented, so speaking into a Meet @@ -214,12 +237,6 @@ call still needs a participant path. This plugin keeps that boundary visible: Chrome handles browser participation and local audio routing; Twilio handles phone dial-in participation. -Realtime mode gives the voice model one tool, `openclaw_agent_consult`, unless -`realtime.toolPolicy` is `none`. The tool asks the normal OpenClaw agent for a -concise spoken answer, using recent meeting transcript as context. With -`safe-read-only`, the consult run is limited to read/search/memory tools. With -`owner`, it inherits the normal agent tool policy. - Chrome realtime mode needs either: - `chrome.audioInputCommand` plus `chrome.audioOutputCommand`: OpenClaw owns the @@ -231,3 +248,7 @@ Chrome realtime mode needs either: For clean duplex audio, route Meet output and Meet microphone through separate virtual devices or a Loopback-style virtual device graph. A single shared BlackHole device can echo other participants back into the call. + +`googlemeet leave` stops the command-pair realtime audio bridge for Chrome +sessions. For Twilio sessions delegated through the Voice Call plugin, it also +hangs up the underlying voice call.