feat(google-meet): add transcribe caption health

Peter Steinberger
2026-05-01 14:29:23 +01:00
parent f9b47ad2a1
commit f221bc85a0
9 changed files with 501 additions and 18 deletions

@@ -169,7 +169,12 @@ and will not talk back into the meeting. Chrome joins in this mode also avoid
OpenClaw's microphone/camera permission grant and avoid the Meet **Use
microphone** path. If Meet shows an audio-choice interstitial, automation tries
the no-microphone path and otherwise reports a manual action instead of opening
-the local microphone.
+the local microphone. In transcribe mode, managed Chrome transports also install
+a best-effort Meet caption observer. `googlemeet status --json` and
+`googlemeet doctor` surface `captioning`, `captionsEnabledAttempted`,
+`transcriptLines`, `lastCaptionAt`, `lastCaptionSpeaker`, `lastCaptionText`,
+and a short `recentTranscript` tail so operators can tell whether the browser
+joined the call and whether Meet captions are producing text.
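
As a rough illustration of how an operator script might read these fields, the sketch below classifies one status snapshot. It assumes the JSON from `googlemeet status --json` has already been parsed into a dict and that `inCall`, `captioning`, and `transcriptLines` appear as top-level keys. That layout, the function name `diagnose_captions`, and the returned labels are hypothetical and not part of OpenClaw.

```python
from typing import Any, Mapping


def diagnose_captions(status: Mapping[str, Any]) -> str:
    """Classify caption health from one parsed status snapshot.

    Assumption: the fields are top-level keys of the status object.
    The real payload may nest them differently.
    """
    # Without a joined call there is nothing to caption yet.
    if not status.get("inCall"):
        return "not-in-call"

    # Any recorded transcript line means Meet captions are producing text.
    lines = int(status.get("transcriptLines") or 0)
    if lines > 0:
        return "captions-flowing"

    # In the call, no text, and the observer does not report captioning.
    if not status.get("captioning"):
        return "captions-off"

    # Captioning is reported but no line has arrived, for example because
    # nobody has spoken since the observer was installed.
    return "no-captions-yet"
```

A script could feed this function the result of `json.loads` on the command's output. If the real payload nests these fields, the lookups would need adjusting.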
During realtime sessions, `google_meet` status includes browser and audio bridge
health such as `inCall`, `manualActionRequired`, `providerConnected`,
@@ -1294,9 +1299,15 @@ openclaw googlemeet doctor
```
Use `mode: "realtime"` for listen/talk-back. `mode: "transcribe"` intentionally
-does not start the duplex realtime voice bridge. `googlemeet test-speech`
-always checks the realtime path and reports whether bridge output bytes were
-observed for that invocation. If `speechOutputVerified` is false and
+does not start the duplex realtime voice bridge. For observe-only debugging,
+run `openclaw googlemeet status --json <session-id>` after participants speak
+and check `captioning`, `transcriptLines`, and `lastCaptionText`. If `inCall` is
+true but `transcriptLines` stays at `0`, Meet captions may be disabled, no one
+has spoken since the observer was installed, the Meet UI changed, or live
+captions are unavailable for the meeting language/account.
+`googlemeet test-speech` always checks the realtime path and reports whether
+bridge output bytes were observed for that invocation. If `speechOutputVerified` is false and
`speechOutputTimedOut` is true, the realtime provider may have accepted the
utterance but OpenClaw did not see new output bytes reach the Chrome audio
bridge.