fix(google-meet): use PCM audio for Chrome realtime

Peter Steinberger
2026-04-27 12:54:54 +01:00
parent 27a4bba90a
commit d73e2ee774
19 changed files with 395 additions and 59 deletions


@@ -1,4 +1,4 @@
5027142b42acd038bb3cd15e53a0d45293103448a3aee1072500352095e14242 config-baseline.json
33425d446eda183d3574ee754bb44e7e546ea33afa855fc979f94b1e102bf047 config-baseline.json
ecb702eee54bcb697916944440e13208ac7a640a8e07f44072bb79e9284ca994 config-baseline.core.json
07963db49502132f26db396c56b36e018b110e6c55a68b3cb012d3ec96f43901 config-baseline.channel.json
ed65cefbef96f034ce2b73069d9d5bacc341a43489ff9b20a34d40956b877f79 config-baseline.plugin.json
13d038300d90d4dd064aa2ac79def867799d1be403cf9d3e81dfad35ef459a21 config-baseline.plugin.json


@@ -336,7 +336,7 @@ Common failure checks:
The Chrome realtime default uses two external tools:
- `sox`: command-line audio utility. The plugin uses its `rec` and `play`
commands for the default 8 kHz G.711 mu-law audio bridge.
commands for the default 24 kHz PCM16 audio bridge.
- `blackhole-2ch`: macOS virtual audio driver. It creates the `BlackHole 2ch`
audio device that Chrome/Meet can route through.
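For illustration only, here is a minimal sketch of how a SoX `rec`/`play` command pair could be built for each audio format named in this change. The helper name `sox_command_pair`, the flag table, and the mono (`-c 1`) choice are assumptions made for this example; the plugin's actual default commands are not shown in this diff and may differ.

```python
# Hypothetical mapping from a chrome.audioFormat value to SoX raw-audio flags.
# Mono (-c 1) is an assumption; the plugin's real defaults are not shown here.
SOX_RAW_FLAGS = {
    "pcm16-24khz": ["-t", "raw", "-r", "24000", "-e", "signed-integer", "-b", "16", "-c", "1"],
    "g711-ulaw-8khz": ["-t", "raw", "-r", "8000", "-e", "mu-law", "-b", "8", "-c", "1"],
}


def sox_command_pair(audio_format):
    """Return (rec_argv, play_argv) for headerless audio in the given format."""
    flags = SOX_RAW_FLAGS[audio_format]
    # A file name of "-" means stdout for rec and stdin for play.
    rec_argv = ["rec", "-q", *flags, "-"]
    play_argv = ["play", "-q", *flags, "-"]
    return rec_argv, play_argv


if __name__ == "__main__":
    for name in SOX_RAW_FLAGS:
        rec_argv, play_argv = sox_command_pair(name)
        print(name, "|", " ".join(rec_argv), "|", " ".join(play_argv))
```

Both commands use the same format flags so that the bytes written by `rec` and the bytes read by `play` agree on sample rate, encoding, and width.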
@@ -887,10 +887,13 @@ Defaults:
opening duplicates
- `chrome.waitForInCallMs: 20000`: wait for the Meet tab to report in-call
before the realtime intro is triggered
- `chrome.audioInputCommand`: SoX `rec` command writing 8 kHz G.711 mu-law
audio to stdout
- `chrome.audioOutputCommand`: SoX `play` command reading 8 kHz G.711 mu-law
audio from stdin
- `chrome.audioFormat: "pcm16-24khz"`: command-pair audio format. Use
`"g711-ulaw-8khz"` only for legacy/custom command pairs that still emit
telephony audio.
- `chrome.audioInputCommand`: SoX `rec` command writing audio in
`chrome.audioFormat`
- `chrome.audioOutputCommand`: SoX `play` command reading audio in
`chrome.audioFormat`
- `realtime.provider: "openai"`
- `realtime.toolPolicy: "safe-read-only"`
- `realtime.instructions`: brief spoken replies, with
@@ -1313,8 +1316,9 @@ phone dial-in participation.
Chrome realtime mode needs either:
- `chrome.audioInputCommand` plus `chrome.audioOutputCommand`: OpenClaw owns the
realtime model bridge and pipes 8 kHz G.711 mu-law audio between those
commands and the selected realtime voice provider.
realtime model bridge and pipes audio in `chrome.audioFormat` between those
commands and the selected realtime voice provider. The default Chrome path is
24 kHz PCM16; 8 kHz G.711 mu-law remains available for legacy command pairs.
- `chrome.audioBridgeCommand`: an external bridge command owns the whole local
audio path and must exit after starting or validating its daemon.
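To make the difference between the two formats concrete, the following sketch computes how many bytes a fixed-duration chunk of audio occupies in each one. The 20 ms chunk length, the mono assumption, and the names `FORMATS` and `bytes_per_frame` are illustrative choices for this example, not values taken from the plugin.

```python
# Sample rate and sample width for the two formats named in this change.
# PCM16 uses 2 bytes per sample; G.711 mu-law uses 1 byte per sample.
FORMATS = {
    "pcm16-24khz": {"rate": 24000, "bytes_per_sample": 2},
    "g711-ulaw-8khz": {"rate": 8000, "bytes_per_sample": 1},
}


def bytes_per_frame(audio_format, frame_ms=20, channels=1):
    """Bytes needed for frame_ms of audio (frame length and mono are assumptions)."""
    spec = FORMATS[audio_format]
    samples = spec["rate"] * frame_ms // 1000
    return samples * spec["bytes_per_sample"] * channels


if __name__ == "__main__":
    for name in FORMATS:
        print(name, bytes_per_frame(name), "bytes per 20 ms")
```

Under these assumptions a 20 ms chunk is 960 bytes of 24 kHz PCM16 versus 160 bytes of 8 kHz G.711 mu-law. A command pair that emits one format while `chrome.audioFormat` declares the other would therefore be misread on both rate and sample width.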