fix(voice): reuse preflight transcripts across channels

Peter Steinberger
2026-04-26 05:29:24 +01:00
parent 46b9044c3f
commit 6a67f65568
30 changed files with 586 additions and 64 deletions

@@ -58,6 +58,10 @@ Video and music generation run as background tasks because provider processing t
Deepgram, ElevenLabs, Mistral, OpenAI, SenseAudio, and xAI can all transcribe
inbound audio through the batch `tools.media.audio` path when configured.
Channel plugins that preflight a voice note for mention gating or command
parsing mark the transcribed attachment on the inbound context, so the shared
media-understanding pass reuses that transcript instead of making a second STT
call for the same audio.
Deepgram, ElevenLabs, Mistral, OpenAI, and xAI also register Voice Call
streaming STT providers, so live phone audio can be forwarded to the selected
vendor without waiting for a completed recording.
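
The transcript reuse described in this hunk can be pictured with a minimal sketch. Every name here (`AudioAttachment`, `preflightTranscript`, `InboundContext`, `understandAudio`, `Transcriber`) is hypothetical and chosen for illustration; none of it is taken from the changed files, and the STT call is synchronous only to keep the example short.

```typescript
// Hypothetical shapes for illustration; not the project's real types.
interface AudioAttachment {
  id: string;
  // Assumed marker: set by a channel plugin that already transcribed this
  // audio during preflight (mention gating or command parsing).
  preflightTranscript?: string;
}

interface InboundContext {
  attachments: AudioAttachment[];
}

// Stand-in for a provider STT call.
type Transcriber = (attachment: AudioAttachment) => string;

// Returns one transcript per attachment id, calling STT only for attachments
// that carry no preflight transcript on the inbound context.
function understandAudio(
  ctx: InboundContext,
  transcribe: Transcriber,
): Record<string, string> {
  const transcripts: Record<string, string> = {};
  for (const attachment of ctx.attachments) {
    // `??` keeps an empty preflight transcript instead of re-transcribing.
    transcripts[attachment.id] =
      attachment.preflightTranscript ?? transcribe(attachment);
  }
  return transcripts;
}
```

In this sketch, a context holding one preflighted voice note and one untouched audio file triggers a single STT call, for the untouched file only.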