fix(whatsapp): send voice note text separately

This commit is contained in:
Peter Steinberger
2026-04-25 18:54:33 +01:00
parent 617e1dd6bf
commit 9ffe764416
9 changed files with 64 additions and 16 deletions

View File

@@ -365,7 +365,7 @@ When the linked self number is also present in `allowFrom`, WhatsApp self-chat s
- non-Ogg audio, including Microsoft Edge TTS MP3/WebM output, is transcoded to Ogg/Opus before PTT delivery
- native Ogg/Opus audio is sent with `audio/ogg; codecs=opus` for voice-note compatibility
- animated GIF playback is supported via `gifPlayback: true` on video sends
- captions are applied to the first media item when sending multi-media reply payloads
- captions are applied to the first media item when sending multi-media reply payloads, except PTT voice notes send the audio first and visible text separately because WhatsApp clients do not render voice-note captions consistently
- media source can be HTTP(S), `file://`, or local paths
</Accordion>

View File

@@ -664,6 +664,8 @@ reply delivery. When the channel is Feishu, Matrix, Telegram, or WhatsApp,
the audio is delivered as a voice message rather than a file attachment.
Feishu can transcode non-Opus TTS output on this path when `ffmpeg` is
available.
WhatsApp sends visible text separately from PTT voice-note audio because clients
do not consistently render captions on voice notes.
It accepts optional `channel` and `timeoutMs` fields; `timeoutMs` is a
per-call provider request timeout in milliseconds.