fix(tts): restore 3.28 schema compatibility and fallback observability (#57953)

* fix(tts): restore legacy config compatibility and fallback observability

* fix(tts): surface fallback attempts in status and telephony

* test(tts): cover /tts audio to /tts status fallback flow

* docs(tts): align migration and fallback observability guidance

* TTS: redact fallback logs and scope legacy plugin migration

* Infra: dedupe UV_EXTRA_INDEX_URL in host env policy

* Docs: scope doctor TTS migration to voice-call

* voice-call: restore strict known TTS provider validation
This commit is contained in:
Josh Avant
2026-03-30 22:05:03 -05:00
committed by GitHub
parent 697dddbeb6
commit c918ab4faf
19 changed files with 838 additions and 154 deletions

View File

@@ -122,6 +122,10 @@ Current migrations:
- `routing.agents`/`routing.defaultAgentId``agents.list` + `agents.list[].default`
- `routing.agentToAgent``tools.agentToAgent`
- `routing.transcribeAudio``tools.media.audio.models`
- `messages.tts.<provider>` (`openai`/`elevenlabs`/`microsoft`/`edge`) → `messages.tts.providers.<provider>`
- `channels.discord.voice.tts.<provider>` (`openai`/`elevenlabs`/`microsoft`/`edge`) → `channels.discord.voice.tts.providers.<provider>`
- `channels.discord.accounts.<id>.voice.tts.<provider>` (`openai`/`elevenlabs`/`microsoft`/`edge`) → `channels.discord.accounts.<id>.voice.tts.providers.<provider>`
- `plugins.entries.voice-call.config.tts.<provider>` (`openai`/`elevenlabs`/`microsoft`/`edge`) → `plugins.entries.voice-call.config.tts.providers.<provider>`
- `bindings[].match.accountID``bindings[].match.accountId`
- For channels with named `accounts` but missing `accounts.default`, move account-scoped top-level single-account channel values into `channels.<channel>.accounts.default` when present
- `identity``agents.list[].identity`

View File

@@ -219,9 +219,11 @@ streaming speech on calls. You can override it under the plugin config with the
{
tts: {
provider: "elevenlabs",
elevenlabs: {
voiceId: "pMsXgVXv3BLzUgSXRplE",
modelId: "eleven_multilingual_v2",
providers: {
elevenlabs: {
voiceId: "pMsXgVXv3BLzUgSXRplE",
modelId: "eleven_multilingual_v2",
},
},
},
}
@@ -229,9 +231,11 @@ streaming speech on calls. You can override it under the plugin config with the
Notes:
- Legacy `tts.<provider>` keys inside plugin config (`openai`, `elevenlabs`, `microsoft`, `edge`) are auto-migrated to `tts.providers.<provider>` on load. Prefer the `providers` shape in committed config.
- **Microsoft speech is ignored for voice calls** (telephony audio needs PCM; the current Microsoft transport does not expose telephony PCM output).
- Core TTS is used when Twilio media streaming is enabled; otherwise calls fall back to provider native voices.
- If a Twilio media stream is already active, Voice Call does not fall back to TwiML `<Say>`. If telephony TTS is unavailable in that state, the playback request fails instead of mixing two playback paths.
- When telephony TTS falls back to a secondary provider, Voice Call logs a warning with the provider chain (`from`, `to`, `attempts`) for debugging.
### More examples
@@ -242,7 +246,9 @@ Use core TTS only (no override):
messages: {
tts: {
provider: "openai",
openai: { voice: "alloy" },
providers: {
openai: { voice: "alloy" },
},
},
},
}
@@ -258,10 +264,12 @@ Override to ElevenLabs just for calls (keep core default elsewhere):
config: {
tts: {
provider: "elevenlabs",
elevenlabs: {
apiKey: "elevenlabs_key",
voiceId: "pMsXgVXv3BLzUgSXRplE",
modelId: "eleven_multilingual_v2",
providers: {
elevenlabs: {
apiKey: "elevenlabs_key",
voiceId: "pMsXgVXv3BLzUgSXRplE",
modelId: "eleven_multilingual_v2",
},
},
},
},
@@ -280,9 +288,11 @@ Override only the OpenAI model for calls (deepmerge example):
"voice-call": {
config: {
tts: {
openai: {
model: "gpt-4o-mini-tts",
voice: "marin",
providers: {
openai: {
model: "gpt-4o-mini-tts",
voice: "marin",
},
},
},
},

View File

@@ -219,6 +219,7 @@ Then run:
- `modelOverrides`: allow the model to emit TTS directives (on by default).
- `allowProvider` defaults to `false` (provider switching is opt-in).
- `providers.<id>`: provider-owned settings keyed by speech provider id.
- Legacy direct provider blocks (`messages.tts.openai`, `messages.tts.elevenlabs`, `messages.tts.microsoft`, `messages.tts.edge`) are auto-migrated to `messages.tts.providers.<id>` on load.
- `maxTextLength`: hard cap for TTS input (chars). `/tts audio` fails if exceeded.
- `timeoutMs`: request timeout (ms).
- `prefsPath`: override the local prefs JSON path (provider/limit/summary).
@@ -391,6 +392,9 @@ Notes:
- `off|always|inbound|tagged` are persession toggles (`/tts on` is an alias for `/tts always`).
- `limit` and `summary` are stored in local prefs, not the main config.
- `/tts audio` generates a one-off audio reply (does not toggle TTS on).
- `/tts status` includes fallback visibility for the latest attempt:
- success fallback: `Fallback: <primary> -> <used>` plus `Attempts: ...`
- failure: `Error: ...` plus `Attempts: ...`
## Agent tool

View File

@@ -219,6 +219,7 @@ Then run:
- `modelOverrides`: allow the model to emit TTS directives (on by default).
- `allowProvider` defaults to `false` (provider switching is opt-in).
- `providers.<id>`: provider-owned settings keyed by speech provider id.
- Legacy direct provider blocks (`messages.tts.openai`, `messages.tts.elevenlabs`, `messages.tts.microsoft`, `messages.tts.edge`) are auto-migrated to `messages.tts.providers.<id>` on load.
- `maxTextLength`: hard cap for TTS input (chars). `/tts audio` fails if exceeded.
- `timeoutMs`: request timeout (ms).
- `prefsPath`: override the local prefs JSON path (provider/limit/summary).
@@ -391,6 +392,9 @@ Notes:
- `off|always|inbound|tagged` are persession toggles (`/tts on` is an alias for `/tts always`).
- `limit` and `summary` are stored in local prefs, not the main config.
- `/tts audio` generates a one-off audio reply (does not toggle TTS on).
- `/tts status` includes fallback visibility for the latest attempt:
- success fallback: `Fallback: <primary> -> <used>` plus `Attempts: ...`
- failure: `Error: ...` plus `Attempts: ...`
## Agent tool