Mirror of https://github.com/openclaw/openclaw.git, synced 2026-05-06 13:00:44 +00:00
fix(minimax): normalize tts pitch for api
@@ -255,12 +255,17 @@ The bundled `minimax` plugin registers MiniMax T2A v2 as a speech provider for
- Voice-note targets such as Feishu and Telegram are transcoded from MiniMax
  MP3 to 48kHz Opus with `ffmpeg`, because the Feishu/Lark file API only
  accepts `file_type: "opus"` for native audio messages.
- MiniMax T2A accepts fractional `speed` and `vol`, but `pitch` is sent as an
  integer; OpenClaw truncates fractional `pitch` values before the API request.

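The voice-note transcode described in the first bullet could be invoked roughly as follows. This is a minimal sketch, not OpenClaw's actual code: the helper name `buildOpusTranscodeArgs` and the file paths are hypothetical, and it assumes an `ffmpeg` build with the `libopus` encoder enabled. It only builds the argument list; spawning the process is left to the caller.

```typescript
// Hypothetical helper (not from the OpenClaw source): builds ffmpeg arguments
// that transcode a MiniMax MP3 file into 48 kHz Opus for voice-note targets.
function buildOpusTranscodeArgs(inputMp3: string, outputOpus: string): string[] {
  return [
    "-y",              // overwrite the output file if it already exists
    "-i", inputMp3,    // MP3 audio returned by MiniMax T2A
    "-c:a", "libopus", // encode the audio stream with the Opus codec
    "-ar", "48000",    // resample to 48 kHz
    outputOpus,        // output path, for example one ending in ".opus"
  ];
}

// Example argument list for placeholder paths.
const transcodeArgs = buildOpusTranscodeArgs("speech.mp3", "speech.opus");
```

The resulting array can be passed to a process spawner such as Node's `child_process.spawn("ffmpeg", transcodeArgs)`.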
| Setting                                  | Env var                | Default                       | Description                      |
| ---------------------------------------- | ---------------------- | ----------------------------- | -------------------------------- |
| `messages.tts.providers.minimax.baseUrl` | `MINIMAX_API_HOST`     | `https://api.minimax.io`      | MiniMax T2A API host.            |
| `messages.tts.providers.minimax.model`   | `MINIMAX_TTS_MODEL`    | `speech-2.8-hd`               | TTS model id.                    |
| `messages.tts.providers.minimax.voiceId` | `MINIMAX_TTS_VOICE_ID` | `English_expressive_narrator` | Voice id used for speech output. |
| `messages.tts.providers.minimax.speed`   |                        | `1.0`                         | Playback speed, `0.5..2.0`.      |
| `messages.tts.providers.minimax.vol`     |                        | `1.0`                         | Volume, `(0, 10]`.               |
| `messages.tts.providers.minimax.pitch`   |                        | `0`                           | Integer pitch shift, `-12..12`.  |

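To illustrate how the settings, env vars, and defaults in the table relate, here is a minimal sketch of a resolver. The function name `resolveMinimaxTts` and the precedence order (explicit config, then env var, then default) are assumptions made for illustration; only the setting names, env var names, and default values come from the table.

```typescript
// Defaults copied from the settings table above.
const MINIMAX_TTS_DEFAULTS = {
  baseUrl: "https://api.minimax.io",
  model: "speech-2.8-hd",
  voiceId: "English_expressive_narrator",
  speed: 1.0,
  vol: 1.0,
  pitch: 0,
};

type MinimaxTtsSettings = typeof MINIMAX_TTS_DEFAULTS;

// Hypothetical resolver: explicit config wins, then the env var (for the
// settings that have one), then the default. This precedence is an assumption.
function resolveMinimaxTts(
  config: Partial<MinimaxTtsSettings>,
  env: Record<string, string | undefined>,
): MinimaxTtsSettings {
  const d = MINIMAX_TTS_DEFAULTS;
  return {
    baseUrl: config.baseUrl ?? env["MINIMAX_API_HOST"] ?? d.baseUrl,
    model: config.model ?? env["MINIMAX_TTS_MODEL"] ?? d.model,
    voiceId: config.voiceId ?? env["MINIMAX_TTS_VOICE_ID"] ?? d.voiceId,
    // speed, vol, and pitch have no env var in the table.
    speed: config.speed ?? d.speed,
    vol: config.vol ?? d.vol,
    pitch: config.pitch ?? d.pitch,
  };
}
```

For example, `resolveMinimaxTts({}, { MINIMAX_TTS_MODEL: "custom-model" })` keeps every default except `model`, where `"custom-model"` is a placeholder id.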
### Music generation
@@ -374,7 +374,7 @@ Then run:
 - `providers.minimax.voiceId`: voice identifier (default `English_expressive_narrator`, env: `MINIMAX_TTS_VOICE_ID`).
 - `providers.minimax.speed`: playback speed `0.5..2.0` (default 1.0).
 - `providers.minimax.vol`: volume `(0, 10]` (default 1.0; must be greater than 0).
-- `providers.minimax.pitch`: pitch shift `-12..12` (default 0).
+- `providers.minimax.pitch`: integer pitch shift `-12..12` (default 0). Fractional values are truncated before calling MiniMax T2A because the API rejects non-integer pitch values.
 - `providers.google.model`: Gemini TTS model (default `gemini-3.1-flash-tts-preview`).
 - `providers.google.voiceName`: Gemini prebuilt voice name (default `Kore`; `voice` is also accepted).
 - `providers.google.baseUrl`: override the Gemini API base URL. Only `https://generativelanguage.googleapis.com` is accepted.

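The pitch truncation and the documented ranges for `speed`, `vol`, and `pitch` can be sketched as below. This is an illustration under stated assumptions, not the code from this commit: the function names are hypothetical, and checking the pitch range after truncation (rather than before) is a guess about ordering.

```typescript
// Hypothetical helper: MiniMax T2A expects an integer pitch, so the fractional
// part is dropped. Math.trunc rounds toward zero: 3.9 -> 3 and -3.9 -> -3.
function normalizeMinimaxPitch(pitch: number): number {
  return Math.trunc(pitch);
}

// Hypothetical range check based on the documented limits:
// speed in 0.5..2.0, vol in (0, 10], pitch in -12..12.
function checkMinimaxVoiceSettings(s: {
  speed: number;
  vol: number;
  pitch: number;
}): string[] {
  const errors: string[] = [];
  if (!(s.speed >= 0.5 && s.speed <= 2.0)) {
    errors.push("speed must be within 0.5..2.0");
  }
  if (!(s.vol > 0 && s.vol <= 10)) {
    errors.push("vol must be greater than 0 and at most 10");
  }
  // Assumption: the range is checked on the truncated value.
  const pitch = normalizeMinimaxPitch(s.pitch);
  if (!(pitch >= -12 && pitch <= 12)) {
    errors.push("pitch must be within -12..12");
  }
  return errors;
}
```

Note that truncation differs from `Math.floor` for negative input: `Math.floor(-3.9)` yields `-4`, while `Math.trunc(-3.9)` yields `-3`.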
@@ -432,7 +432,7 @@ Available directive keys (when enabled):
 - `model` (OpenAI TTS model, ElevenLabs model id, or MiniMax model) or `google_model` (Google TTS model)
 - `stability`, `similarityBoost`, `style`, `speed`, `useSpeakerBoost`
 - `vol` / `volume` (MiniMax volume, 0-10)
-- `pitch` (MiniMax pitch, -12 to 12)
+- `pitch` (MiniMax integer pitch, -12 to 12; fractional values are truncated before the MiniMax request)
 - `applyTextNormalization` (`auto|on|off`)
 - `languageCode` (ISO 639-1)
 - `seed`