feat(tts): add xiaomi mimo speech provider

2026-05-06 14:30:45 +00:00 · 2026-04-25 09:47:52 +01:00
parent e10f20032a
commit ec8dbc4595
10 changed files with 789 additions and 10 deletions
--- a/docs/providers/xiaomi.md
+++ b/docs/providers/xiaomi.md
@@ -53,6 +53,46 @@ OpenAI-compatible endpoint with API-key authentication.
 The default model ref is `xiaomi/mimo-v2-flash`. The provider is injected automatically when `XIAOMI_API_KEY` is set or an auth profile exists.
 </Tip>

+## Text-to-speech
+
+The bundled `xiaomi` plugin also registers Xiaomi MiMo as a speech provider for
+`messages.tts`. It calls Xiaomi's chat-completions TTS contract with the text as
+an `assistant` message and optional style guidance as a `user` message.
+
+| Property | Value                                    |
+| -------- | ---------------------------------------- |
+| TTS id   | `xiaomi` (`mimo` alias)                  |
+| Auth     | `XIAOMI_API_KEY`                         |
+| API      | `POST /v1/chat/completions` with `audio` |
+| Default  | `mimo-v2.5-tts`, voice `mimo_default`    |
+| Output   | MP3 by default; WAV when configured      |
+
+```json5
+{
+  messages: {
+    tts: {
+      auto: "always",
+      provider: "xiaomi",
+      providers: {
+        xiaomi: {
+          apiKey: "xiaomi_api_key",
+          model: "mimo-v2.5-tts",
+          voice: "mimo_default",
+          format: "mp3",
+          style: "Bright, natural, conversational tone.",
+        },
+      },
+    },
+  },
+}
+```
+
+Supported built-in voices include `mimo_default`, `default_zh`, `default_en`,
+`Mia`, `Chloe`, `Milo`, and `Dean`. `mimo-v2-tts` is supported for older MiMo
+TTS accounts; the default uses the current MiMo-V2.5 TTS model. For voice-note
+targets such as Feishu and Telegram, OpenClaw transcodes Xiaomi output to 48 kHz
+Opus with `ffmpeg` before delivery.
+
 ## Config example

 ```json5