feat(tts): add per-agent voice overrides

2026-05-06 11:00:42 +00:00 · 2026-04-26 02:45:45 +01:00
parent 1bc9bada65
commit 0ca952cdd5
31 changed files with 605 additions and 34 deletions
--- a/docs/.generated/config-baseline.sha256
+++ b/docs/.generated/config-baseline.sha256
@@ -1,4 +1,4 @@
-211e9d4cdb309e7fe0c1ed91d060201240a9287f8c5cb3c893aba3f904a20d30  config-baseline.json
-ffda2d2911adc03148a368f3b40b17cbdcb7af0066bccdc555e8d596cdea8cda  config-baseline.core.json
+3efb041739877bd5387ffc87e0ddd11be43d80d38e7779407ce8091dcb797e5e  config-baseline.json
+5c6e35c5846f654d717d4b20853649e0b45a746423834f539b2a2223abcd5226  config-baseline.core.json
 7cd9c908f066c143eab2a201efbc9640f483ab28bba92ddeca1d18cc2b528bc3  config-baseline.channel.json
-9e131d7734f8b9cc9e7f8af6cc6b6dc81c9971dc551fadbe66fb0d682173f32d  config-baseline.plugin.json
+a5479c182ec987bb21e814b8a4e7b3bda7190ae5c2b35fd5ca403dfa48afa115  config-baseline.plugin.json
--- a/docs/.generated/plugin-sdk-api-baseline.sha256
+++ b/docs/.generated/plugin-sdk-api-baseline.sha256
@@ -1,2 +1,2 @@
-c911117176b41eebf26470618274a7e093910e9b36855bc045bc8a92f6856745  plugin-sdk-api-baseline.json
-ff360635f95beb217b9dd207a87eaf331319a7671aea03acfe05911756741b21  plugin-sdk-api-baseline.jsonl
+6eb33044c2a4726f1aeb2d18052643c38c8bf5244bb970f969b1583365063e8b  plugin-sdk-api-baseline.json
+06e70516047f98d78963c238f1671feb3eea7c7e559c6fa84f403b9562028bb2  plugin-sdk-api-baseline.jsonl
--- a/docs/gateway/config-agents.md
+++ b/docs/gateway/config-agents.md
@@ -915,6 +915,11 @@ scripts/sandbox-browser-setup.sh   # optional browser image
        fastModeDefault: false, // per-agent fast mode override
        embeddedHarness: { runtime: "auto", fallback: "pi" },
        params: { cacheRetention: "none" }, // overrides matching defaults.models params by key
+        tts: {
+          providers: {
+            elevenlabs: { voiceId: "EXAVITQu4vr4xnSDxMaL" },
+          },
+        },
        skills: ["docs-search"], // replaces agents.defaults.skills when set
        identity: {
          name: "Samantha",
@@ -950,6 +955,7 @@ scripts/sandbox-browser-setup.sh   # optional browser image
 - `default`: when multiple are set, first wins (warning logged). If none set, first list entry is default.
 - `model`: string form overrides `primary` only; object form `{ primary, fallbacks }` overrides both (`[]` disables global fallbacks). Cron jobs that only override `primary` still inherit default fallbacks unless you set `fallbacks: []`.
 - `params`: per-agent stream params merged over the selected model entry in `agents.defaults.models`. Use this for agent-specific overrides like `cacheRetention`, `temperature`, or `maxTokens` without duplicating the whole model catalog.
+- `tts`: optional per-agent text-to-speech overrides. The block deep-merges over `messages.tts`, so keep shared provider credentials and fallback policy in `messages.tts` and set only persona-specific values such as provider, voice, model, style, or auto mode here.
 - `skills`: optional per-agent skill allowlist. If omitted, the agent inherits `agents.defaults.skills` when set; an explicit list replaces defaults instead of merging, and `[]` means no skills.
 - `thinkingDefault`: optional per-agent default thinking level (`off | minimal | low | medium | high | xhigh | adaptive | max`). Overrides `agents.defaults.thinkingDefault` for this agent when no per-message or session override is set. The selected provider/model profile controls which values are valid; for Google Gemini, `adaptive` keeps provider-owned dynamic thinking (`thinkingLevel` omitted on Gemini 3/3.1, `thinkingBudget: -1` on Gemini 2.5).
 - `reasoningDefault`: optional per-agent default reasoning visibility (`on | off | stream`). Applies when no per-message or session reasoning override is set.
--- a/docs/reference/secretref-credential-surface.md
+++ b/docs/reference/secretref-credential-surface.md
@@ -35,6 +35,7 @@ Scope intent:
 - `models.providers.*.request.tls.passphrase`
 - `skills.entries.*.apiKey`
 - `agents.defaults.memorySearch.remote.apiKey`
+- `agents.list[].tts.providers.*.apiKey`
 - `agents.list[].memorySearch.remote.apiKey`
 - `talk.providers.*.apiKey`
 - `messages.tts.providers.*.apiKey`
--- a/docs/reference/secretref-user-supplied-credentials-matrix.json
+++ b/docs/reference/secretref-user-supplied-credentials-matrix.json
@@ -29,6 +29,13 @@
      "secretShape": "secret_input",
      "optIn": true
    },
+    {
+      "id": "agents.list[].tts.providers.*.apiKey",
+      "configFile": "openclaw.json",
+      "path": "agents.list[].tts.providers.*.apiKey",
+      "secretShape": "secret_input",
+      "optIn": true
+    },
    {
      "id": "auth-profiles.api_key.key",
      "configFile": "auth-profiles.json",
--- a/docs/tools/tts.md
+++ b/docs/tools/tts.md
@@ -109,6 +109,50 @@ Full schema is in [Gateway configuration](/gateway/configuration).
 }
 ```

+### Per-agent voice overrides
+
+Use `agents.list[].tts` when one agent should speak with a different provider,
+voice, model, style, or auto-TTS mode. The agent block deep-merges over
+`messages.tts`, so provider credentials can stay in the global provider config.
+
+```json5
+{
+  messages: {
+    tts: {
+      auto: "always",
+      provider: "elevenlabs",
+      providers: {
+        elevenlabs: {
+          apiKey: "${ELEVENLABS_API_KEY}",
+          model: "eleven_multilingual_v2",
+        },
+      },
+    },
+  },
+  agents: {
+    list: [
+      {
+        id: "reader",
+        tts: {
+          providers: {
+            elevenlabs: {
+              voiceId: "EXAVITQu4vr4xnSDxMaL",
+            },
+          },
+        },
+      },
+    ],
+  },
+}
+```
+
+Precedence for automatic replies, from lowest to highest (later entries override earlier ones):
+
+1. `messages.tts`
+2. active `agents.list[].tts`
+3. local `/tts` preferences for this host
+4. inline `[[tts:...]]` directives when model overrides are enabled
+
 ### OpenAI primary with ElevenLabs fallback

 ```json5
@@ -702,7 +746,8 @@ Stored fields:
 - `maxLength` (summary threshold; default 1500 chars)
 - `summarize` (default `true`)

-These override `messages.tts.*` for that host.
+For that host, these override the effective config, that is, `messages.tts`
+merged with the active `agents.list[].tts` block.

 ## Output formats (fixed)
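
The `docs/tools/tts.md` hunk above says the `agents.list[].tts` block deep-merges over `messages.tts`. The following is a minimal sketch of that behavior and not the gateway's actual implementation. The names `TtsConfig`, `deepMerge`, and `resolveEffectiveTts` are hypothetical. It also assumes that arrays and scalars are replaced rather than merged, and that layers are applied lowest precedence first.

```typescript
// Hypothetical loose shape; the real schema lives in the gateway configuration.
type TtsConfig = { [key: string]: unknown };

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Recursively merge `override` over `base`. Nested plain objects merge key by
// key; scalars and arrays from the override replace the base value (assumption).
function deepMerge(base: TtsConfig, override: TtsConfig | undefined): TtsConfig {
  const result: TtsConfig = { ...base };
  if (!override) return result;
  for (const [key, value] of Object.entries(override)) {
    if (value === undefined) continue; // an unset key never erases a base value
    const existing = result[key];
    result[key] =
      isPlainObject(existing) && isPlainObject(value)
        ? deepMerge(existing, value)
        : value;
  }
  return result;
}

// Layers are passed lowest precedence first, so later layers win.
function resolveEffectiveTts(...layers: Array<TtsConfig | undefined>): TtsConfig {
  return layers.reduce<TtsConfig>((acc, layer) => deepMerge(acc, layer), {});
}

// Values copied from the documented example.
const globalTts: TtsConfig = {
  auto: "always",
  provider: "elevenlabs",
  providers: {
    elevenlabs: { apiKey: "${ELEVENLABS_API_KEY}", model: "eleven_multilingual_v2" },
  },
};
const agentTts: TtsConfig = {
  providers: { elevenlabs: { voiceId: "EXAVITQu4vr4xnSDxMaL" } },
};

const effective = resolveEffectiveTts(globalTts, agentTts);
console.log(JSON.stringify(effective, null, 2));
```

In this sketch the merged `elevenlabs` entry keeps `apiKey` and `model` from the global block and gains the agent's `voiceId`, which is why credentials can stay in `messages.tts`. Whether local `/tts` preferences and inline directives are applied with the same merge is not stated in the diff; passing them as further layers would be an additional assumption.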