openclaw

mirror of https://github.com/openclaw/openclaw.git synced 2026-05-06 08:40:44 +00:00

Author	SHA1	Message	Date
Peter Steinberger	24853ced11	docs: outline unified talk API	2026-05-06 02:39:15 +01:00
Peter Steinberger	2dfa4b082a	docs: sync docs with source truth	2026-05-02 21:45:03 +01:00
Peter Steinberger	c02605253d	fix: require explicit TTS intent	2026-05-02 03:16:57 +01:00
Peter Steinberger	5e3265b09b	feat: support openai tts extra body	2026-05-01 22:57:35 +01:00
Peter Steinberger	0294aebe6f	feat(providers): add DeepInfra provider plugin (#73038) * feat(providers): add DeepInfra provider plugin * feat(deepinfra): add media provider surfaces * fix(deepinfra): satisfy provider boundary checks * docs: add gitcrawl maintainer skill * test: include deepinfra in live media sweeps * fix: remove stale tts contract import	2026-04-28 01:12:54 +01:00
Peter Steinberger	d419fb561d	feat(tts): resolve channel account config generically	2026-04-26 08:10:36 +01:00
Peter Steinberger	d613c8e29b	refactor(tts): resolve voice delivery from channel capabilities	2026-04-26 07:03:25 +01:00
Vincent Koc	724e92505a	docs(tts): add sidebarTitle 'Text to speech (TTS)' for the nav Default sidebar label fell back to title 'Text-to-speech', which is fine on the page header but readers scanning the Tools sidebar look for the acronym 'TTS'. Add a sidebarTitle so Mintlify renders 'Text to speech (TTS)' in the sidebar while keeping the canonical page title intact. Sentence case matches the rest of the Tools sidebar group (e.g. 'Image generation', 'Music generation', 'Video generation').	2026-04-25 22:11:31 -07:00
Vincent Koc	fbd6b3ce3c	docs(tts): A-Z order providers and add tools/tts to Tools nav group - docs/tools/tts.md: alphabetize providers in three places that listed them: the supported-providers table (Azure Speech ... Xiaomi MiMo), the configuration Tabs (12 provider presets in A-Z), and the field reference AccordionGroup. Top-level fields stay first; provider tabs/accordions follow strict alphabetical order. Wording, schema, and defaults unchanged. - docs/docs.json: add tools/tts to the main Tools sidebar group (slotted between trajectory and video-generation, matching the alphabetical neighborhood with image-generation, music-generation, video-generation). Previously tts only appeared under Nodes > Media capabilities, which was a discoverability gap for readers looking for TTS alongside the other generation tools.	2026-04-25 22:05:46 -07:00
Vincent Koc	71b79f49ad	docs(tts): rewrite tts.md around personas with Mintlify components The TTS doc had grown to 1008 lines with 11 separate flat 'X primary' config blocks, a 100-line dense 'Notes on fields' bullet list, and the new provider-personas feature (#70748) buried near the bottom. Restructure for readability and feature visibility: - Lead with a Steps-based 'Quick start' so first-time readers can enable TTS in 4 explicit steps. - Replace the 13-bullet provider list with a single 'Supported providers' table that names auth env vars and per-provider notes inline. Add a Warning callout for the Microsoft/edge legacy alias. - Collapse the 11 'X primary' config blocks into one Tabs component ('OpenAI + ElevenLabs', 'Google Gemini', 'Azure Speech', 'Microsoft (no key)', 'MiniMax', 'Inworld', 'xAI', 'Volcengine', 'Xiaomi MiMo', 'OpenRouter', 'Gradium', 'Local CLI') so users see one preset at a time and the page is scannable. - Promote 'Personas' to its own top-level section with two examples (minimal and the Alfred provider-neutral persona), and add a new 'How providers use persona prompts' AccordionGroup covering Google (promptTemplate audio-profile-v1, personaPrompt), OpenAI (instructions auto-mapping), and Other providers, plus a fallback policy table. - Note that agents.list[].tts.persona overrides global persona per-agent (covers the recent feat(tts) per-agent voice-override work). - Convert the 100-line 'Notes on fields' wall into a per-provider AccordionGroup using ParamField, so the field reference is scannable and field types/defaults are visually distinct. - Sentence-case headings, drop redundant body H1, fold the flow diagram inline with Auto-TTS behavior, and refresh the Output formats section to a table-first layout. - Schema fields (label/description/provider/fallbackPolicy/prompt with profile/scene/sampleContext/style/accent/pacing/constraints and providers map) verified against src/config/types.tts.ts; all defaults and env-var fallbacks preserved verbatim. Net diff: 585 insertions, 684 deletions across the same surface area.	2026-04-25 22:00:19 -07:00
Barron Roth	0594fa3c4d	TTS: add provider personas	2026-04-26 09:42:38 +05:30
Peter Steinberger	2c8c79de5c	fix(tts): normalize streamed tts voice media	2026-04-26 04:28:19 +01:00
Peter Steinberger	a91baa16de	fix(tts): honor explicit directive providers	2026-04-26 04:14:48 +01:00
Peter Steinberger	cf834e2a21	fix(tts): clean streamed directive text	2026-04-26 04:09:56 +01:00
Peter Steinberger	7a85c1a822	fix(tts): surface voice status and harden providers	2026-04-26 03:51:30 +01:00
Peter Steinberger	97ae1c7c2e	feat(tts): add read-latest voice command	2026-04-26 03:44:44 +01:00
Peter Steinberger	6855b33255	docs(tts): clarify WhatsApp voice-note delivery	2026-04-26 03:28:51 +01:00
Peter Steinberger	9b91040053	fix(tts): route WhatsApp MP3 TTS as voice notes	2026-04-26 03:26:00 +01:00
Peter Steinberger	9b4f0779ce	fix(tts): honor per-agent config in tts commands	2026-04-26 03:12:30 +01:00
Peter Steinberger	0ca952cdd5	feat(tts): add per-agent voice overrides	2026-04-26 02:54:13 +01:00
Peter Steinberger	5b80d0c15e	feat(tts): add Azure Speech provider Co-authored-by: Leon Chui <84605354+leonchui@users.noreply.github.com>	2026-04-26 01:42:51 +01:00
Rui Xu	1531123d35	feat(tts): add BytePlus Seed Speech provider Add Volcengine/BytePlus Seed Speech as a bundled TTS provider with current API-key auth, legacy AppID/token fallback, native Ogg/Opus voice-note output, and MP3 audio-file output. Co-authored-by: Peter Steinberger <steipete@gmail.com>	2026-04-25 23:46:04 +01:00
Cale Shapera	0bcb4c95c1	feat(tts): add Inworld speech provider (#55972) Adds the bundled Inworld speech provider with docs, config surface, SSRF-guarded fetches, directive overrides, native voice-note/telephony output coverage, and live `.profile` verification. Co-authored-by: cshape <cshape@users.noreply.github.com>	2026-04-25 22:33:21 +01:00
Peter Steinberger	e2fd3dcee9	fix(google): emit opus voice-note tts	2026-04-25 21:33:33 +01:00
Peter Steinberger	9ffe764416	fix(whatsapp): send voice note text separately	2026-04-25 18:55:03 +01:00
Peter Steinberger	b511250e5c	feat(media): add voice conversion and speech plugins	2026-04-25 12:12:33 +01:00
Peter Steinberger	a7604f8170	fix(minimax): support token plan tts auth	2026-04-25 10:36:12 +01:00
Peter Steinberger	ec8dbc4595	feat(tts): add xiaomi mimo speech provider	2026-04-25 09:48:05 +01:00
Peter Steinberger	b0c55eb659	fix(feishu): transcode voice TTS audio	2026-04-25 09:26:42 +01:00
Peter Steinberger	8acc92c881	feat(google): support Gemini TTS style profile	2026-04-25 06:11:23 +01:00
Peter Steinberger	e31aef7e19	fix(tts): migrate legacy edge config in doctor	2026-04-25 05:55:54 +01:00
Peter Steinberger	c03e5b3c3a	docs(tts): clarify legacy provider migration	2026-04-25 05:01:09 +01:00
Peter Steinberger	978a50a3c5	fix(minimax): normalize tts pitch for api	2026-04-25 04:58:20 +01:00
Peter Steinberger	225ff9a866	fix(minimax): transcode voice-note tts to opus	2026-04-25 04:52:25 +01:00
Peter Steinberger	7875092f4d	feat(openrouter): add tts provider	2026-04-25 04:36:49 +01:00
Laurent Mazare	d7e2939791	feat: add Gradium text-to-speech provider (#64958) Adds the Gradium bundled plugin with TTS and speech-provider registration, docs, label routing, and focused/live coverage. Also carries the current main lint cleanup needed for the rebased CI lane. Co-authored-by: laurent <laurent.mazare@gmail.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-24 18:43:53 +01:00
Peter Steinberger	f0a7a85e7a	feat(agents): add generation tool timeouts	2026-04-24 00:05:38 +01:00
Vincent Koc	789e71cdb8	docs: remove H1 on pages where frontmatter + summary already cover the parenthetical	2026-04-23 15:47:48 -07:00
Vincent Koc	6667f66fd8	docs(tools): add Related sections and unify See also to Related	2026-04-23 15:41:56 -07:00
Vincent Koc	2777b089b5	docs: normalize frontmatter titles to sentence case	2026-04-23 13:15:17 -07:00
KateWilkins	f342da5fcc	feat: add xai media providers Add xAI image generation and text-to-speech provider support with docs, live tests, and guarded provider HTTP handling. Thanks @KateWilkins.	2026-04-23 00:07:39 +01:00
Barron Roth	bf59917cd1	fix: add Google Gemini TTS provider (#67515) (thanks @barronlroth) * Add Google Gemini TTS provider * Remove committed planning artifact * Explain Google media provider type shape * google: distill Gemini TTS provider * fix: add Google Gemini TTS provider (#67515) (thanks @barronlroth) * fix: honor cfg-backed Google TTS selection (#67515) (thanks @barronlroth) * fix: narrow Google TTS directive aliases (#67515) (thanks @barronlroth) --------- Co-authored-by: Ayaan Zaidi <hi@obviy.us>	2026-04-16 11:54:35 +05:30
Marcus Castro	403783a3b1	fix(tts): correct tagged TTS syntax guidance (#65573)	2026-04-12 19:41:13 -03:00
Gustavo Madeira Santana	17a2290f49	Docs: refresh schema, slash commands, and TTS refs	2026-04-08 01:10:00 -04:00
gnuduncan	e934211170	fix(minimax): use global TTS endpoint default and add missing Talk Mode overrides Switch DEFAULT_MINIMAX_TTS_BASE_URL from api.minimaxi.com (CN) to api.minimax.io (global) so international API keys work out of the box. Add vol and pitch to resolveTalkOverrides for parity with resolveTalkConfig. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 09:19:45 +01:00
gnuduncan	7d7f5d85b4	feat(minimax): add native TTS speech provider (T2A v2) Add MiniMax as a fourth TTS provider alongside OpenAI, ElevenLabs, and Microsoft. Registers a SpeechProviderPlugin in the existing minimax extension with config resolution, directive parsing, and Talk Mode support. Hex-encoded audio response from the T2A v2 API is decoded to MP3. Closes #52720 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 09:19:45 +01:00
Josh Avant	44674525f2	feat(tts): add structured provider diagnostics and fallback attempt analytics (#57954) * feat(tts): add structured fallback diagnostics and attempt analytics * docs(tts): document attempt-detail and provider error diagnostics * TTS: harden fallback loops and share error helpers * TTS: bound provider error-body reads * tts: add double-prefix regression test and clean baseline drift * tests(tts): satisfy error narrowing in double-prefix regression * changelog Signed-off-by: joshavant <830519+joshavant@users.noreply.github.com> --------- Signed-off-by: joshavant <830519+joshavant@users.noreply.github.com>	2026-03-30 22:55:28 -05:00
Josh Avant	c918ab4faf	fix(tts): restore 3.28 schema compatibility and fallback observability (#57953) * fix(tts): restore legacy config compatibility and fallback observability * fix(tts): surface fallback attempts in status and telephony * test(tts): cover /tts audio to /tts status fallback flow * docs(tts): align migration and fallback observability guidance * TTS: redact fallback logs and scope legacy plugin migration * Infra: dedupe UV_EXTRA_INDEX_URL in host env policy * Docs: scope doctor TTS migration to voice-call * voice-call: restore strict known TTS provider validation	2026-03-30 22:05:03 -05:00
Peter Steinberger	01bcbcf8d5	refactor: require legacy config migration on read	2026-03-26 23:23:47 +00:00
Jealous	2c3cf4f387	chore(tts): rename VOICE_BUBBLE identifiers to OPUS and update docs	2026-03-25 10:49:21 +05:30

52 Commits