openclaw

mirror of https://github.com/openclaw/openclaw.git synced 2026-06-05 07:52:54 +00:00

Author	SHA1	Message	Date
giming	f94512cd7f	fix(xiaomi): support MiMo voicedesign TTS Adds Xiaomi MiMo voicedesign TTS support by registering the v2.5 voicedesign model and omitting audio.voice for that model's prompt-driven voice design flow. Also accepts generic TTS aliases modelId, speakerVoice, and speakerVoiceId for Xiaomi provider config and request overrides. Fixes exec timeout classification so a process that exits after a missed timeout callback is still reported as timed out, using monotonic deadlines to avoid wall-clock skew. Verification: - node scripts/run-vitest.mjs extensions/xiaomi/speech-provider.test.ts - node scripts/run-vitest.mjs src/process/supervisor/supervisor.test.ts - node scripts/run-vitest.mjs src/agents/bash-tools.exec-foreground-failures.test.ts - git diff --check - autoreview --mode local - live Xiaomi MiMo voicedesign call returned wav RIFF/WAVE output, 169004 bytes - GitHub CI success on `fb3018ef31`: CI 26708919072, CodeQL Critical Quality 26708919082, CodeQL 26708919091, OpenGrep PR Diff 26708919089, Workflow Sanity 26708919083, Dependency Guard 26708918574, Real behavior proof 26708921767 Thanks @GimingRao. Co-authored-by: Raoyu <2425198313@qq.com> Co-authored-by: giming <53329020+GimingRao@users.noreply.github.com>	2026-05-31 10:34:51 +01:00
scotthuang	7920af0c9e	refactor: route browser screenshot vision through shared media understanding * feat(browser): add optional vision understanding to screenshot tool * fix(browser): wrap vision output as external content, enforce maxBytes, forward auth profiles * fix(browser): remove no-op scope/attachments config, drop profile pass-through lacking runtime support * feat(media-understanding): add profile/preferredProfile to DescribeImageFileWithModelParams and forward to describeImage * style(browser): add curly braces to satisfy eslint curly rule * fix(browser): correct tools.browser.enabled help text to match actual behavior * fix(browser): thread agentDir/workspaceDir from plugin tool context into browser vision * refactor(browser): move vision config from tools.browser to browser.models The browser plugin's vision configuration now lives on the top-level `browser` config namespace (browser.models, browser.visionEnabled, browser.visionPrompt, etc.) instead of `tools.browser`. This aligns with the plugin's existing config location and avoids confusion between tool-level and plugin-level settings. - Remove tools.browser from ToolsSchema and ToolsConfig - Add models/vision* fields to BrowserConfig and its zod schema - Update getBrowserVisionConfig to read from cfg.browser - Update schema help, labels, and quality test - Update vision.test.ts to use new config shape * docs(browser): add screenshot vision configuration section Document the new browser.models config for automatic screenshot description via vision models, enabling text-only main models to reason about web page content. * fix(browser): remove deliverable media markers from vision result, drop unused import P1: Vision-success path no longer exposes the raw screenshot as deliverable media (removes MEDIA: line and details.media.mediaUrl). This prevents channel delivery from auto-sending sensitive page content when the intended output is a text description. 
P2: Remove unused ToolsMediaUnderstandingSchema import that would fail noUnusedLocals typecheck. * fix(browser): add command/args fields to browser models schema The browser vision model schema uses .strict(), so CLI-type entries with command/args were rejected by TypeScript. Add these fields to align with MediaUnderstandingModelSchema. * chore(browser): remove debug console.log statements * fix(browser): harden screenshot vision result against MEDIA: directive injection and restore image sanitization on failure fallback ClawSweeper #84247 review round 2: P1 (security, high): neutralize line-start MEDIA: directives in vision descriptions before wrapping with wrapExternalContent. The agent media extractor scans every browser tool-result text block via splitMediaFromOutput which treats line-start MEDIA: as a trusted local-media delivery directive, and browser is on the trusted-media allowlist. Without neutralization, page or vision-provider output containing 'MEDIA:/tmp/secret.png' could synthesize a channel-deliverable media artifact from untrusted content. wrapExternalContent itself does not strip line-start directives. Introduce neutralizeMediaDirectives in vision.ts that prepends '[neutralized] ' to any line whose trimStart() begins with MEDIA: (case-insensitive), defanging the parser anchor while keeping the original text human-readable. P2 (compatibility): pass resolveRuntimeImageSanitization() to imageResultFromFile in the vision-failure catch fallback. The non-vision screenshot path already forwards this option (`d5cc0d53b7`) so configured agents.defaults.imageMaxDimensionPx takes effect. Without this fix, any provider timeout/error silently bypasses the sanitization guard and returns a raw full-resolution screenshot. Regression coverage: - vision.test.ts: 6 unit cases for neutralizeMediaDirectives (no-op fast path, mid-line MEDIA: untouched, line-start defanged, leading-whitespace defanged, case-insensitive, multiple directives per blob). 
- browser-tool.test.ts: 2 integration cases that drive the full screenshot tool execute path: - 'neutralizes MEDIA: directives in vision text and does not attach media' asserts no line matches /^\sMEDIA:/i in returned text, secret path text is preserved verbatim, details.media is absent, and imageResultFromFile is not called on the success path. - 'preserves screenshot image sanitization on vision failure fallback' mocks describeImageFileWithModel to reject and asserts the fallback imageResultFromFile call receives imageSanitization: {maxDimensionPx:1600} plus the 'browser screenshot vision failed' extraText. fix(browser): apply clawsweeper fallback media fix from PR #84247 * refactor: reuse media image understanding for browser screenshots * refactor: use structured media delivery * test: update music completion media instruction expectation * fix: trim buffered reply directive padding * test: refresh codex prompt snapshots for message media aliases --------- Co-authored-by: scotthuang <scotthuang@tencent.com> Co-authored-by: Peter Steinberger <steipete@gmail.com>	2026-05-31 00:00:19 +01:00
Vincent Koc	27b15a19e8	refactor(voice): catalog voice models through providers (#87794) * refactor(providers): catalog voice models * feat(tts): route speech through voice models * refactor(tts): rename speaker selection fields * refactor(tts): mark default speech models * test(tts): type migrated speaker config assertions * refactor(providers): avoid catalog merge map spread * fix(tts): honor voice model fallbacks * refactor(tts): move speech core into package * chore(tts): register speech core knip workspace * fix(tts): show migrated speaker voice in status * fix(tts): satisfy speech core lint * fix(tts): preserve explicit model aliases * test(tts): narrow provider config assertion * test(doctor): allow slow commitments repair check --------- Co-authored-by: Peter Steinberger <steipete@gmail.com>	2026-05-29 04:46:45 +01:00
fuller-stack-dev	65471a2da6	feat: add xai oauth web search and provider timeouts	2026-05-22 08:49:53 +01:00
Peter Steinberger	694ca50e97	Revert "refactor: move runtime state to SQLite" This reverts commit `f91de52f0d`.	2026-05-13 13:33:38 +01:00
Peter Steinberger	f91de52f0d	refactor: move runtime state to SQLite * refactor: remove stale file-backed shims * fix: harden sqlite state ci boundaries * refactor: store matrix idb snapshots in sqlite * fix: satisfy rebased CI guardrails * refactor: store current conversation bindings in sqlite table * refactor: store tui last sessions in sqlite table * refactor: reset sqlite schema history * refactor: drop unshipped sqlite table migration * refactor: remove plugin index file rollback * refactor: drop unshipped sqlite sidecar migrations * refactor: remove runtime commitments kv migration * refactor: preserve kysely sync result types * refactor: drop unshipped sqlite schema migration table * test: keep session usage coverage sqlite-backed * refactor: keep sqlite migration doctor-only * refactor: isolate device legacy imports * refactor: isolate push voicewake legacy imports * refactor: isolate remaining runtime legacy imports * refactor: tighten sqlite migration guardrails * test: cover sqlite persisted enum parsing * refactor: isolate legacy update and tui imports * refactor: tighten sqlite state ownership * refactor: move legacy imports behind doctor * refactor: remove legacy session row lookup * refactor: canonicalize memory transcript locators * refactor: drop transcript path scope fallbacks * refactor: drop runtime legacy session delivery pruning * refactor: store tts prefs only in sqlite * refactor: remove cron store path runtime * refactor: use cron sqlite store keys * refactor: rename telegram message cache scope * refactor: read memory dreaming status from sqlite * refactor: rename cron status store key * refactor: stop remembering transcript file paths * test: use sqlite locators in agent fixtures * refactor: remove file-shaped commitments and cron store surfaces * refactor: keep compaction transcript handles out of session rows * refactor: derive transcript handles from session identity * refactor: derive runtime transcript handles * refactor: remove 
gateway session locator reads * refactor: remove transcript locator from session rows * refactor: store raw stream diagnostics in sqlite * refactor: remove file-shaped transcript rotation * refactor: hide legacy trajectory paths from runtime * refactor: remove runtime transcript file bridges * refactor: repair database-first rebase fallout * refactor: align tests with database-first state * refactor: remove transcript file handoffs * refactor: sync post-compaction memory by transcript scope * refactor: run codex app-server sessions by id * refactor: bind codex runtime state by session id * refactor: pass memory transcripts by sqlite scope * refactor: remove transcript locator cleanup leftovers * test: remove stale transcript file fixtures * refactor: remove transcript locator test helper * test: make cron sqlite keys explicit * test: remove cron runtime store paths * test: remove stale session file fixtures * test: use sqlite cron keys in diagnostics * refactor: remove runtime delivery queue backfill * test: drop fake export session file mocks * refactor: rename acp session read failure flag * refactor: rename acp row session key * refactor: remove session store test seams * refactor: move legacy session parser tests to doctor * refactor: reindex managed memory in place * refactor: drop stale session store wording * refactor: rename session row helpers * refactor: rename sqlite session entry modules * refactor: remove transcript locator leftovers * refactor: trim file-era audit wording * refactor: clean managed media through sqlite * fix: prefer explicit agent for exports * fix: use prepared agent for session resets * fix: canonicalize legacy codex binding import * test: rename state cleanup helper * docs: align backup docs with sqlite state * refactor: drop legacy Pi usage auth fallback * refactor: move legacy auth profile imports to doctor * refactor: keep Pi model discovery auth in memory * refactor: remove MSTeams legacy learning key fallback * refactor: store 
model catalog config in sqlite * refactor: use sqlite model catalog at runtime * refactor: remove model json compatibility aliases * refactor: store auth profiles in sqlite * refactor: seed copied auth profiles in sqlite * refactor: make auth profile runtime sqlite-addressed * refactor: migrate hermes secrets into sqlite auth store * refactor: move plugin install config migration to doctor * refactor: rename plugin index audit checks * test: drop auth file assumptions * test: remove legacy transcript file assertions * refactor: drop legacy cli session aliases * refactor: store skill uploads in sqlite * refactor: keep subagent attachments in sqlite vfs * refactor: drop subagent attachment cleanup state * refactor: move legacy session aliases to doctor * refactor: require node 24 for sqlite state runtime * refactor: move provider caches into sqlite state * fix: harden virtual agent filesystem * refactor: enforce database-first runtime state * refactor: rename compaction transcript rotation setting * test: clean sqlite refactor test types * refactor: consolidate sqlite runtime state * refactor: model session conversations in sqlite * refactor: stop deriving cron delivery from session keys * refactor: stop classifying sessions from key shape * refactor: hydrate announce targets from typed delivery * refactor: route heartbeat delivery from typed sqlite context * refactor: tighten typed sqlite session routing * refactor: remove session origin routing shadow * refactor: drop session origin shadow fixtures * perf: query sqlite vfs paths by prefix * refactor: use typed conversation metadata for sessions * refactor: prefer typed session routing metadata * refactor: require typed session routing metadata * refactor: resolve group tool policy from typed sessions * refactor: delete dead session thread info bridge * Show Codex subscription reset times in channel errors (#80456) * feat(plugin-sdk): consolidate session workflow APIs * fix(agents): allow read-only agent mount reads 
* [codex] refresh plugin regression fixtures * fix(agents): restore compaction gateway logs * test: tighten gateway startup assertions * Redact persisted secret-shaped payloads [AI] (#79006) * test: tighten device pair notify assertions * test: tighten hermes secret assertions * test: assert matrix client error shapes * test: assert config compat warnings * fix(heartbeat): remap cron-run exec events to session keys (#80214) * fix(codex): route btw through native side threads * fix(auth): accept friendly OpenAI order for Codex profiles * fix(codex): rotate auth profiles inside harness * fix: keep browser status page probe within timeout * test: assert agents add outputs * test: pin cron read status * fix(agents): avoid Pi resource discovery stalls Co-authored-by: dataCenter430 <titan032000@gmail.com> * fix: retire timed-out codex app-server clients * test: tighten qa lab runtime assertions * test: check security fix outputs * test: verify extension runtime messages * feat(wake): expose typed sessionKey on wake protocol + system event CLI * fix(gateway): await session_end during shutdown drain and track channel + compaction lifecycle paths (#57790) * test: guard talk consult call helper * fix(codex): scale context engine projection (#80761) * fix(codex): scale context engine projection * fix: document Codex context projection scaling * fix: document Codex context projection scaling * fix: document Codex context projection scaling * fix: document Codex context projection scaling * chore: align Codex projection changelog * chore: realign Codex projection changelog * fix: isolate Codex projection patch --------- Co-authored-by: Eva (agent) <eva+agent-78055@100yen.org> Co-authored-by: Josh Lehman <josh@martian.engineering> * refactor: move agent runtime state toward piless * refactor: remove cron session reaper * refactor: move session management to sqlite * refactor: finish database-first state migration * chore: refresh generated sqlite db types * refactor: remove 
stale file-backed shims * test: harden kysely type coverage # Conflicts: # .agents/skills/kysely-database-access/SKILL.md # src/infra/kysely-sync.types.test.ts # src/proxy-capture/store.sqlite.test.ts # src/state/openclaw-agent-db.test.ts # src/state/openclaw-state-db.test.ts * refactor: remove cron store path runtime * refactor: keep compaction transcript handles out of session rows * refactor: derive embedded transcripts from sqlite identity * refactor: remove embedded transcript locator handoff * refactor: remove runtime transcript file bridges * refactor: remove transcript file handoffs * refactor: remove MSTeams legacy learning key fallback * refactor: store model catalog config in sqlite * refactor: use sqlite model catalog at runtime # Conflicts: # docs/cli/secrets.md # docs/gateway/authentication.md # docs/gateway/secrets.md * fix: keep oauth sibling sync sqlite-local # Conflicts: # src/commands/onboard-auth.test.ts * refactor: remove task session store maintenance # Conflicts: # src/commands/tasks.ts * refactor: keep diagnostics in state sqlite * refactor: enforce database-first runtime state * refactor: consolidate sqlite runtime state * Show Codex subscription reset times in channel errors (#80456) * fix(codex): refresh subscription limit resets * fix(codex): format reset times for channels * Update CHANGELOG with latest changes and fixes Updated CHANGELOG with recent fixes and improvements. 
* fix(codex): keep command load failures on codex surface * fix(codex): format account rate limits as rows * fix(codex): summarize account limits as usage status * fix(codex): simplify account limit status * test: tighten subagent announce queue assertion * test: tighten session delete lifecycle assertions * test: tighten cron ops assertions * fix: track cron execution milestones * test: tighten hermes secret assertions * test: assert matrix sync store payloads * test: assert config compat warnings * fix(codex): align btw side thread semantics * fix(codex): honor codex fallback blocking * fix(agents): avoid Pi resource discovery stalls * test: tighten codex event assertions * test: tighten cron assertions * Fix Codex app-server OAuth harness auth * refactor: move agent runtime state toward piless * refactor: move device and push state to sqlite * refactor: move runtime json state imports to doctor * refactor: finish database-first state migration * chore: refresh generated sqlite db types * refactor: clarify cron sqlite store keys * refactor: remove stale file-backed shims * refactor: bind codex runtime state by session id * test: expect sqlite trajectory branch export * refactor: rename session row helpers * fix: keep legacy device identity import in doctor * refactor: enforce database-first runtime state * refactor: consolidate sqlite runtime state * build: align pi contract wrappers * chore: repair database-first rebase * refactor: remove session file test contracts * test: update gateway session expectations * refactor: stop routing from session compatibility shadows * refactor: stop persisting session route shadows * refactor: use typed delivery context in clients * refactor: stop echoing session route shadows * refactor: repair embedded runner rebase imports # Conflicts: # src/agents/pi-embedded-runner/run/attempt.tool-call-argument-repair.ts * refactor: align pi contract imports * refactor: satisfy kysely sync helper guard * refactor: remove file transcript 
bridge remnants * refactor: remove session locator compatibility * refactor: remove session file test contracts * refactor: keep rebase database-first clean * refactor: remove session file assumptions from e2e * docs: clarify database-first goal state * test: remove legacy store markers from sqlite runtime tests * refactor: remove legacy store assumptions from runtime seams * refactor: align sqlite runtime helper seams * test: update memory recall sqlite audit mock * refactor: align database-first runtime type seams * test: clarify doctor cron legacy store names * fix: preserve sqlite session route projections * test: fix copilot token cache test syntax * docs: update database-first proof status * test: align database-first test fixtures * docs: update database-first proof status * refactor: clean extension database-first drift * test: align agent session route proof * test: clarify doctor legacy path fixtures * chore: clean database-first changed checks * chore: repair database-first rebase markers * build: allow baileys git subdependency * chore: repair exp-vfs rebase drift * chore: finish exp-vfs rebase cleanup * chore: satisfy rebase lint drift * chore: fix qqbot rebase type seam * chore: fix rebase drift leftovers * fix: keep auth profile oauth secrets out of sqlite * fix: repair rebase drift tests * test: stabilize pairing request ordering * test: use source manifests in plugin contract checks * fix: restore gateway session metadata after rebase * fix: repair database-first rebase drift * fix: clean up database-first rebase fallout * test: stabilize line quick reply receipt time * fix: repair extension rebase drift * test: keep transcript redaction tests sqlite-backed * fix: carry injected transcript redaction through sqlite * chore: clean database branch rebase residue * fix: repair database branch CI drift * fix: repair database branch CI guard drift * fix: stabilize oauth tls preflight test * test: align database branch fast guards * test: repair build 
artifact boundary guards * chore: clean changelog rebase markers --------- Co-authored-by: pashpashpash <nik@vault77.ai> Co-authored-by: Eva <eva@100yen.org> Co-authored-by: stainlu <stainlu@newtype-ai.org> Co-authored-by: Jason Zhou <jason.zhou.design@gmail.com> Co-authored-by: Ruben Cuevas <hi@rubencu.com> Co-authored-by: Pavan Kumar Gondhi <pavangondhi@gmail.com> Co-authored-by: Shakker <shakkerdroid@gmail.com> Co-authored-by: Kaspre <36520309+Kaspre@users.noreply.github.com> Co-authored-by: dataCenter430 <titan032000@gmail.com> Co-authored-by: Kaspre <kaspre@gmail.com> Co-authored-by: pandadev66 <nova.full.stack@outlook.com> Co-authored-by: Eva <admin@100yen.org> Co-authored-by: Eva (agent) <eva+agent-78055@100yen.org> Co-authored-by: Josh Lehman <josh@martian.engineering> Co-authored-by: jeffjhunter <support@aipersonamethod.com>	2026-05-13 13:15:12 +01:00
Vincent Koc	91ed1604b0	docs(imessage): make imsg the supported setup path	2026-05-07 12:53:01 -07:00
Peter Steinberger	5aefe6abd6	feat: stream elevenlabs tts into discord voice	2026-05-07 06:47:31 +01:00
Peter Steinberger	24853ced11	docs: outline unified talk API	2026-05-06 02:39:15 +01:00
Peter Steinberger	2dfa4b082a	docs: sync docs with source truth	2026-05-02 21:45:03 +01:00
Peter Steinberger	c02605253d	fix: require explicit TTS intent	2026-05-02 03:16:57 +01:00
Peter Steinberger	5e3265b09b	feat: support openai tts extra body	2026-05-01 22:57:35 +01:00
Peter Steinberger	0294aebe6f	feat(providers): add DeepInfra provider plugin (#73038) * feat(providers): add DeepInfra provider plugin * feat(deepinfra): add media provider surfaces * fix(deepinfra): satisfy provider boundary checks * docs: add gitcrawl maintainer skill * test: include deepinfra in live media sweeps * fix: remove stale tts contract import	2026-04-28 01:12:54 +01:00
Peter Steinberger	d419fb561d	feat(tts): resolve channel account config generically	2026-04-26 08:10:36 +01:00
Peter Steinberger	d613c8e29b	refactor(tts): resolve voice delivery from channel capabilities	2026-04-26 07:03:25 +01:00
Vincent Koc	724e92505a	docs(tts): add sidebarTitle 'Text to speech (TTS)' for the nav Default sidebar label fell back to title 'Text-to-speech', which is fine on the page header but readers scanning the Tools sidebar look for the acronym 'TTS'. Add a sidebarTitle so Mintlify renders 'Text to speech (TTS)' in the sidebar while keeping the canonical page title intact. Sentence case matches the rest of the Tools sidebar group (e.g. 'Image generation', 'Music generation', 'Video generation').	2026-04-25 22:11:31 -07:00
Vincent Koc	fbd6b3ce3c	docs(tts): A-Z order providers and add tools/tts to Tools nav group - docs/tools/tts.md: alphabetize providers in three places that listed them: the supported-providers table (Azure Speech ... Xiaomi MiMo), the configuration Tabs (12 provider presets in A-Z), and the field reference AccordionGroup. Top-level fields stay first; provider tabs/accordions follow strict alphabetical order. Wording, schema, and defaults unchanged. - docs/docs.json: add tools/tts to the main Tools sidebar group (slotted between trajectory and video-generation, matching the alphabetical neighborhood with image-generation, music-generation, video-generation). Previously tts only appeared under Nodes > Media capabilities, which was a discoverability gap for readers looking for TTS alongside the other generation tools.	2026-04-25 22:05:46 -07:00
Vincent Koc	71b79f49ad	docs(tts): rewrite tts.md around personas with Mintlify components The TTS doc had grown to 1008 lines with 11 separate flat 'X primary' config blocks, a 100-line dense 'Notes on fields' bullet list, and the new provider-personas feature (#70748) buried near the bottom. Restructure for readability and feature visibility: - Lead with a Steps-based 'Quick start' so first-time readers can enable TTS in 4 explicit steps. - Replace the 13-bullet provider list with a single 'Supported providers' table that names auth env vars and per-provider notes inline. Add a Warning callout for the Microsoft/edge legacy alias. - Collapse the 11 'X primary' config blocks into one Tabs component ('OpenAI + ElevenLabs', 'Google Gemini', 'Azure Speech', 'Microsoft (no key)', 'MiniMax', 'Inworld', 'xAI', 'Volcengine', 'Xiaomi MiMo', 'OpenRouter', 'Gradium', 'Local CLI') so users see one preset at a time and the page is scannable. - Promote 'Personas' to its own top-level section with two examples (minimal and the Alfred provider-neutral persona), and add a new 'How providers use persona prompts' AccordionGroup covering Google (promptTemplate audio-profile-v1, personaPrompt), OpenAI (instructions auto-mapping), and Other providers, plus a fallback policy table. - Note that agents.list[].tts.persona overrides global persona per-agent (covers the recent feat(tts) per-agent voice-override work). - Convert the 100-line 'Notes on fields' wall into a per-provider AccordionGroup using ParamField, so the field reference is scannable and field types/defaults are visually distinct. - Sentence-case headings, drop redundant body H1, fold the flow diagram inline with Auto-TTS behavior, and refresh the Output formats section to a table-first layout. 
- Schema fields (label/description/provider/fallbackPolicy/prompt with profile/scene/sampleContext/style/accent/pacing/constraints and providers map) verified against src/config/types.tts.ts; all defaults and env-var fallbacks preserved verbatim. Net diff: 585 insertions, 684 deletions across the same surface area.	2026-04-25 22:00:19 -07:00
Barron Roth	0594fa3c4d	TTS: add provider personas	2026-04-26 09:42:38 +05:30
Peter Steinberger	2c8c79de5c	fix(tts): normalize streamed tts voice media	2026-04-26 04:28:19 +01:00
Peter Steinberger	a91baa16de	fix(tts): honor explicit directive providers	2026-04-26 04:14:48 +01:00
Peter Steinberger	cf834e2a21	fix(tts): clean streamed directive text	2026-04-26 04:09:56 +01:00
Peter Steinberger	7a85c1a822	fix(tts): surface voice status and harden providers	2026-04-26 03:51:30 +01:00
Peter Steinberger	97ae1c7c2e	feat(tts): add read-latest voice command	2026-04-26 03:44:44 +01:00
Peter Steinberger	6855b33255	docs(tts): clarify WhatsApp voice-note delivery	2026-04-26 03:28:51 +01:00
Peter Steinberger	9b91040053	fix(tts): route WhatsApp MP3 TTS as voice notes	2026-04-26 03:26:00 +01:00
Peter Steinberger	9b4f0779ce	fix(tts): honor per-agent config in tts commands	2026-04-26 03:12:30 +01:00
Peter Steinberger	0ca952cdd5	feat(tts): add per-agent voice overrides	2026-04-26 02:54:13 +01:00
Peter Steinberger	5b80d0c15e	feat(tts): add Azure Speech provider Co-authored-by: Leon Chui <84605354+leonchui@users.noreply.github.com>	2026-04-26 01:42:51 +01:00
Rui Xu	1531123d35	feat(tts): add BytePlus Seed Speech provider Add Volcengine/BytePlus Seed Speech as a bundled TTS provider with current API-key auth, legacy AppID/token fallback, native Ogg/Opus voice-note output, and MP3 audio-file output. Co-authored-by: Peter Steinberger <steipete@gmail.com>	2026-04-25 23:46:04 +01:00
Cale Shapera	0bcb4c95c1	feat(tts): add Inworld speech provider (#55972) Adds the bundled Inworld speech provider with docs, config surface, SSRF-guarded fetches, directive overrides, native voice-note/telephony output coverage, and live `.profile` verification. Co-authored-by: cshape <cshape@users.noreply.github.com>	2026-04-25 22:33:21 +01:00
Peter Steinberger	e2fd3dcee9	fix(google): emit opus voice-note tts	2026-04-25 21:33:33 +01:00
Peter Steinberger	9ffe764416	fix(whatsapp): send voice note text separately	2026-04-25 18:55:03 +01:00
Peter Steinberger	b511250e5c	feat(media): add voice conversion and speech plugins	2026-04-25 12:12:33 +01:00
Peter Steinberger	a7604f8170	fix(minimax): support token plan tts auth	2026-04-25 10:36:12 +01:00
Peter Steinberger	ec8dbc4595	feat(tts): add xiaomi mimo speech provider	2026-04-25 09:48:05 +01:00
Peter Steinberger	b0c55eb659	fix(feishu): transcode voice TTS audio	2026-04-25 09:26:42 +01:00
Peter Steinberger	8acc92c881	feat(google): support Gemini TTS style profile	2026-04-25 06:11:23 +01:00
Peter Steinberger	e31aef7e19	fix(tts): migrate legacy edge config in doctor	2026-04-25 05:55:54 +01:00
Peter Steinberger	c03e5b3c3a	docs(tts): clarify legacy provider migration	2026-04-25 05:01:09 +01:00
Peter Steinberger	978a50a3c5	fix(minimax): normalize tts pitch for api	2026-04-25 04:58:20 +01:00
Peter Steinberger	225ff9a866	fix(minimax): transcode voice-note tts to opus	2026-04-25 04:52:25 +01:00
Peter Steinberger	7875092f4d	feat(openrouter): add tts provider	2026-04-25 04:36:49 +01:00
Laurent Mazare	d7e2939791	feat: add Gradium text-to-speech provider (#64958) Adds the Gradium bundled plugin with TTS and speech-provider registration, docs, label routing, and focused/live coverage. Also carries the current main lint cleanup needed for the rebased CI lane. Co-authored-by: laurent <laurent.mazare@gmail.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-24 18:43:53 +01:00
Peter Steinberger	f0a7a85e7a	feat(agents): add generation tool timeouts	2026-04-24 00:05:38 +01:00
Vincent Koc	789e71cdb8	docs: remove H1 on pages where frontmatter + summary already cover the parenthetical	2026-04-23 15:47:48 -07:00
Vincent Koc	6667f66fd8	docs(tools): add Related sections and unify See also to Related	2026-04-23 15:41:56 -07:00
Vincent Koc	2777b089b5	docs: normalize frontmatter titles to sentence case	2026-04-23 13:15:17 -07:00
KateWilkins	f342da5fcc	feat: add xai media providers Add xAI image generation and text-to-speech provider support with docs, live tests, and guarded provider HTTP handling. Thanks @KateWilkins.	2026-04-23 00:07:39 +01:00
Barron Roth	bf59917cd1	fix: add Google Gemini TTS provider (#67515) (thanks @barronlroth) * Add Google Gemini TTS provider * Remove committed planning artifact * Explain Google media provider type shape * google: distill Gemini TTS provider * fix: add Google Gemini TTS provider (#67515) (thanks @barronlroth) * fix: honor cfg-backed Google TTS selection (#67515) (thanks @barronlroth) * fix: narrow Google TTS directive aliases (#67515) (thanks @barronlroth) --------- Co-authored-by: Ayaan Zaidi <hi@obviy.us>	2026-04-16 11:54:35 +05:30
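
The message for commit `7920af0c9e` above describes a `neutralizeMediaDirectives` helper that prepends `[neutralized] ` to any line whose `trimStart()` begins with `MEDIA:` (case-insensitive), leaving mid-line occurrences untouched. A minimal sketch of that behavior follows. It is a hypothetical re-implementation written only from the commit text: the signature, the fast-path check, and the choice to place the prefix before any leading whitespace are assumptions, and the actual code in vision.ts may differ.

```typescript
// Hypothetical sketch based on the commit message; not the repository's code.
const NEUTRALIZED_PREFIX = "[neutralized] ";

function neutralizeMediaDirectives(text: string): string {
  // Fast path (assumed): skip the per-line scan when no "MEDIA:" appears at all.
  if (!/media:/i.test(text)) {
    return text;
  }
  return text
    .split("\n")
    .map((line) =>
      // Only lines that start with MEDIA: after leading whitespace are defanged;
      // a mid-line "MEDIA:" is not a parser anchor and stays as-is.
      /^media:/i.test(line.trimStart()) ? NEUTRALIZED_PREFIX + line : line,
    )
    .join("\n");
}
```

With this sketch, `neutralizeMediaDirectives("MEDIA:/tmp/secret.png")` yields `[neutralized] MEDIA:/tmp/secret.png`, so the path stays readable while the line no longer starts with the directive.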

60 Commits