Commit Graph

65 Commits

Author SHA1 Message Date
Omar Shahine
da3d17e1ca fix(tts): pre-transcode synthesized audio to opus-in-CAF for native iMessage voice-memo bubbles via BlueBubbles (#72586)
End-to-end testing on macOS + BlueBubbles + ElevenLabs walked through three CAF flavors before landing on the format Apple's Messages.app actually emits when a user records a native iMessage voice memo:

- PCM int16 @ 44.1 kHz CAF: BlueBubbles' internal `afconvert -f m4af -d aac` conversion fails; the original CAF reaches iMessage but renders with 0 s duration.
- AAC @ 22.05 kHz mono CAF: BlueBubbles' conversion succeeds and the server silently downgrades the delivery, sending the converted MP3 as a generic audio attachment.
- **Opus @ 24 kHz mono CAF**: byte-identical to the descriptor block Apple's Messages.app produces; BlueBubbles passes it through unchanged and iMessage renders a native voice-memo bubble with proper duration and waveform UI.

Adds an opt-in `tts.voice.preferAudioFileFormat` channel capability and a macOS `afconvert`-backed pre-transcode in the speech-core pipeline. BlueBubbles declares `preferAudioFileFormat: "caf"`. Other channels are unaffected. Falls back to the original buffer when the host platform, the source/target pair, or the transcoder process can't produce the preferred container — so non-Darwin hosts and unsupported provider combinations are unchanged.

Also adds a `caff` magic-byte sniff in `src/media/mime.ts` so the auto-reply host-local-media validator (which uses `file-type` and didn't recognize CAF natively) accepts the buffer instead of dropping it as "⚠️ Media failed."

Fixes #72506.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 14:15:16 -07:00
Peter Steinberger
7829c438a6 fix(tts): keep final webchat audio supplemental 2026-04-27 20:22:18 +01:00
Peter Steinberger
4336a7f3a9 refactor(plugin-sdk): narrow config runtime imports 2026-04-27 14:58:32 +01:00
Peter Steinberger
eb1a201060 refactor: narrow media core plugin api barrels 2026-04-27 14:34:00 +01:00
Josh Avant
b3d9948c4c fix: use runtime snapshot for TTS SecretRefs (#72581)
* fix: use runtime snapshot for tts secrets

* fix: keep tts secret snapshot selection local

* docs: add tts secretref changelog entry
2026-04-27 01:02:17 -05:00
Peter Steinberger
d419fb561d feat(tts): resolve channel account config generically 2026-04-26 08:10:36 +01:00
Peter Steinberger
d613c8e29b refactor(tts): resolve voice delivery from channel capabilities 2026-04-26 07:03:25 +01:00
Peter Steinberger
6a67f65568 fix(voice): reuse preflight transcripts across channels 2026-04-26 05:42:04 +01:00
Ayaan Zaidi
cc2044633c fix(tts): compose personas with agent config 2026-04-26 09:42:38 +05:30
Barron Roth
0594fa3c4d TTS: add provider personas 2026-04-26 09:42:38 +05:30
Peter Steinberger
9b91040053 fix(tts): route WhatsApp MP3 TTS as voice notes 2026-04-26 03:26:00 +01:00
Peter Steinberger
0ca952cdd5 feat(tts): add per-agent voice overrides 2026-04-26 02:54:13 +01:00
Peter Steinberger
7fcefd56b7 chore: bump version to 2026.4.25 2026-04-25 10:31:52 +01:00
Peter Steinberger
b0c55eb659 fix(feishu): transcode voice TTS audio 2026-04-25 09:26:42 +01:00
Peter Steinberger
f0a7a85e7a feat(agents): add generation tool timeouts 2026-04-24 00:05:38 +01:00
fuller-stack-dev
276c00015c fix: add local embedded TUI mode (#66767) (thanks @fuller-stack-dev)
* feat(tui): add local embedded TUI mode with terminal/chat aliases

Adds a gateway-free local TUI path so users can run openclaw in their
terminal without needing a running gateway process.

- TuiBackend interface abstraction (tui-backend.ts) with EmbeddedTuiBackend
  implementation that drives the agent loop in-process
- openclaw tui --local flag for local embedded mode
- openclaw terminal / openclaw chat aliases that imply --local
- /auth slash command with codex CLI delegation to avoid prolite plan issue
- Default model display fallback on startup
- Local-aware status text and log suppression
- Concise auth error hints, raw HTML 403 suppression
- Onboarding hatch flow launches local TUI (no gateway required)
- Commander alias bug fix in run-main.ts (.aliases() check)
- All new and updated tests passing (145/145)

* TUI: fix alias detection, cross-platform codex lookup, and history byte-budget safeguards

* TUI: remove RuntimeEnv type annotation to fix CI oxlint error

* TUI: filter gateway-dependent tools and auto-approve plugin hooks in embedded mode

* TUI: suppress console noise and add embedded mode system prompt note

* TUI: reduce embedded-mode tool filtering from 15 to 7, add local session tools

* TUI: fix remaining PR review comments

* TUI: address latest review feedback and CI drift

* Core: align prompt helper with latest base

* Core: match prompt helper formatting with base

* Core: restore prompt helper from latest base

* fix(tui): preserve local auth fallback in source checkouts

* fix(tts): guard telephony provider invocation

* fix(tui): support Windows codex auth shim

* fix(tui): harden local auth flow

* fix: preserve embedded tool-first run events

* fix(tui): keep embedded plugin approvals gated

* fix(tui): restore embedded attempt import

* fix(tui): resolve sessions in embedded stub

* fix: add embedded TUI changelog entry (#66767) (thanks @fuller-stack-dev)

* fix: pass setup TUI local mode through relaunch (#66767) (thanks @fuller-stack-dev)

---------

Co-authored-by: Ayaan Zaidi <hi@obviy.us>
2026-04-22 10:08:57 +05:30
Peter Steinberger
6ed67fc873 test: share speech tts payload fixture 2026-04-20 17:11:33 +01:00
Peter Steinberger
8116e638f3 chore: release 2026.4.20 2026-04-20 13:16:40 +01:00
Peter Steinberger
ac8f0c9c0d chore: prepare 2026.4.19-beta.1 release 2026-04-19 02:09:43 +01:00
Peter Steinberger
3f2e73b723 chore(release): bump version to 2026.4.18 2026-04-18 15:46:33 +01:00
Peter Steinberger
0dc4c4076c chore: bump version to 2026.4.16 2026-04-17 00:45:04 +01:00
stain lu
6ea3cddf0d fix: register bundled TTS providers and route overrides correctly (#62846) (thanks @stainlu)
* fix(microsoft,elevenlabs): add enabledByDefault so speech providers register at runtime

* fix(tts): route generic directive tokens to the explicitly declared provider

Addresses the P2 Codex review on #62846 that flagged auto-enabling
ElevenLabs as a product regression for MiniMax users. Both providers
claim the generic `speed` token, and parseTtsDirectives walked
providers in autoSelectOrder with first-match-wins, so inputs like
`[[tts:provider=minimax speed=1.2]]` silently routed speed to
providerOverrides.elevenlabs once elevenlabs participated in every
parse pass.

The parser now pre-scans for `provider=` (honoring legacy last-wins
semantics) and routes generic tokens with the declared provider tried
first, falling back to autoSelectOrder when it doesn't handle the key.
Token order inside the directive no longer matters: `speed=1.2` before
or after `provider=minimax` both resolve to MiniMax.

Adds a regression test suite covering the exact ElevenLabs/MiniMax
speed collision plus fallback, mixed-token, last-wins, and
allowProvider-disabled cases. parseTtsDirectives had no prior test
coverage.

* fix(tts): prefer active provider for generic directives

* fix: register bundled TTS providers safely (#62846) (thanks @stainlu)

* fix: use exported TTS SDK seam (#62846) (thanks @stainlu)

---------

Co-authored-by: Ayaan Zaidi <hi@obviy.us>
2026-04-16 15:26:38 +05:30
Barron Roth
bf59917cd1 fix: add Google Gemini TTS provider (#67515) (thanks @barronlroth)
* Add Google Gemini TTS provider

* Remove committed planning artifact

* Explain Google media provider type shape

* google: distill Gemini TTS provider

* fix: add Google Gemini TTS provider (#67515) (thanks @barronlroth)

* fix: honor cfg-backed Google TTS selection (#67515) (thanks @barronlroth)

* fix: narrow Google TTS directive aliases (#67515) (thanks @barronlroth)

---------

Co-authored-by: Ayaan Zaidi <hi@obviy.us>
2026-04-16 11:54:35 +05:30
Peter Steinberger
b3fa5880dd build(extensions): bump bundled plugin versions to 2026.4.15-beta.1 2026-04-15 15:06:13 +01:00
Marcus Castro
403783a3b1 fix(tts): correct tagged TTS syntax guidance (#65573) 2026-04-12 19:41:13 -03:00
Peter Steinberger
a8e140e395 chore: bump version to 2026.4.12 2026-04-12 10:37:18 -07:00
Vincent Koc
b1290e61fd fix(plugin-sdk): narrow reply payload type surface 2026-04-11 21:52:31 +01:00
Vincent Koc
7d1bd0c98c fix(tts): split shared tts config types 2026-04-11 20:25:02 +01:00
Peter Steinberger
1ab6e5dbf0 chore(release): bump version to 2026.4.11 2026-04-11 04:51:17 +01:00
Peter Steinberger
f621fb4aba refactor: centralize speech voice-note channel routing 2026-04-10 15:01:19 +01:00
liuhuaize
271d3b3bdb speech-core: fix TTS regression test typing 2026-04-10 14:37:25 +01:00
liuhuaize
b3d7fd166a speech-core: route Discord auto TTS as voice notes 2026-04-10 14:37:25 +01:00
Peter Steinberger
719f06510c chore: bump version to 2026.4.10 2026-04-09 03:56:22 +01:00
Peter Steinberger
8cbd60d203 chore: prepare 2026.4.9 release 2026-04-08 08:02:53 +01:00
Peter Steinberger
4f8471617a chore: prepare 2026.4.8 2026-04-08 04:21:51 +01:00
Peter Steinberger
0e91c25c0b chore: prepare 2026.4.7 2026-04-08 02:14:59 +01:00
Peter Steinberger
775b78e186 refactor: dedupe provider lowercase helpers 2026-04-07 22:24:32 +01:00
Peter Steinberger
60d9c150b2 refactor: dedupe provider lowercase helpers 2026-04-07 15:12:32 +01:00
Tak Hoffman
97c031a8db feat: Add first-class infer CLI for inference workflows (#62129)
* refresh infer branch onto latest main

* flatten infer media commands

* fix tts runtime facade export

* validate explicit web search providers

* fix infer auth logout persistence
2026-04-07 07:11:19 -05:00
Peter Steinberger
967ecddfed refactor: dedupe extension lower readers 2026-04-07 11:18:18 +01:00
Vincent Koc
d5ed6d26e9 chore(plugins): bulk add package boundary tsconfig rollout 2026-04-07 08:48:23 +01:00
Peter Steinberger
90a45a4907 refactor: dedupe provider channel readers 2026-04-07 08:40:34 +01:00
Peter Steinberger
54cd8ed25b refactor: dedupe extension error formatting 2026-04-07 05:06:54 +01:00
Peter Steinberger
8b79cbcd06 build(plugins): align package versions to 2026.4.6 2026-04-06 17:05:30 +01:00
Peter Steinberger
af62a2c2e4 style: fix extension lint violations 2026-04-06 14:53:55 +01:00
Peter Steinberger
ce8492f9a0 chore: bump version to 2026.4.5 2026-04-05 21:33:04 +01:00
Peter Steinberger
67d6fc8847 chore(plugins): sync versions to 2026.4.4 2026-04-04 20:03:01 +01:00
Peter Steinberger
2766c27b2a refactor(plugin-sdk): genericize web channel runtime seams 2026-04-03 11:17:28 +01:00
Peter Steinberger
8988894ff7 build: prepare 2026.4.1-beta.1 release 2026-04-01 15:09:19 +01:00
Peter Steinberger
9ea7e06460 build: bump version to 2026.4.1 2026-03-31 22:53:17 +01:00