Commit Graph

23 Commits

Author SHA1 Message Date
Peter Steinberger
466f718320 feat: wire talk handoff into native nodes 2026-05-06 02:39:15 +01:00
Fuma2013
3f3ed80300 fix(macos): route Talk providers through gateway TTS
Route remote and custom macOS Talk providers through Gateway talk.speak before falling back to the system voice.\n\nThanks @Fuma2013.
2026-05-02 08:57:26 +01:00
Peter Steinberger
c1996f5d75 fix: downmix speech buffers for macos voice 2026-05-02 02:47:33 +01:00
Seungwoo hong
63fac653ed fix(talk): Talk Mode TTS improvements for CJK languages (#53553)
* feat(talk): add distinct system sounds for each Talk Mode phase

Play a short system sound on phase transitions to give the user
audible feedback:
- thinking: Tink
- speaking: Pop
- listening (after speech interrupted): Bottle
- listening (after thinking): Submarine

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(talk): add right Shift key to interrupt Talk Mode speech

Add TalkSpeechInterruptMonitor — a dedicated global key monitor that
listens for right Shift (keyCode 60) to interrupt Talk Mode speech.
Independent of Push-to-Talk, so it works even when PTT is disabled.

Stops only the current response; the next conversation cycle
continues normally via sendAndSpeak's resumeListeningIfNeeded flow.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(talk): increase silence detection timeout for CJK locales

Korean, Japanese, and Chinese speakers need longer pauses between
phrases. When the app locale is CJK, enforce a minimum 2000ms
silence window (vs the default 1500ms) to avoid premature
transcript submission.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(talk): remove force-unwraps and log CJK silence clamp in reloadConfig

Replace non-idiomatic force-unwraps (cfg.voiceId!, cfg.modelId!) with
safe flatMap unwrapping, and add an info log when CJK locale clamps the
silence timeout so the override is observable in diagnostics.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(talk): add settings toggle to mute phase-transition sounds

Add a "Play phase-transition sounds" checkbox to Voice Wake settings.
When disabled, Talk Mode phase transitions (Tink/Pop/Bottle/Submarine)
are silent. Defaults to enabled to preserve existing behavior.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(talk): add toggle for Right Option speech interrupt

Add a "Press Right Option to stop speech" checkbox to Voice Wake
settings. Also change the interrupt key from right Shift to right
Option (keyCode 61) to avoid conflicts with typing.
Defaults to enabled to preserve existing behavior.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(talk): disable Push-to-Talk while Talk Mode is active

Talk Mode and Push-to-Talk both use the right Option key (keyCode 61).
Disable PTT when Talk Mode is enabled to prevent conflicting handlers,
and restore PTT when Talk Mode is disabled.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(talk): show info when PTT is paused during Talk Mode

Display a footnote under the Push-to-Talk toggle when both PTT and
Talk Mode are enabled, explaining that PTT is paused while Talk Mode
is active and resumes when Talk Mode is turned off.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fixup: SwiftFormat lint on TalkModeController phase sound switch

Resolves macos-swift CI lint failures introduced by Korean
comment formatting in 'feat(talk): add distinct system sounds for each Talk Mode phase'.
- Collapse consecutive spaces between sound name and comment
- Move floating comments above the listening case expression so
  they're at the correct indent level

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hongsw <hongsw@hongswui-Macmini.local>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Fabian Williams <fabian@adotob.com>
2026-04-25 17:05:51 -04:00
Peter Steinberger
02f3e9cfa2 fix(talk): honor configured speech locale 2026-04-25 21:05:24 +01:00
Peter Steinberger
3731a7c8f2 fix(macos): retry talk tts via gateway 2026-04-25 04:09:43 +01:00
Gustavo Garcia
bb543f71d9 fix(talk): fix ensure permissions on first execution of Talk Mode in MacOS (#62459)
* fix(talk): fix ensure permissions on first execution of Talk Mode in MacOS

* macos: fix talk mode formatting

* test: fix CI shard regressions

* docs: add talk mode changelog

---------

Co-authored-by: ImLukeF <92253590+ImLukeF@users.noreply.github.com>
2026-04-11 18:08:45 +10:00
Luke
7c72b694f1 macOS: add MLX Talk provider MVP (#63539)
Merged via squash.

Prepared head SHA: da43563513
Co-authored-by: ImLukeF <92253590+ImLukeF@users.noreply.github.com>
Co-authored-by: ImLukeF <92253590+ImLukeF@users.noreply.github.com>
Reviewed-by: @ImLukeF
2026-04-09 17:13:34 +10:00
Phillip Hampton
350fe63bbf feat(macos): Voice Wake option to trigger Talk Mode (#58490)
* feat(macos): add "Trigger Talk Mode" option to Voice Wake settings

When enabled, detecting a wake phrase activates Talk Mode (full voice
conversation: STT -> LLM -> TTS playback) instead of sending a text
message to the chat. Enables hands-free voice assistant UX.

Implementation:
- Constants.swift: new `openclaw.voiceWakeTriggersTalkMode` defaults key
- AppState.swift: new property with UserDefaults persistence + runtime
  refresh on change so the toggle takes effect immediately
- VoiceWakeSettings.swift: "Trigger Talk Mode" toggle under Voice Wake,
  disabled when Voice Wake is off
- VoiceWakeRuntime.swift: `beginCapture` checks `triggersTalkMode` and
  activates Talk Mode directly, skipping the capture/overlay/forward
  flow. Pauses the wake listener via `pauseForPushToTalk()` to prevent
  two audio pipelines competing on the mic.
- TalkModeController.swift: resumes voice wake listener when Talk Mode
  exits by calling `VoiceWakeRuntime.refresh`, mirroring PTT teardown.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address review feedback

- TalkModeController: move VoiceWakeRuntime.refresh() after
  TalkModeRuntime.setEnabled(false) completes, preventing mic
  contention race where wake listener restarts before Talk Mode
  finishes tearing down its audio engine (P1)
- VoiceWakeRuntime: remove redundant haltRecognitionPipeline()
  before pauseForPushToTalk() which already calls it via stop() (P2)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: resume wake listener based on enabled state, not toggle value

Check swabbleEnabled instead of voiceWakeTriggersTalkMode when deciding
whether to resume the wake listener after Talk Mode exits. This ensures
the paused listener resumes even if the user toggled "Trigger Talk Mode"
off during an active session. (P2)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: play trigger chime on Talk Mode activation + send chime before TTS

- VoiceWakeRuntime: play the configured trigger chime in the
  triggersTalkMode branch before pausing the wake listener. The early
  return was skipping the chime that plays in the normal capture flow.
- TalkModeRuntime: play the Voice Wake "send" chime before TTS playback
  starts, giving audible feedback that the AI is about to respond. Talk
  Mode never used this chime despite it being configurable in settings.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: move send chime to transcript finalization (on send, not on reply)

The send chime now plays when the user's speech is finalized and about
to be sent to the AI, not when the TTS response starts. This matches
the semantics of "send sound" -- it confirms your input was captured.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 21:09:36 -04:00
Seungwoo hong
138a92373b fix(talk): prevent double TTS playback when system voice times out (#53511)
Merged via squash.

Prepared head SHA: 864d556fa6
Co-authored-by: hongsw <1100974+hongsw@users.noreply.github.com>
Co-authored-by: grp06 <1573959+grp06@users.noreply.github.com>
Reviewed-by: @grp06
2026-03-26 15:37:40 -07:00
Peter Steinberger
16a5f0b006 refactor: split talk gateway config loaders 2026-03-08 16:22:48 +00:00
Peter Steinberger
8d3d742c6a refactor: require canonical talk resolved payload 2026-03-08 16:22:48 +00:00
Peter Steinberger
b4c8950417 refactor: centralize talk silence timeout defaults 2026-03-08 14:58:29 +00:00
Peter Steinberger
4f482d2a2b refactor: share Apple talk config parsing 2026-03-08 14:58:29 +00:00
dano does design
6ff7e8f42e talk: add configurable silence timeout 2026-03-08 14:30:25 +00:00
Peter Steinberger
d5b305b250 fix: follow up #39321 and #38445 landings 2026-03-08 13:58:13 +00:00
Dmitri
d2347ed825 macOS: set speech recognition taskHint for Talk Mode mic capture
Add taskHint = .dictation to Talk Mode's SFSpeechAudioBufferRecognitionRequest,
matching what Voice Wake already sets. Without this hint the recognizer may not
properly initialize audio capture, causing Talk Mode to appear unresponsive.

Co-Authored-By: dmiv <dmiv@users.noreply.github.com>
2026-03-08 13:52:08 +00:00
Peter Steinberger
ce1dbeb986 fix(macos): clean warnings and harden gateway/talk config parsing 2026-02-25 00:27:36 +00:00
Peter Steinberger
236b22b6a2 fix(macos): guard voice audio paths with no input device (#25817)
Co-authored-by: Stefan Förster <103369858+sfo2001@users.noreply.github.com>
2026-02-25 00:10:14 +00:00
Nimrod Gutman
d58f71571a feat(talk): add provider-agnostic config with legacy compatibility 2026-02-24 15:02:52 +00:00
Peter Steinberger
8725c2b19f style(swift): run swiftformat + swiftlint autocorrect 2026-02-15 05:38:35 +01:00
Sk Akram
4c86821aca fix: allow device-paired clients to retrieve TTS API keys (#14613)
* refactor: add config.get to READ_METHODS set

* refactor(gateway): scope talk secrets via talk.config

* fix: resolve rebase conflicts for talk scope refactor

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-02-13 17:07:49 +01:00
Peter Steinberger
9a7160786a refactor: rename to openclaw 2026-01-30 03:16:21 +01:00