Files
openclaw/apps/macos/Sources/OpenClaw/TalkModeController.swift
Seungwoo hong 63fac653ed fix(talk): Talk Mode TTS improvements for CJK languages (#53553)
* feat(talk): add distinct system sounds for each Talk Mode phase

Play a short system sound on phase transitions to give the user
audible feedback:
- thinking: Tink
- speaking: Pop
- listening (after speech interrupted): Bottle
- listening (after thinking): Submarine

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(talk): add right Shift key to interrupt Talk Mode speech

Add TalkSpeechInterruptMonitor — a dedicated global key monitor that
listens for right Shift (keyCode 60) to interrupt Talk Mode speech.
Independent of Push-to-Talk, so it works even when PTT is disabled.

Stops only the current response; the next conversation cycle
continues normally via sendAndSpeak's resumeListeningIfNeeded flow.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(talk): increase silence detection timeout for CJK locales

Korean, Japanese, and Chinese speakers need longer pauses between
phrases. When the app locale is CJK, enforce a minimum 2000ms
silence window (vs the default 1500ms) to avoid premature
transcript submission.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(talk): remove force-unwraps and log CJK silence clamp in reloadConfig

Replace non-idiomatic force-unwraps (cfg.voiceId!, cfg.modelId!) with
safe flatMap unwrapping, and add an info log when CJK locale clamps the
silence timeout so the override is observable in diagnostics.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(talk): add settings toggle to mute phase-transition sounds

Add a "Play phase-transition sounds" checkbox to Voice Wake settings.
When disabled, Talk Mode phase transitions (Tink/Pop/Bottle/Submarine)
are silent. Defaults to enabled to preserve existing behavior.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(talk): add toggle for Right Option speech interrupt

Add a "Press Right Option to stop speech" checkbox to Voice Wake
settings. Also change the interrupt key from right Shift to right
Option (keyCode 61) to avoid conflicts with typing.
Defaults to enabled to preserve existing behavior.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(talk): disable Push-to-Talk while Talk Mode is active

Talk Mode and Push-to-Talk both use the right Option key (keyCode 61).
Disable PTT when Talk Mode is enabled to prevent conflicting handlers,
and restore PTT when Talk Mode is disabled.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(talk): show info when PTT is paused during Talk Mode

Display a footnote under the Push-to-Talk toggle when both PTT and
Talk Mode are enabled, explaining that PTT is paused while Talk Mode
is active and resumes when Talk Mode is turned off.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fixup: SwiftFormat lint on TalkModeController phase sound switch

Resolves macos-swift CI lint failures introduced by Korean
comment formatting in 'feat(talk): add distinct system sounds for each Talk Mode phase'.
- Collapse consecutive spaces between sound name and comment
- Move floating comments above the listening case expression so
  they're at the correct indent level

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hongsw <hongsw@hongswui-Macmini.local>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Fabian Williams <fabian@adotob.com>
2026-04-25 17:05:51 -04:00

107 lines
3.7 KiB
Swift

import AppKit
import Observation
@MainActor
@Observable
final class TalkModeController {
static let shared = TalkModeController()
private let logger = Logger(subsystem: "ai.openclaw", category: "talk.controller")
private(set) var phase: TalkModePhase = .idle
private(set) var isPaused: Bool = false
func setEnabled(_ enabled: Bool) async {
self.logger.info("talk enabled=\(enabled)")
if enabled {
TalkOverlayController.shared.present()
} else {
TalkOverlayController.shared.dismiss()
}
TalkSpeechInterruptMonitor.shared.setEnabled(enabled && AppStateStore.shared.talkShiftToStopEnabled)
// Talk Mode and Push-to-Talk share the right Option key disable PTT while Talk Mode is active.
let pttEnabled = !enabled && AppStateStore.shared.voicePushToTalkEnabled
VoicePushToTalkHotkey.shared.setEnabled(pttEnabled)
await TalkModeRuntime.shared.setEnabled(enabled)
// Resume voice wake listener *after* TalkMode audio is fully torn down.
// Check swabbleEnabled (not voiceWakeTriggersTalkMode) so the paused wake listener
// resumes even if the user toggled "Trigger Talk Mode" off during the session.
if !enabled, AppStateStore.shared.swabbleEnabled {
Task { await VoiceWakeRuntime.shared.refresh(state: AppStateStore.shared) }
}
}
func updatePhase(_ phase: TalkModePhase) {
let previousPhase = self.phase
self.phase = phase
TalkOverlayController.shared.updatePhase(phase)
// Play distinct system sounds for each phase transition.
if phase != previousPhase {
Self.playPhaseSound(phase, previousPhase: previousPhase)
}
let effectivePhase = self.isPaused ? "paused" : phase.rawValue
Task {
await GatewayConnection.shared.talkMode(
enabled: AppStateStore.shared.talkEnabled,
phase: effectivePhase)
}
}
private static func playPhaseSound(_ phase: TalkModePhase, previousPhase: TalkModePhase) {
guard AppStateStore.shared.talkPhaseSoundsEnabled else { return }
let soundName: String? = switch phase {
case .thinking:
"Tink" // :
case .speaking:
"Pop" // :
case .listening:
// (speakinglistening):
// (thinkinglistening ):
previousPhase == .speaking ? "Bottle" : "Submarine"
case .idle:
nil
}
if let soundName {
NSSound(named: NSSound.Name(soundName))?.play()
}
}
func updateLevel(_ level: Double) {
TalkOverlayController.shared.updateLevel(level)
}
func setPaused(_ paused: Bool) {
guard self.isPaused != paused else { return }
self.logger.info("talk paused=\(paused)")
self.isPaused = paused
TalkOverlayController.shared.updatePaused(paused)
let effectivePhase = paused ? "paused" : self.phase.rawValue
Task {
await GatewayConnection.shared.talkMode(
enabled: AppStateStore.shared.talkEnabled,
phase: effectivePhase)
}
Task { await TalkModeRuntime.shared.setPaused(paused) }
}
func togglePaused() {
self.setPaused(!self.isPaused)
}
func stopSpeaking(reason: TalkStopReason = .userTap) {
Task { await TalkModeRuntime.shared.stopSpeaking(reason: reason) }
}
func exitTalkMode() {
Task { await AppStateStore.shared.setTalkEnabled(false) }
}
}
enum TalkStopReason {
case userTap
case speech
case manual
}