mirror of
https://github.com/openclaw/openclaw.git
synced 2026-04-23 07:01:40 +00:00
Reject realtime voice WebSocket frames above 256 KB before JSON parsing or bridge setup, and absorb ws error events so oversized frames close the connection instead of crashing the gateway. Local verification: - pnpm test extensions/voice-call/src/webhook/realtime-handler.test.ts - pnpm check Thanks @mmaps. Co-authored-by: mmaps <3399869+mmaps@users.noreply.github.com>
@openclaw/voice-call
Official Voice Call plugin for OpenClaw.
Providers:
- Twilio (Programmable Voice + Media Streams)
- Telnyx (Call Control v2)
- Plivo (Voice API + XML transfer + GetInput speech)
- Mock (dev/no network)
Docs: https://docs.openclaw.ai/plugins/voice-call
Plugin system: https://docs.openclaw.ai/plugin
Install (local dev)
Option A: install via OpenClaw (recommended)
openclaw plugins install @openclaw/voice-call
Restart the Gateway afterwards.
Option B: copy into your global extensions folder (dev)
PLUGIN_HOME=~/.openclaw/extensions
mkdir -p "$PLUGIN_HOME"
cp -R <local-plugin-checkout> "$PLUGIN_HOME/voice-call"
cd "$PLUGIN_HOME/voice-call" && pnpm install
Config
Put under plugins.entries.voice-call.config:
{
provider: "twilio", // or "telnyx" | "plivo" | "mock"
fromNumber: "+15550001234",
toNumber: "+15550005678",
twilio: {
accountSid: "ACxxxxxxxx",
authToken: "your_token",
},
telnyx: {
apiKey: "KEYxxxx",
connectionId: "CONNxxxx",
// Telnyx webhook public key from the Telnyx Mission Control Portal
// (Base64 string; can also be set via TELNYX_PUBLIC_KEY).
publicKey: "...",
},
plivo: {
authId: "MAxxxxxxxxxxxxxxxxxxxx",
authToken: "your_token",
},
// Webhook server
serve: {
port: 3334,
path: "/voice/webhook",
},
// Public exposure (pick one):
// publicUrl: "https://example.ngrok.app/voice/webhook",
// tunnel: { provider: "ngrok" },
// tailscale: { mode: "funnel", path: "/voice/webhook" }
outbound: {
defaultMode: "notify", // or "conversation"
},
streaming: {
enabled: true,
// optional; if omitted, Voice Call picks the first registered
// realtime-transcription provider by autoSelectOrder
provider: "<realtime-transcription-provider-id>",
streamPath: "/voice/stream",
providers: {
"<realtime-transcription-provider-id>": {
// provider-owned options
},
},
preStartTimeoutMs: 5000,
maxPendingConnections: 32,
maxPendingConnectionsPerIp: 4,
maxConnections: 128,
},
}
Notes:
- Twilio/Telnyx/Plivo require a publicly reachable webhook URL.
mockis a local dev provider (no network calls).- Telnyx requires
telnyx.publicKey(orTELNYX_PUBLIC_KEY) unlessskipSignatureVerificationis true. - If older configs still use
provider: "log",twilio.from, or legacystreaming.*OpenAI keys, runopenclaw doctor --fixto rewrite them. - advanced webhook, streaming, and tunnel notes:
https://docs.openclaw.ai/plugins/voice-call responseModelis optional. When unset, voice responses use the runtime default model.
Stale call reaper
See the plugin docs for recommended ranges and production examples:
https://docs.openclaw.ai/plugins/voice-call#stale-call-reaper
TTS for calls
Voice Call uses the core messages.tts configuration for
streaming speech on calls. Override examples and provider caveats live here:
https://docs.openclaw.ai/plugins/voice-call#tts-for-calls
CLI
openclaw voicecall call --to "+15555550123" --message "Hello from OpenClaw"
openclaw voicecall continue --call-id <id> --message "Any questions?"
openclaw voicecall speak --call-id <id> --message "One moment"
openclaw voicecall end --call-id <id>
openclaw voicecall status --call-id <id>
openclaw voicecall tail
openclaw voicecall expose --mode funnel
Tool
Tool name: voice_call
Actions:
initiate_call(message, to?, mode?)continue_call(callId, message)speak_to_user(callId, message)end_call(callId)get_status(callId)
Gateway RPC
voicecall.initiate(to?, message, mode?)voicecall.continue(callId, message)voicecall.speak(callId, message)voicecall.end(callId)voicecall.status(callId)
Notes
- Uses webhook signature verification for Twilio/Telnyx/Plivo.
- Adds replay protection for Twilio and Plivo webhooks (valid duplicate callbacks are ignored safely).
- Twilio speech turns include a per-turn token so stale/replayed callbacks cannot complete a newer turn.
responseModel/responseSystemPromptcontrol AI auto-responses.- Voice-call auto-responses enforce a spoken JSON contract (
{"spoken":"..."}) and filter reasoning/meta output before playback. - While a Twilio stream is active, playback does not fall back to TwiML
<Say>; stream-TTS failures fail the playback request. - Outbound conversation calls suppress barge-in only while the initial greeting is actively speaking, then re-enable normal interruption.
- Twilio stream disconnect auto-end uses a short grace window so quick reconnects do not end the call.
- Realtime provider selection is generic. Configure
streaming.provider/realtime.providerand put provider-owned options underproviders.<id>. - Runtime fallback still accepts the old voice-call keys for now, but migration is a doctor step and the compat shim is scheduled to go away in a future release.