# @openclaw/voice-call

Official Voice Call plugin for **OpenClaw**.

Providers:

- **Twilio** (Programmable Voice + Media Streams)
- **Telnyx** (Call Control v2)
- **Plivo** (Voice API + XML transfer + GetInput speech)
- **Mock** (dev/no network)

Docs: `https://docs.openclaw.ai/plugins/voice-call`
Plugin system: `https://docs.openclaw.ai/plugin`

## Install (local dev)

### Option A: install via OpenClaw (recommended)

```bash
openclaw plugins install @openclaw/voice-call
```

Restart the Gateway afterwards.

### Option B: copy into your global extensions folder (dev)

```bash
mkdir -p ~/.openclaw/extensions
cp -R extensions/voice-call ~/.openclaw/extensions/voice-call
cd ~/.openclaw/extensions/voice-call && pnpm install
```

## Config

Put under `plugins.entries.voice-call.config`:

```json5
{
  provider: "twilio", // or "telnyx" | "plivo" | "mock"
  fromNumber: "+15550001234",
  toNumber: "+15550005678",

  twilio: {
    accountSid: "ACxxxxxxxx",
    authToken: "your_token",
  },

  telnyx: {
    apiKey: "KEYxxxx",
    connectionId: "CONNxxxx",
    // Telnyx webhook public key from the Telnyx Mission Control Portal
    // (Base64 string; can also be set via TELNYX_PUBLIC_KEY).
    publicKey: "...",
  },

  plivo: {
    authId: "MAxxxxxxxxxxxxxxxxxxxx",
    authToken: "your_token",
  },

  // Webhook server
  serve: {
    port: 3334,
    path: "/voice/webhook",
  },

  // Public exposure (pick one):
  // publicUrl: "https://example.ngrok.app/voice/webhook",
  // tunnel: { provider: "ngrok" },
  // tailscale: { mode: "funnel", path: "/voice/webhook" }

  outbound: {
    defaultMode: "notify", // or "conversation"
  },

  streaming: {
    enabled: true,
    streamPath: "/voice/stream",
    preStartTimeoutMs: 5000,
    maxPendingConnections: 32,
    maxPendingConnectionsPerIp: 4,
    maxConnections: 128,
  },
}
```

Notes:

- Twilio/Telnyx/Plivo require a **publicly reachable** webhook URL.
- `mock` is a local dev provider (no network calls).
- Telnyx requires `telnyx.publicKey` (or `TELNYX_PUBLIC_KEY`) unless `skipSignatureVerification` is true.
- `tunnel.allowNgrokFreeTierLoopbackBypass: true` allows Twilio webhooks with invalid signatures **only** when `tunnel.provider="ngrok"` and `serve.bind` is loopback (ngrok local agent). Use for local dev only.

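
The notes above can be turned into a quick pre-flight check. The following is a minimal sketch, not the plugin's own validation: `VoiceCallConfigSketch` and `checkVoiceCallConfig` are hypothetical names, only fields mentioned in this README are modeled, and the top-level placement of `skipSignatureVerification` is an assumption. It is also stricter than the notes, since it warns whenever no exposure option is set, even though a host that is already publicly reachable may not need one.

```typescript
// Hypothetical shape: only fields named in this README are modeled.
type Provider = "twilio" | "telnyx" | "plivo" | "mock";

interface VoiceCallConfigSketch {
  provider: Provider;
  publicUrl?: string;
  tunnel?: { provider: string };
  tailscale?: { mode: string; path?: string };
  telnyx?: { publicKey?: string };
  // Assumption: this flag sits at the top level of the plugin config.
  skipSignatureVerification?: boolean;
}

// Returns a list of warnings; an empty list means these checks passed.
function checkVoiceCallConfig(
  cfg: VoiceCallConfigSketch,
  env: Record<string, string | undefined> = {},
): string[] {
  const warnings: string[] = [];
  // `mock` makes no network calls, so nothing below applies to it.
  if (cfg.provider === "mock") return warnings;

  // Count how many public exposure options are configured.
  const exposureCount = [cfg.publicUrl, cfg.tunnel, cfg.tailscale].filter(
    (option) => option !== undefined,
  ).length;
  if (exposureCount === 0) {
    warnings.push(`${cfg.provider} has no publicUrl, tunnel, or tailscale set`);
  }

  // Telnyx needs a public key unless signature verification is skipped.
  if (cfg.provider === "telnyx" && cfg.skipSignatureVerification !== true) {
    const key = cfg.telnyx?.publicKey || env["TELNYX_PUBLIC_KEY"];
    if (!key) {
      warnings.push("telnyx needs telnyx.publicKey or TELNYX_PUBLIC_KEY");
    }
  }
  return warnings;
}
```

Calling `checkVoiceCallConfig({ provider: "telnyx" })` yields two warnings under these assumptions: one for the missing exposure option and one for the missing public key.
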
Streaming security defaults:

- `streaming.preStartTimeoutMs` closes sockets that never send a valid `start` frame.
- `streaming.maxPendingConnections` caps total unauthenticated pre-start sockets.
- `streaming.maxPendingConnectionsPerIp` caps unauthenticated pre-start sockets per source IP.
- `streaming.maxConnections` caps total open media stream sockets (pending + active).

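
To illustrate how the three caps interact, here is a minimal admission-control sketch. It is an illustration under assumptions, not the plugin's implementation: `PendingSocketGate` and its method names are hypothetical, and the timeout driven by `preStartTimeoutMs` is left to the caller, who would call `dropPending` when it fires.

```typescript
interface StreamLimits {
  maxPendingConnections: number;
  maxPendingConnectionsPerIp: number;
  maxConnections: number;
}

// Hypothetical helper that tracks pending (pre-start) and active sockets.
class PendingSocketGate {
  private readonly limits: StreamLimits;
  private readonly pendingByIp = new Map<string, number>();
  private pendingTotal = 0;
  private activeTotal = 0;

  constructor(limits: StreamLimits) {
    this.limits = limits;
  }

  // Called on connect, before any `start` frame has been seen.
  tryAdmit(ip: string): boolean {
    const perIp = this.pendingByIp.get(ip) ?? 0;
    if (this.pendingTotal + this.activeTotal >= this.limits.maxConnections) return false;
    if (this.pendingTotal >= this.limits.maxPendingConnections) return false;
    if (perIp >= this.limits.maxPendingConnectionsPerIp) return false;
    this.pendingByIp.set(ip, perIp + 1);
    this.pendingTotal += 1;
    return true;
  }

  // Called when a pending socket sends a valid `start` frame.
  promote(ip: string): void {
    if (this.releasePending(ip)) this.activeTotal += 1;
  }

  // Called when a pending socket times out or closes before `start`.
  dropPending(ip: string): void {
    this.releasePending(ip);
  }

  // Called when an active media stream socket closes.
  closeActive(): void {
    if (this.activeTotal > 0) this.activeTotal -= 1;
  }

  // Returns true if a pending slot for this IP was actually released.
  private releasePending(ip: string): boolean {
    const perIp = this.pendingByIp.get(ip) ?? 0;
    if (perIp === 0) return false;
    if (perIp === 1) this.pendingByIp.delete(ip);
    else this.pendingByIp.set(ip, perIp - 1);
    this.pendingTotal -= 1;
    return true;
  }
}
```

With the example values from the config above (32, 4, 128), a fifth pre-start socket from the same source IP would be refused in this sketch while sockets from other addresses are still admitted.
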
## Stale call reaper

Use `staleCallReaperSeconds` to end calls that never receive a terminal webhook
(for example, notify-mode calls that never complete). The default is `0`
(disabled).

Recommended ranges:

- **Production:** `120`–`300` seconds for notify-style flows.
- Keep this value **higher than `maxDurationSeconds`** so normal calls can
  finish. A good starting point is `maxDurationSeconds + 30–60` seconds.

Example:

```json5
{
  staleCallReaperSeconds: 360,
}
```

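
The sizing rule can be written out as a small calculation. This is a sketch under stated assumptions: `suggestReaperRange`, `findStaleCalls`, and `TrackedCall` are hypothetical names, and measuring a call's age from its start time is an assumption about how a reaper might work, not a description of the plugin's internals.

```typescript
// Suggested range from the guidance above: maxDurationSeconds + 30 to + 60.
function suggestReaperRange(maxDurationSeconds: number): { min: number; max: number } {
  return { min: maxDurationSeconds + 30, max: maxDurationSeconds + 60 };
}

// Hypothetical record of a call that is being tracked.
interface TrackedCall {
  callId: string;
  startedAtMs: number;
  ended: boolean; // true once a terminal webhook has arrived
}

// Returns the IDs of calls that are still open past the reaper threshold.
function findStaleCalls(
  calls: TrackedCall[],
  nowMs: number,
  staleCallReaperSeconds: number,
): string[] {
  // 0 (the default) means the reaper is disabled.
  if (staleCallReaperSeconds <= 0) return [];
  const cutoffMs = staleCallReaperSeconds * 1000;
  return calls
    .filter((call) => !call.ended && nowMs - call.startedAtMs >= cutoffMs)
    .map((call) => call.callId);
}
```

For a hypothetical `maxDurationSeconds` of `300`, `suggestReaperRange(300)` gives `330` to `360`, which is consistent with the `360` used in the example above.
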
## TTS for calls

Voice Call uses the core `messages.tts` configuration (OpenAI or ElevenLabs) for
streaming speech on calls. You can override it under the plugin config with the
same shape; overrides deep-merge with `messages.tts`.

```json5
{
  tts: {
    provider: "openai",
    openai: {
      voice: "alloy",
    },
  },
}
```

Notes:

- Edge TTS is ignored for voice calls (telephony audio needs PCM; Edge output is unreliable).
- Core TTS is used when Twilio media streaming is enabled; otherwise calls fall back to provider native voices.

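
To make "deep-merge" concrete, the sketch below merges a plugin-level `tts` override into a core `messages.tts` object. It is a generic deep merge written for illustration, not the plugin's own merge routine; `deepMerge` is a hypothetical helper, and the `model` field and its value are made-up placeholders used only to show that untouched keys survive.

```typescript
type Plain = { [key: string]: unknown };

function isPlainObject(value: unknown): value is Plain {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Recursively merges `override` into a copy of `base`.
// Nested objects are merged key by key; every other value is replaced.
function deepMerge(base: Plain, override: Plain): Plain {
  const out: Plain = { ...base };
  for (const [key, value] of Object.entries(override)) {
    const existing = out[key];
    out[key] =
      isPlainObject(existing) && isPlainObject(value)
        ? deepMerge(existing, value)
        : value;
  }
  return out;
}

// Hypothetical core `messages.tts` value (placeholder fields).
const coreTts: Plain = {
  provider: "elevenlabs",
  openai: { model: "example-model", voice: "nova" },
};

// Plugin-level override, matching the example above.
const pluginTts: Plain = {
  provider: "openai",
  openai: { voice: "alloy" },
};

const effectiveTts = deepMerge(coreTts, pluginTts);
```

In this sketch `effectiveTts.provider` becomes `"openai"`, and the nested `openai` object takes `voice` from the override while keeping the placeholder `model` from the core value.
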
## CLI

```bash
openclaw voicecall call --to "+15555550123" --message "Hello from OpenClaw"
openclaw voicecall continue --call-id <id> --message "Any questions?"
openclaw voicecall speak --call-id <id> --message "One moment"
openclaw voicecall end --call-id <id>
openclaw voicecall status --call-id <id>
openclaw voicecall tail
openclaw voicecall expose --mode funnel
```

## Tool

Tool name: `voice_call`

Actions:

- `initiate_call` (message, to?, mode?)
- `continue_call` (callId, message)
- `speak_to_user` (callId, message)
- `end_call` (callId)
- `get_status` (callId)

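
The parameter lists above can be read as "required unless marked with `?`". As a rough illustration, the sketch below encodes that reading and reports which required parameters are missing from a call. `REQUIRED_PARAMS` and `missingParams` are hypothetical names, and the real tool schema may differ.

```typescript
// Required parameters per action, taken from the list above
// (parameters marked with `?` are treated as optional).
const REQUIRED_PARAMS = new Map<string, readonly string[]>([
  ["initiate_call", ["message"]],
  ["continue_call", ["callId", "message"]],
  ["speak_to_user", ["callId", "message"]],
  ["end_call", ["callId"]],
  ["get_status", ["callId"]],
]);

// Returns the names of required parameters that are absent or empty.
// Throws if the action is not one of the five listed actions.
function missingParams(action: string, params: Record<string, unknown>): string[] {
  const required = REQUIRED_PARAMS.get(action);
  if (required === undefined) {
    throw new Error(`unknown voice_call action: ${action}`);
  }
  return required.filter(
    (name) => params[name] === undefined || params[name] === "",
  );
}
```

For example, `missingParams("continue_call", { message: "Any questions?" })` reports `callId` as missing, while `missingParams("initiate_call", { message: "Hello" })` reports nothing because `to` and `mode` are optional.
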
## Gateway RPC

- `voicecall.initiate` (to?, message, mode?)
- `voicecall.continue` (callId, message)
- `voicecall.speak` (callId, message)
- `voicecall.end` (callId)
- `voicecall.status` (callId)

## Notes

- Uses webhook signature verification for Twilio/Telnyx/Plivo.
- Adds replay protection for Twilio and Plivo webhooks (valid duplicate callbacks are ignored safely).
- Twilio speech turns include a per-turn token so stale/replayed callbacks cannot complete a newer turn.
- `responseModel` / `responseSystemPrompt` control AI auto-responses.
- Media streaming requires `ws` and an OpenAI Realtime API key.
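
The per-turn token idea can be sketched in a few lines. This illustrates the general technique only and is not the plugin's code: `TurnTracker` is a hypothetical class, and the counter-based token is used for readability, whereas a real token should be an unpredictable random value.

```typescript
// Hypothetical tracker for one call's speech turns.
class TurnTracker {
  private currentToken: string | null = null;
  private counter = 0;

  // Starts a new turn and returns the token to attach to its callback.
  // Illustration only: a real token should be random, not a counter.
  beginTurn(): string {
    this.counter += 1;
    this.currentToken = `turn-${this.counter}`;
    return this.currentToken;
  }

  // Completes the turn only if the callback carries the current token.
  completeTurn(token: string): boolean {
    if (this.currentToken === null || token !== this.currentToken) {
      return false; // stale or replayed callback
    }
    this.currentToken = null; // a replay of this same callback is now rejected
    return true;
  }
}
```

If a callback for an earlier turn arrives after `beginTurn()` has been called again, `completeTurn` returns `false` for it, so only the callback that carries the newest token can complete the turn.
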