openclaw/docs/providers/deepgram.md at b885aa7cd37340d2babbbae454cf15b85a95126a

mirror of https://github.com/openclaw/openclaw.git synced 2026-05-06 09:20:43 +00:00

Vincent Koc 1042b893f6 docs: drop parenthetical H1s across gateway, channels, providers, concepts, and reference pages

2026-04-24 10:49:23 -07:00

6.5 KiB

Summary: Deepgram transcription for inbound voice notes

Read when:

- You want Deepgram speech-to-text for audio attachments
- You want Deepgram streaming transcription for Voice Call
- You need a quick Deepgram config example

Deepgram

Deepgram is a speech-to-text API. In OpenClaw it is used for inbound audio/voice-note transcription through `tools.media.audio` and for Voice Call streaming STT through `plugins.entries.voice-call.config.streaming`.

For batch transcription, OpenClaw uploads the complete audio file to Deepgram and injects the transcript into the reply pipeline ({{Transcript}} + [Audio] block). For Voice Call streaming, OpenClaw forwards live G.711 u-law frames over Deepgram's WebSocket listen endpoint and emits partial or final transcripts as Deepgram returns them.
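The partial/final distinction above can be sketched as a small message classifier. This is a minimal illustration, not OpenClaw's implementation: it assumes Deepgram's streaming `Results` message shape (`is_final` plus `channel.alternatives[0].transcript`), and `classify_transcript` is a hypothetical helper name.

```python
import json
from typing import Optional, Tuple


def classify_transcript(raw_message: str) -> Optional[Tuple[str, str]]:
    """Return ("partial" | "final", text) for a Results message, else None.

    Hypothetical helper: the field names follow Deepgram's streaming
    response format; how OpenClaw handles these messages is not shown here.
    """
    message = json.loads(raw_message)
    if message.get("type") != "Results":
        return None  # control or metadata messages carry no transcript
    alternatives = message.get("channel", {}).get("alternatives", [])
    if not alternatives:
        return None
    text = alternatives[0].get("transcript", "")
    if not text:
        return None  # skip empty transcripts
    kind = "final" if message.get("is_final") else "partial"
    return kind, text
```

A caller would typically show partial results as they arrive and commit only the final ones.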

| Detail | Value |
| --- | --- |
| Website | deepgram.com |
| Docs | developers.deepgram.com |
| Auth | `DEEPGRAM_API_KEY` |
| Default model | `nova-3` |

Getting started

Add your Deepgram API key to the environment:

```
DEEPGRAM_API_KEY=dg_...
```

Then enable the Deepgram provider for audio transcription:

```json5
{
  tools: {
    media: {
      audio: {
        enabled: true,
        models: [{ provider: "deepgram", model: "nova-3" }],
      },
    },
  },
}
```

Send an audio message through any connected channel. OpenClaw transcribes it via Deepgram and injects the transcript into the reply pipeline.

Configuration options

| Option | Path | Description |
| --- | --- | --- |
| `model` | `tools.media.audio.models[].model` | Deepgram model id (default: `nova-3`) |
| `language` | `tools.media.audio.models[].language` | Language hint (optional) |
| `detect_language` | `tools.media.audio.providerOptions.deepgram.detect_language` | Enable language detection (optional) |
| `punctuate` | `tools.media.audio.providerOptions.deepgram.punctuate` | Enable punctuation (optional) |
| `smart_format` | `tools.media.audio.providerOptions.deepgram.smart_format` | Enable smart formatting (optional) |

With a language hint:

```json5
{
  tools: {
    media: {
      audio: {
        enabled: true,
        models: [{ provider: "deepgram", model: "nova-3", language: "en" }],
      },
    },
  },
}
```

With Deepgram provider options:

```json5
{
  tools: {
    media: {
      audio: {
        enabled: true,
        providerOptions: {
          deepgram: {
            detect_language: true,
            punctuate: true,
            smart_format: true,
          },
        },
        models: [{ provider: "deepgram", model: "nova-3" }],
      },
    },
  },
}
```
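To make the options concrete, the sketch below shows how settings like these could translate into a Deepgram pre-recorded request. It is an illustration under assumptions, not OpenClaw's code: `build_batch_request` is a hypothetical helper, the `audio/ogg` content type is an arbitrary example, and the mapping of config keys onto query parameters of Deepgram's `/v1/listen` endpoint is assumed to be one-to-one. The request is built but not sent.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

DEEPGRAM_LISTEN_URL = "https://api.deepgram.com/v1/listen"


def build_batch_request(
    audio: bytes,
    api_key: str,
    model: str = "nova-3",
    language: Optional[str] = None,
    provider_options: Optional[dict] = None,
    content_type: str = "audio/ogg",  # assumed example format
) -> Request:
    """Build (but do not send) a POST carrying the whole audio file."""
    params = {"model": model}
    if language:
        params["language"] = language
    for name, value in (provider_options or {}).items():
        # Query strings carry booleans as lowercase "true"/"false".
        params[name] = str(value).lower() if isinstance(value, bool) else str(value)
    return Request(
        f"{DEEPGRAM_LISTEN_URL}?{urlencode(params)}",
        data=audio,
        method="POST",
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": content_type,
        },
    )
```

Passing the result to `urllib.request.urlopen` would perform the upload; the transcript would then be read from the JSON response.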

Voice Call streaming STT

The bundled deepgram plugin also registers a realtime transcription provider for the Voice Call plugin.

| Setting | Config path | Default |
| --- | --- | --- |
| API key | `plugins.entries.voice-call.config.streaming.providers.deepgram.apiKey` | Falls back to `DEEPGRAM_API_KEY` |
| Model | `...deepgram.model` | `nova-3` |
| Language | `...deepgram.language` | (unset) |
| Encoding | `...deepgram.encoding` | `mulaw` |
| Sample rate | `...deepgram.sampleRate` | `8000` |
| Endpointing | `...deepgram.endpointingMs` | `800` |
| Interim results | `...deepgram.interimResults` | `true` |

```json5
{
  plugins: {
    entries: {
      "voice-call": {
        config: {
          streaming: {
            enabled: true,
            provider: "deepgram",
            providers: {
              deepgram: {
                apiKey: "${DEEPGRAM_API_KEY}",
                model: "nova-3",
                endpointingMs: 800,
                language: "en-US",
              },
            },
          },
        },
      },
    },
  },
}
```
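As a rough illustration of how the streaming settings could reach Deepgram, the sketch below merges the documented defaults with user overrides and builds a WebSocket URL for the listen endpoint. The mapping from OpenClaw's camelCase settings to Deepgram's query parameter names (`sample_rate`, `endpointing`, `interim_results`) is an assumption, and `build_streaming_url` is a hypothetical helper, not part of the plugin.

```python
from urllib.parse import urlencode

# Defaults taken from the settings table above.
STREAMING_DEFAULTS = {
    "model": "nova-3",
    "encoding": "mulaw",
    "sampleRate": 8000,
    "endpointingMs": 800,
    "interimResults": True,
}


def build_streaming_url(settings: dict) -> str:
    """Merge overrides into the defaults and return a listen URL."""
    merged = {**STREAMING_DEFAULTS, **settings}
    params = {
        "model": merged["model"],
        "encoding": merged["encoding"],
        "sample_rate": merged["sampleRate"],
        "endpointing": merged["endpointingMs"],
        "interim_results": str(merged["interimResults"]).lower(),
    }
    if merged.get("language"):  # language is unset by default
        params["language"] = merged["language"]
    return "wss://api.deepgram.com/v1/listen?" + urlencode(params)


print(build_streaming_url({"language": "en-US"}))
```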

Voice Call receives telephony audio as 8 kHz G.711 u-law. The Deepgram streaming provider defaults to `encoding: "mulaw"` and `sampleRate: 8000`, so Twilio media frames can be forwarded directly.
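The arithmetic behind that format is simple: G.711 u-law stores one byte per sample, so 8 kHz audio is 8000 bytes per second. The sketch below computes how much audio a base64-encoded frame holds. The 160-byte payload is an assumed example size, and `frame_duration_ms` is a hypothetical helper.

```python
import base64

SAMPLE_RATE_HZ = 8000
BYTES_PER_SAMPLE = 1  # G.711 u-law encodes each sample in 8 bits


def frame_duration_ms(payload_b64: str) -> float:
    """Return the playback duration of a base64-encoded u-law frame."""
    audio = base64.b64decode(payload_b64)
    return len(audio) * 1000 / (SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)


# Example: 160 arbitrary bytes stand in for one frame of telephony audio.
example_payload = base64.b64encode(b"\xff" * 160).decode("ascii")
print(frame_duration_ms(example_payload))  # 20.0
```

Because the provider's defaults already match this encoding and sample rate, the decoded bytes need no resampling or transcoding before they are sent on.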

Notes

- Authentication follows the standard provider auth order. `DEEPGRAM_API_KEY` is the simplest path.
- Override endpoints or headers with `tools.media.audio.baseUrl` and `tools.media.audio.headers` when using a proxy.
- Output follows the same audio rules as other providers (size caps, timeouts, transcript injection).

Related

- Audio, image, and video processing pipeline overview.
- Full config reference including media tool settings.
- Common issues and debugging steps.
- Frequently asked questions about OpenClaw setup.
