openclaw/docs/providers/inworld.md at main

mirror of https://github.com/openclaw/openclaw.git synced 2026-05-06 06:10:44 +00:00


Vincent Koc fcb188a41a docs(providers): tighten SenseAudio, Xiaomi, and Inworld pages

SenseAudio (docs/providers/senseaudio.md): removed the duplicate
'# SenseAudio' H1 (Mintlify renders title from frontmatter; an in-body
H1 produces a brittle anchor). Reordered the properties table to lead
with provider id, plugin, and the speechProviders/mediaUnderstanding
contract before the website/docs links, sourced from
extensions/senseaudio/openclaw.plugin.json. Lowercased the H2 to
'Getting started' for consistency.

Xiaomi (docs/providers/xiaomi.md): expanded the 4-row properties table
to include plugin, onboarding flag, direct CLI flag, and contracts
(chat completions plus speechProviders). The TTS default is surfaced
inline so readers see the dual-contract setup in one glance, sourced
from extensions/xiaomi/openclaw.plugin.json.

Inworld (docs/providers/inworld.md): renamed the table header from
'Detail' to 'Property' and added bundled-plugin status and the
speechProviders contract. Surfaced the audio output formats (MP3,
OGG_OPUS, PCM 22050 Hz) as a Property row so readers do not have to
read the Audio outputs accordion to confirm telephony support.
Verified against extensions/inworld/openclaw.plugin.json.

2026-05-05 17:33:59 -07:00


summary: Inworld streaming text-to-speech for OpenClaw replies
read_when:
- You want Inworld speech synthesis for outbound replies
- You need PCM telephony or OGG_OPUS voice-note output from Inworld
title: Inworld

Inworld is a streaming text-to-speech (TTS) provider. In OpenClaw it synthesizes outbound reply audio (MP3 by default, OGG_OPUS for voice notes) and PCM audio for telephony channels such as Voice Call.

OpenClaw posts to Inworld's streaming TTS endpoint, concatenates the returned base64 audio chunks into a single buffer, and hands the result to the standard reply-audio pipeline.
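
The chunk handling described above can be sketched as follows. This is a minimal illustration, not OpenClaw's actual code: the function name `concat_audio_chunks` is hypothetical, and it assumes the base64 payloads have already been pulled out of the streaming response into a list of strings.

```python
import base64


def concat_audio_chunks(b64_chunks):
    """Decode each base64 chunk separately and join the raw bytes."""
    buffer = bytearray()
    for chunk in b64_chunks:
        # Decoding per chunk avoids depending on how a decoder treats
        # '=' padding that would end up in the middle of a joined string.
        buffer.extend(base64.b64decode(chunk))
    return bytes(buffer)


# Stand-in data; real chunks would come from the streaming response.
chunks = [
    base64.b64encode(part).decode("ascii")
    for part in (b"ID3", b"\x00\x01", b"audio")
]
audio = concat_audio_chunks(chunks)
```

The resulting `bytes` object is what a reply-audio pipeline would receive as a single buffer.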

| Property | Value |
| --- | --- |
| Provider id | `inworld` |
| Plugin | bundled, `enabledByDefault: true` |
| Contract | `speechProviders` (TTS only) |
| Auth env var | `INWORLD_API_KEY` (HTTP Basic, Base64 dashboard credential) |
| Base URL | `https://api.inworld.ai` |
| Default voice | `Sarah` |
| Default model | `inworld-tts-1.5-max` |
| Output | MP3 (default), OGG_OPUS (voice notes), PCM 22050 Hz (telephony) |
| Website | inworld.ai |
| Docs | docs.inworld.ai/tts/tts |

Getting started

Copy the credential from your Inworld dashboard (Workspace > API Keys) and set it as an env var. The value is sent verbatim as the HTTP Basic credential, so do not Base64-encode it again or convert it to a bearer token.

```
INWORLD_API_KEY=<base64-credential-from-dashboard>
```
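
To make the "sent verbatim" rule concrete, here is a minimal sketch of how a client might build the auth header from that credential. The helper name `inworld_auth_headers` and the sample credential are hypothetical; only the env var name and the HTTP Basic scheme come from this page.

```python
import os


def inworld_auth_headers(api_key=None):
    """Build the Authorization header from the dashboard credential."""
    credential = api_key or os.environ.get("INWORLD_API_KEY", "")
    if not credential:
        raise ValueError("Set INWORLD_API_KEY or pass api_key")
    # No base64.b64encode() call here: the dashboard value is already
    # Base64, so it goes into the header unchanged.
    return {"Authorization": f"Basic {credential}"}


# Hypothetical placeholder value, not a real credential.
headers = inworld_auth_headers("ZXhhbXBsZS1rZXk6ZXhhbXBsZS1zZWNyZXQ=")
```

Encoding the value a second time, or sending it with a `Bearer` prefix, would produce a header that differs from what this page describes.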

```json5
{
  messages: {
    tts: {
      auto: "always",
      provider: "inworld",
      providers: {
        inworld: {
          voiceId: "Sarah",
          modelId: "inworld-tts-1.5-max",
        },
      },
    },
  },
}
```

Send a reply through any connected channel. OpenClaw synthesizes the audio with Inworld and delivers it as MP3 (or OGG_OPUS when the channel expects a voice note).
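
The format choice can be pictured as a small lookup. The sketch below is illustrative only: the function name, the `"telephony"` target label, and the returned dictionary keys are assumptions, while the three formats and the 22050 Hz PCM rate are the ones listed on this page.

```python
def pick_output_format(target):
    """Map a delivery target to one of the formats listed on this page."""
    if target == "voice-note":
        # Voice notes use OGG_OPUS.
        return {"encoding": "OGG_OPUS"}
    if target == "telephony":
        # Hypothetical label for telephony channels such as Voice Call.
        return {"encoding": "PCM", "sample_rate_hz": 22050}
    # Everything else falls back to the default reply format.
    return {"encoding": "MP3"}


reply_format = pick_output_format("chat")
voice_note_format = pick_output_format("voice-note")
```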

Configuration options

| Option | Path | Description |
| --- | --- | --- |
| `apiKey` | `messages.tts.providers.inworld.apiKey` | Base64 dashboard credential. Falls back to `INWORLD_API_KEY`. |
| `baseUrl` | `messages.tts.providers.inworld.baseUrl` | Override Inworld API base URL (default `https://api.inworld.ai`). |
| `voiceId` | `messages.tts.providers.inworld.voiceId` | Voice identifier (default `Sarah`). |
| `modelId` | `messages.tts.providers.inworld.modelId` | TTS model id (default `inworld-tts-1.5-max`). |
| `temperature` | `messages.tts.providers.inworld.temperature` | Sampling temperature `0..2` (optional). |
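
As a rough picture of how these options combine, the sketch below applies the documented defaults, the `INWORLD_API_KEY` fallback, and the trailing-slash stripping for `baseUrl`. The function `resolve_inworld_config` is hypothetical and not part of OpenClaw. Treating `0..2` as an inclusive range and rejecting out-of-range values are assumptions made for the example.

```python
import os

# Defaults as listed in the table above.
DEFAULTS = {
    "baseUrl": "https://api.inworld.ai",
    "voiceId": "Sarah",
    "modelId": "inworld-tts-1.5-max",
}


def resolve_inworld_config(raw, env=None):
    """Merge user options with defaults (illustrative only)."""
    env = os.environ if env is None else env
    cfg = {**DEFAULTS, **{k: v for k, v in raw.items() if v is not None}}
    # apiKey falls back to the INWORLD_API_KEY environment variable.
    cfg["apiKey"] = raw.get("apiKey") or env.get("INWORLD_API_KEY")
    # Trailing slashes are stripped from the base URL.
    cfg["baseUrl"] = cfg["baseUrl"].rstrip("/")
    temperature = cfg.get("temperature")
    # Assumption: 0..2 is inclusive and invalid values are rejected.
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    return cfg


resolved = resolve_inworld_config(
    {"baseUrl": "https://tts.example.test/"},
    env={"INWORLD_API_KEY": "example-credential"},
)
```

Here `resolved` keeps the default voice and model, picks up the credential from the supplied environment mapping, and carries the base URL without its trailing slash.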

Notes

- Inworld uses HTTP Basic auth with a single Base64-encoded credential string. Copy it verbatim from the Inworld dashboard. The provider sends it in the `Authorization: Basic` header without any further encoding, so do not Base64-encode it yourself and do not pass a bearer-style token. See [TTS auth notes](/tools/tts#inworld-primary) for the same callout.
- Supported model ids: `inworld-tts-1.5-max` (default), `inworld-tts-1.5-mini`, `inworld-tts-1-max`, `inworld-tts-1`.
- Replies use MP3 by default. When the channel target is `voice-note`, OpenClaw asks Inworld for `OGG_OPUS` so the audio plays as a native voice bubble. Telephony synthesis uses raw `PCM` at 22050 Hz to feed the telephony bridge.
- Override the API host with `messages.tts.providers.inworld.baseUrl`. Trailing slashes are stripped before requests are sent.

Related

- TTS overview, providers, and `messages.tts` config.
- Full config reference including `messages.tts` settings.
- All bundled OpenClaw providers.
- Common issues and debugging steps.
