mirror of
https://github.com/openclaw/openclaw.git
synced 2026-05-06 14:10:51 +00:00
refactor(stt): share transcription helpers
This commit is contained in:
@@ -279,7 +279,7 @@ Current bundled provider examples:
|
||||
| `plugin-sdk/provider-model-shared` | Shared provider model/replay helpers | `ProviderReplayFamily`, `buildProviderReplayFamilyHooks`, `normalizeModelCompat`, shared replay-policy builders, provider-endpoint helpers, and model-id normalization helpers |
|
||||
| `plugin-sdk/provider-catalog-shared` | Shared provider catalog helpers | `findCatalogTemplate`, `buildSingleProviderApiKeyCatalog`, `supportsNativeStreamingUsageCompat`, `applyProviderNativeStreamingUsageCompat` |
|
||||
| `plugin-sdk/provider-onboard` | Provider onboarding patches | Onboarding config helpers |
|
||||
| `plugin-sdk/provider-http` | Provider HTTP helpers | Generic provider HTTP/endpoint capability helpers |
|
||||
| `plugin-sdk/provider-http` | Provider HTTP helpers | Generic provider HTTP/endpoint capability helpers, including audio transcription multipart form helpers |
|
||||
| `plugin-sdk/provider-web-fetch` | Provider web-fetch helpers | Web-fetch provider registration/cache helpers |
|
||||
| `plugin-sdk/provider-web-search-config-contract` | Provider web-search config helpers | Narrow web-search config/credential helpers for providers that do not need plugin-enable wiring |
|
||||
| `plugin-sdk/provider-web-search-contract` | Provider web-search contract helpers | Narrow web-search config/credential contract helpers such as `createWebSearchProviderContractFields`, `enablePluginInConfig`, `resolveProviderWebSearchPluginConfig`, and scoped credential setters/getters |
|
||||
|
||||
@@ -137,7 +137,7 @@ explicitly promotes one as public.
|
||||
| `plugin-sdk/provider-auth` | `createProviderApiKeyAuthMethod`, `ensureApiKeyFromOptionEnvOrPrompt`, `upsertAuthProfile`, `upsertApiKeyProfile`, `writeOAuthCredentials` |
|
||||
| `plugin-sdk/provider-model-shared` | `ProviderReplayFamily`, `buildProviderReplayFamilyHooks`, `normalizeModelCompat`, shared replay-policy builders, provider-endpoint helpers, and model-id normalization helpers such as `normalizeNativeXaiModelId` |
|
||||
| `plugin-sdk/provider-catalog-shared` | `findCatalogTemplate`, `buildSingleProviderApiKeyCatalog`, `supportsNativeStreamingUsageCompat`, `applyProviderNativeStreamingUsageCompat` |
|
||||
| `plugin-sdk/provider-http` | Generic provider HTTP/endpoint capability helpers |
|
||||
| `plugin-sdk/provider-http` | Generic provider HTTP/endpoint capability helpers, including audio transcription multipart form helpers |
|
||||
| `plugin-sdk/provider-web-fetch-contract` | Narrow web-fetch config/selection contract helpers such as `enablePluginInConfig` and `WebFetchProviderPlugin` |
|
||||
| `plugin-sdk/provider-web-fetch` | Web-fetch provider registration/cache helpers |
|
||||
| `plugin-sdk/provider-web-search-config-contract` | Narrow web-search config/credential helpers for providers that do not need plugin-enable wiring |
|
||||
|
||||
@@ -716,6 +716,17 @@ API key auth, and dynamic model resolution.
|
||||
as `maxInputImages`, `maxInputVideos`, and `maxDurationSeconds` are not
|
||||
enough to advertise transform-mode support or disabled modes cleanly.
|
||||
|
||||
Prefer the shared WebSocket helper for streaming STT providers. It keeps
|
||||
proxy capture, reconnect backoff, close flushing, ready handshakes, audio
|
||||
queueing, and close-event diagnostics consistent across providers while
|
||||
leaving provider code responsible for only the upstream event mapping.
|
||||
|
||||
Batch STT providers that POST multipart audio should use
|
||||
`buildAudioTranscriptionFormData(...)` from
|
||||
`openclaw/plugin-sdk/provider-http` together with the provider HTTP request
|
||||
helpers. The form helper normalizes upload filenames, including AAC uploads
|
||||
that need an M4A-style filename for compatible transcription APIs.
|
||||
|
||||
Music-generation providers should follow the same pattern:
|
||||
`generate` for prompt-only generation and `edit` for reference-image-based
|
||||
generation. Flat aggregate fields such as `maxInputImages`,
|
||||
|
||||
@@ -155,7 +155,8 @@ Current runtime behavior:
|
||||
|
||||
- `streaming.provider` is optional. If unset, Voice Call uses the first
|
||||
registered realtime transcription provider.
|
||||
- Bundled realtime transcription providers include OpenAI (`openai`) and xAI
|
||||
- Bundled realtime transcription providers include Deepgram (`deepgram`),
|
||||
ElevenLabs (`elevenlabs`), Mistral (`mistral`), OpenAI (`openai`), and xAI
|
||||
(`xai`), registered by their provider plugins.
|
||||
- Provider-owned raw config lives under `streaming.providers.<providerId>`.
|
||||
- If `streaming.provider` points at an unregistered provider, or no realtime
|
||||
|
||||
Reference in New Issue
Block a user