Commit Graph

16323 Commits

Author SHA1 Message Date
Peter Steinberger
6545317a2c refactor(media): split audio helpers and attachment cache 2026-03-02 22:01:24 +00:00
Peter Steinberger
9bde7f4fde perf: cache allowlist and account-id normalization 2026-03-02 21:58:35 +00:00
Peter Steinberger
3beb1b9da9 test: speed up heavy suites with shared fixtures 2026-03-02 21:58:35 +00:00
Peter Steinberger
6358aae024 refactor(infra): share windows path normalization helper 2026-03-02 21:55:12 +00:00
Peter Steinberger
55a2d12f40 refactor: split inbound and reload pipelines into staged modules 2026-03-02 21:55:01 +00:00
Peter Steinberger
99a3db6ba9 fix(zalouser): enforce group mention gating and typing 2026-03-02 21:53:54 +00:00
Peter Steinberger
e5597a8dd4 refactor(media): dedupe tiny-audio test setup and normalize guards formatting 2026-03-02 21:50:54 +00:00
Peter Steinberger
8e259b8310 fix: keep audio transcript echo off-by-default and tiny-audio-safe (#32150) 2026-03-02 21:48:08 +00:00
AytuncYildizli
8f995dfc7a fix(audio): add echoTranscript/echoFormat to Zod config schema 2026-03-02 21:47:09 +00:00
AytuncYildizli
1b61269eec feat(audio): auto-echo transcription to chat before agent processing
When echoTranscript is enabled in tools.media.audio config, the
transcription text is sent back to the originating chat immediately
after successful audio transcription — before the agent processes it.
This lets users verify what was heard from their voice note.

Changes:
- config/types.tools.ts: add echoTranscript (bool) and echoFormat
  (string template) to MediaUnderstandingConfig
- media-understanding/apply.ts: sendTranscriptEcho() helper that
  resolves channel/to from ctx, guards on isDeliverableMessageChannel,
  and calls deliverOutboundPayloads best-effort
- config/schema.help.ts: help text for both new fields
- config/schema.labels.ts: labels for both new fields
- media-understanding/apply.echo-transcript.test.ts: 10 vitest cases
  covering disabled/enabled/custom-format/no-audio/failed-transcription/
  non-deliverable-channel/missing-from/OriginatingTo/delivery-failure

Default echoFormat: '📝 "{transcript}"'

Closes #32102
2026-03-02 21:47:09 +00:00
Shawn
ef89b48785 fix(agents): normalize windows workspace path boundary checks (#30766)
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-03-02 15:47:02 -06:00
Peter Steinberger
a183656f8f fix: apply missed media/runtime follow-ups from merged PRs 2026-03-02 21:45:39 +00:00
Peter Steinberger
f2b37f0aa9 refactor(media): dedupe runner proxy and video test fixtures 2026-03-02 21:44:52 +00:00
benthecarman
faa4ffec03 Add runtime.stt.transcribeAudioFile for plugin STT access
Expose audio transcription through the PluginRuntime so external
plugins (e.g. marmot) can use openclaw's media-understanding provider
framework without importing unexported internal modules.

The new transcribeAudioFile() wraps runCapability({capability: "audio"})
and reads provider/model/apiKey from tools.media.audio in the config,
matching the pattern used by the Discord VC implementation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-02 21:43:01 +00:00
Glucksberg
f7b0378ccb fix(test): update media-understanding tests for whisper skip empty audio
Increase test audio file sizes to meet MIN_AUDIO_FILE_BYTES (1024) threshold
introduced by the skip-empty-audio feature. Fix localPathRoots in skip-tiny-audio
tests so temp files pass path validation. Remove undefined loadApply() call
in apply.test.ts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 21:41:09 +00:00
Glucksberg
5f19112217 fix(test): use strict assertion instead of optional chaining 2026-03-02 21:41:09 +00:00
Glucksberg
8039ef7dba test: add URL-only audio skip test for tiny remote attachments 2026-03-02 21:41:09 +00:00
Glucksberg
43f94e3ab8 fix: strengthen test assertions - assert array lengths before indexing 2026-03-02 21:41:09 +00:00
Glucksberg
8b70ba6ab8 fix(#8127): auto-skip tiny/empty audio files in whisper transcription
Add a minimum file size guard (MIN_AUDIO_FILE_BYTES = 1024) before
sending audio to transcription APIs. Files below this threshold are
almost certainly empty or corrupt and would cause unhelpful errors
from Whisper/Deepgram/Groq providers.

Changes:
- Add 'tooSmall' skip reason to MediaUnderstandingSkipError
- Add MIN_AUDIO_FILE_BYTES constant (1024 bytes) to defaults
- Guard both provider and CLI audio paths in runner.ts
- Add comprehensive tests for tiny, empty, and valid audio files
- Update existing test fixtures to use audio files above threshold
2026-03-02 21:41:09 +00:00
Peter Steinberger
036bd18e2a docs(changelog): fix 2026.3.1 split and dedupe entries 2026-03-02 21:40:57 +00:00
Clawrence
9c9ab891c2 fix(media-understanding): guard malformed attachments arrays 2026-03-02 21:39:57 +00:00
Peter Steinberger
f7c658efb9 fix(core): resolve post-rebase type errors 2026-03-02 21:39:43 +00:00
Marcus Castro
58cde87436 fix: warn when proxy env var is set but agent creation fails 2026-03-02 21:37:36 +00:00
Marcus Castro
8c1e9949b3 fix: pass proxy-aware fetchFn to media understanding providers
runProviderEntry now calls resolveProxyFetchFromEnv() and passes the
result as fetchFn to transcribeAudio/describeVideo, so media provider
API calls respect HTTPS_PROXY/HTTP_PROXY behind corporate proxies.
2026-03-02 21:37:36 +00:00
Marcus Castro
ba3fa44c5b refactor: extract shared proxy-fetch utility from Telegram module
Move makeProxyFetch to src/infra/net/proxy-fetch.ts and add
resolveProxyFetchFromEnv which reads standard proxy env vars
(HTTPS_PROXY, HTTP_PROXY, and lowercase variants) and returns a
proxy-aware fetch via undici's EnvHttpProxyAgent. Telegram re-exports
from the shared location to avoid duplication.
2026-03-02 21:37:36 +00:00
Peter Steinberger
5897eed6e9 refactor(core): dedupe final pairing and sandbox media clones 2026-03-02 21:36:23 +00:00
Peter Steinberger
453a1c179d fix: restore release-check control flow after export guard merge 2026-03-02 21:35:12 +00:00
openjay
76d6514ff5 fix: add "audio" to openai provider capabilities
The openai provider implements transcribeAudio via
transcribeOpenAiCompatibleAudio (Whisper API), but its capabilities
array only declared ["image"]. This caused the media-understanding
runner to skip the openai provider when processing inbound audio
messages, resulting in raw audio files being passed to agents
instead of transcribed text.

Fix: Add "audio" to the capabilities array so the runner correctly
selects the openai provider for audio transcription.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-03-02 21:33:54 +00:00
Peter Steinberger
6a425d189e refactor(channels): dedupe slack telegram and web monitor tests 2026-03-02 21:32:11 +00:00
Peter Steinberger
34daed1d1e refactor(core): dedupe infra, media, pairing, and plugin helpers 2026-03-02 21:32:11 +00:00
Peter Steinberger
91dd89313a refactor(core): dedupe command, hook, and cron fixtures 2026-03-02 21:31:36 +00:00
Peter Steinberger
5f0cbd0edc refactor(gateway): dedupe auth and discord monitor suites 2026-03-02 21:31:36 +00:00
Peter Steinberger
ab8b8dae70 refactor(agents): dedupe model and tool test helpers 2026-03-02 21:31:36 +00:00
Peter Steinberger
067855e623 refactor(browser): dedupe browser and cli command wiring 2026-03-02 21:31:36 +00:00
Glucksberg
58e9ca2fb6 fix(release-check): add 4 missing plugin-sdk exports to align with check script 2026-03-02 21:30:44 +00:00
Glucksberg
61d14e8a8a fix(plugin-sdk): add export verification tests and release guard (#27569) 2026-03-02 21:30:44 +00:00
Peter Steinberger
2438fde6d9 fix: trim repeated slack thread context payloads (#32133) (thanks @sourman) 2026-03-02 21:29:36 +00:00
Ahmed Mansour
7a99027ef6 fix(slack): reduce token bloat by skipping thread context on existing sessions
Thread history and thread starter were being fetched and included on
every message in a Slack thread, causing unnecessary token bloat. The
session transcript already contains the full conversation history, so
re-fetching and re-injecting thread history on each turn is redundant.

Now thread history is only fetched for new thread sessions
(!threadSessionPreviousTimestamp). Existing sessions rely on their
transcript for context.

Fixes #32121
2026-03-02 21:29:36 +00:00
Peter Steinberger
42e402dfba fix: clear pending tool-call state across provider modes (#32120) (thanks @jnMetaCode) 2026-03-02 21:28:02 +00:00
jiangnan
11aa18b525 fix(agents): clear pending tool call state on interruption regardless of provider
When `allowSyntheticToolResults` is false (OpenAI, OpenRouter, and most
third-party providers), the guard never cleared its pending tool call map
when a user message arrived during in-flight tool execution. This left
orphaned tool_use blocks in the transcript with no matching tool_result,
causing the provider API to reject all subsequent requests with 400 errors
and permanently breaking the session.

The fix removes the `allowSyntheticToolResults` gate around the flush
calls. `flushPendingToolResults()` already handles both cases correctly:
it only inserts synthetic results when allowed, and always clears the
pending map. The gate was preventing the map from being cleared at all
for providers that disable synthetic results.

Fixes #32098

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 21:28:02 +00:00
Peter Steinberger
21d6d878ce fix: harden exec allowlist regex literal handling (#32162) (thanks @stakeswky) 2026-03-02 21:26:24 +00:00
User
8da8756f76 fix(exec): escape regex literals in allowlist path matching 2026-03-02 21:26:24 +00:00
George Pickett
a4927ed8ee fix: OpenAI OAuth TLS preflight gating (#32051) (thanks @alexfilatov) 2026-03-02 13:24:49 -08:00
George Pickett
1f24323583 Auth: gate OpenAI OAuth TLS preflight in doctor 2026-03-02 13:24:49 -08:00
Alex Filatov
dc8a56c857 Fix TLS cert preflight classification false positive 2026-03-02 13:24:49 -08:00
Alex Filatov
f181b7dbe6 Add OpenAI OAuth TLS preflight and doctor prerequisite check 2026-03-02 13:24:49 -08:00
scoootscooob
0f1388fa15 fix(gateway): hot-reload channelHealthCheckMinutes without full restart
The health monitor was created once at startup and never touched by
applyHotReload(), so changing channelHealthCheckMinutes only took
effect after a full gateway restart.

Wire up a "restart-health-monitor" reload action so hot-reload can
stop the old monitor and (re)create one with the updated interval —
or disable it entirely when set to 0.

Closes #32105

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 21:23:20 +00:00
Peter Steinberger
b782ecb7eb refactor: harden plugin install flow and main DM route pinning 2026-03-02 21:22:38 +00:00
Peter Steinberger
af637deed1 fix: propagate whatsapp inbound fromMe context (#32167) (thanks @scoootscooob) 2026-03-02 21:20:21 +00:00
scoootscooob
73e6dc361e fix(whatsapp): propagate fromMe through inbound message pipeline
The `fromMe` flag from Baileys' WAMessage.key was only used for
access-control filtering and then discarded.  This meant agents
could not distinguish owner-sent messages from contact messages
in DM conversations (everything appeared as from the contact).

Add `fromMe` to `WebInboundMessage`, store it during message
construction, and thread it through `buildInboundLine` →
`formatInboundEnvelope` so DM transcripts prefix owner messages
with `(self):`.

Closes #32061

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 21:20:21 +00:00