openclaw/src/auto-reply/types.ts
Sean c6f2db1506 fix: prevent gateway attachment offload regressions (#55513) (thanks @Syysean)
* feat(gateway): implement claim check pattern to prevent OOM on large attachments

* fix: sanitize mediaId, refine trimEnd, remove warn log, add threshold and absolute path

* fix: enforce maxBytes before decoding and use dynamic path from saveMediaBuffer

* fix: enforce absolute maxBytes limit before Buffer allocation and preserve file extensions

* fix: align saveMediaBuffer arguments and satisfy oxfmt linter

* chore: strictly enforce linting rules (curly braces, unused vars, and error typing)

* fix: restrict offload to mainstream mimes to avoid extension-loss bug in store.ts for BMP/TIFF

* fix: restrict offload to mainstream mimes to bypass store.ts extension-loss bug

* chore: document bmp/tiff exclusion from offload whitelist in MIME_TO_EXT

* feat: implement agent-side resolver for opaque media URIs and finalize contract

* fix: support unicode media URIs and allow consecutive dots in safe IDs based on Codex review

* fix(gateway): enforce strict fail-fast for oversized media to prevent OOM bypass

* refactor(gateway): harden media offload with performance and security optimizations

This update hardens the Claim Check pattern with additional guards:

- Performance: Implemented sampled Base64 validation for large payloads (>4KB) to prevent event loop blocking.
- Security: Added null-byte (\u0000) detection and reinforced path traversal guards.
- I18n: Updated media-uri regex to a blacklist-based character class for Unicode/Chinese filename support, with oxlint bypass for intentional control regex.
- Robustness: Enhanced error diagnostics with JSON-serialized IDs.

* fix: add HEIC/HEIF to offload allowlist and pass maxBytes to saveMediaBuffer

* fix(gateway): clean up offloaded media files on attachment parse failure

Address Codex review feedback: track saved media IDs and implement best-effort cleanup via deleteMediaBuffer if subsequent attachments fail validation, preventing orphaned files on disk.
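
The cleanup described above can be sketched roughly as follows. This is a minimal illustration, not the gateway's code: the `MediaStore` interface, `offloadAll`, and the `validate` callback are hypothetical names, and the sketch is synchronous for brevity, whereas the real `saveMediaBuffer`/`deleteMediaBuffer` are not shown here and are probably asynchronous.

```typescript
// Hypothetical storage interface standing in for saveMediaBuffer/deleteMediaBuffer.
type MediaStore = {
  save(data: string): string; // returns the id of the stored media
  remove(id: string): void;
};

// Saves every attachment; if any attachment fails validation, removes the
// files saved so far (best effort) and rethrows the original error.
function offloadAll(
  attachments: readonly string[],
  store: MediaStore,
  validate: (data: string) => void,
): string[] {
  const savedIds: string[] = [];
  try {
    for (const data of attachments) {
      validate(data); // may throw on a bad attachment
      savedIds.push(store.save(data));
    }
    return savedIds;
  } catch (err) {
    for (const id of savedIds) {
      try {
        store.remove(id);
      } catch {
        // Best effort: a failed delete must not mask the original error.
      }
    }
    throw err;
  }
}
```

Tracking ids in a local array keeps the cleanup scoped to the current request, so files saved by other requests are never touched.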

* fix(gateway): enforce full base64 validation to prevent whitespace padding bypass

Address Codex review feedback: remove early return in isValidBase64 so padded payloads cannot bypass offload thresholds and reintroduce memory pressure. Updated related comments.
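
As an illustration of why full validation matters for the size limit, here is a minimal sketch of checking a base64 payload's decoded size before any buffer is allocated. The helper names (`isStrictBase64`, `decodedByteLength`, `checkAttachmentSize`) are hypothetical and are not the gateway's actual functions; the sketch assumes canonical, padded base64 with no embedded whitespace.

```typescript
// Hypothetical helpers; they are not taken from the gateway source.
const STRICT_BASE64 = /^[A-Za-z0-9+\/]*={0,2}$/;

// True only if the entire string is padded base64 with no other characters.
// The whole payload is checked, so whitespace cannot hide anywhere in it.
function isStrictBase64(input: string): boolean {
  return input.length % 4 === 0 && STRICT_BASE64.test(input);
}

// Computes the decoded byte length from the encoded length alone,
// without decoding or allocating a buffer.
function decodedByteLength(input: string): number {
  let padding = 0;
  if (input.endsWith("==")) {
    padding = 2;
  } else if (input.endsWith("=")) {
    padding = 1;
  }
  return (input.length / 4) * 3 - padding;
}

// Rejects invalid or oversized payloads before any decoding happens.
function checkAttachmentSize(input: string, maxBytes: number): number {
  if (!isStrictBase64(input)) {
    throw new Error("invalid base64 payload");
  }
  const size = decodedByteLength(input);
  if (size > maxBytes) {
    throw new Error(`attachment exceeds ${maxBytes} bytes`);
  }
  return size;
}
```

Because the length formula is only exact for strictly valid input, validating the full string first is what makes the size check trustworthy.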

* fix(gateway): preserve offloaded media metadata and fix validation error mapping

Address Codex review feedback:
- Add `offloadedRefs` to `ParsedMessageWithImages` to expose structured metadata for offloaded attachments, preventing transcript media loss.
- Move `verifyDecodedSize` outside the storage try-catch block to correctly surface client base64 validation failures as 4xx errors instead of 5xx `MediaOffloadError`.
- Add JSDoc TODOs indicating that upstream callers (chat.ts, agent.ts, server-node-events.ts) must explicitly pass the `supportsImages` flag.

* fix(agents): explicitly allow media store dir when loading offloaded images

Address Codex review feedback: Pass getMediaDir() to loadWebMedia's localRoots for media-uri refs to prevent legacy path resolution mismatches from silently dropping large attachments.

* fix(gateway): resolve attachment offload regressions and error mapping

Address Codex review feedback:
- Pass `supportsImages` dynamically in `chat.ts` and `agent.ts` based on model catalog, and explicitly in `server-node-events.ts`.
- Persist `offloadedRefs` into the transcript pipeline in `chat.ts` to preserve media metadata for >2MB attachments.
- Correctly map `MediaOffloadError` to 5xx (UNAVAILABLE) to differentiate server storage faults from 4xx client validation errors.

* fix(gateway): dynamically compute supportsImages for overrides and node events

Address follow-up Codex review feedback:

- Use effective model (including overrides) to compute `supportsImages` in `agent.ts`.

- Move session load earlier in `server-node-events.ts` to dynamically compute `supportsImages` rather than hardcoding true.

* fix(gateway): resolve capability edge cases reported by codex

Address final Codex edge cases:
- Refactor `agent.ts` to compute `supportsImages` even when no session key is present, ensuring text-only override requests without sessions safely drop attachments.
- Update catalog lookups in `chat.ts`, `agent.ts`, and `server-node-events.ts` to strictly match both `id` and `provider` to prevent cross-provider model collisions.
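
A minimal sketch of a capability lookup that matches on both `id` and `provider` might look like this. The `CatalogEntry` shape and the `modelSupportsImages` helper are assumptions made for illustration; the real model catalog is not shown here. Unknown models are treated as text-only, which fails closed.

```typescript
// Hypothetical catalog shape; the real model catalog is not shown in this file.
type CatalogEntry = {
  id: string;
  provider: string;
  input: ReadonlyArray<"text" | "image">;
};

// Returns true only when the exact provider/id pair is known to accept images.
function modelSupportsImages(
  catalog: readonly CatalogEntry[],
  provider: string,
  id: string,
): boolean {
  // Match both fields so two providers exposing the same model id cannot collide.
  const entry = catalog.find((m) => m.id === id && m.provider === provider);
  if (!entry) {
    // Fail closed: an unknown model is treated as text-only.
    return false;
  }
  return entry.input.some((kind) => kind === "image");
}
```

Matching on `id` alone would let a text-only model inherit image support from an unrelated provider's model with the same name.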

* fix(agents): restore before_install hook for skill installs

Restore the plugin scanner security hook that was accidentally dropped during merge conflict resolution.

* fix: resolve attachment pathing, defer parsing after auth gates, and clean up node-event mocks

* fix: resolve syntax errors in test-env, fix missing helper imports, and optimize parsing sequence in node events

* fix(gateway): re-enforce message length limit after attachment parsing

Adds a secondary check to ensure the 20,000-char cap remains effective even after media markers are appended during the offload flow.

* fix(gateway): prevent dropping valid small images and clean up orphaned media on size rejection

* fix(gateway): share attachment image capability checks

* fix(gateway): preserve mixed attachment order

* fix: fail closed on unknown image capability (#55513) (thanks @Syysean)

* fix: classify offloaded attachment refs explicitly (#55513) (thanks @Syysean)

---------

Co-authored-by: Ayaan Zaidi <hi@obviy.us>
2026-03-30 20:54:40 +05:30


import type { ImageContent } from "@mariozechner/pi-ai";
import type { InteractiveReply } from "../interactive/payload.js";
import type { PromptImageOrderEntry } from "../media/prompt-image-order.js";
import type { TypingController } from "./reply/typing.js";

export type BlockReplyContext = {
  abortSignal?: AbortSignal;
  timeoutMs?: number;
};

/** Context passed to onModelSelected callback with actual model used. */
export type ModelSelectedContext = {
  provider: string;
  model: string;
  thinkLevel: string | undefined;
};

export type TypingPolicy =
  | "auto"
  | "user_message"
  | "system_event"
  | "internal_webchat"
  | "heartbeat";

export type GetReplyOptions = {
  /** Override run id for agent events (defaults to random UUID). */
  runId?: string;
  /** Abort signal for the underlying agent run. */
  abortSignal?: AbortSignal;
  /** Optional inbound images (used for webchat attachments). */
  images?: ImageContent[];
  /** Original inline/offloaded attachment order for inbound images. */
  imageOrder?: PromptImageOrderEntry[];
  /** Notifies when an agent run actually starts (useful for webchat command handling). */
  onAgentRunStart?: (runId: string) => void;
  onReplyStart?: () => Promise<void> | void;
  /** Called when the typing controller cleans up (e.g., run ended with NO_REPLY). */
  onTypingCleanup?: () => void;
  onTypingController?: (typing: TypingController) => void;
  isHeartbeat?: boolean;
  /** Policy-level typing control for run classes (user/system/internal/heartbeat). */
  typingPolicy?: TypingPolicy;
  /** Force-disable typing indicators for this run (system/internal/cross-channel routes). */
  suppressTyping?: boolean;
  /** Resolved heartbeat model override (provider/model string from merged per-agent config). */
  heartbeatModelOverride?: string;
  /** Controls bootstrap workspace context injection (default: full). */
  bootstrapContextMode?: "full" | "lightweight";
  /** If true, suppress tool error warning payloads for this run. */
  suppressToolErrorWarnings?: boolean;
  onPartialReply?: (payload: ReplyPayload) => Promise<void> | void;
  onReasoningStream?: (payload: ReplyPayload) => Promise<void> | void;
  /** Called when a thinking/reasoning block ends. */
  onReasoningEnd?: () => Promise<void> | void;
  /** Called when a new assistant message starts (e.g., after tool call or thinking block). */
  onAssistantMessageStart?: () => Promise<void> | void;
  onBlockReply?: (payload: ReplyPayload, context?: BlockReplyContext) => Promise<void> | void;
  onToolResult?: (payload: ReplyPayload) => Promise<void> | void;
  /** Called when a tool phase starts/updates, before summary payloads are emitted. */
  onToolStart?: (payload: { name?: string; phase?: string }) => Promise<void> | void;
  /** Called when context auto-compaction starts (allows UX feedback during the pause). */
  onCompactionStart?: () => Promise<void> | void;
  /** Called when context auto-compaction completes. */
  onCompactionEnd?: () => Promise<void> | void;
  /** Called when the actual model is selected (including after fallback).
   * Use this to get model/provider/thinkLevel for responsePrefix template interpolation. */
  onModelSelected?: (ctx: ModelSelectedContext) => void;
  disableBlockStreaming?: boolean;
  /** Timeout for block reply delivery (ms). */
  blockReplyTimeoutMs?: number;
  /** If provided, only load these skills for this session (empty = no skills). */
  skillFilter?: string[];
  /** Mutable ref to track if a reply was sent (for Slack "first" threading mode). */
  hasRepliedRef?: { value: boolean };
  /** Override agent timeout in seconds (0 = no timeout). Threads through to resolveAgentTimeoutMs. */
  timeoutOverrideSeconds?: number;
};

export type ReplyPayload = {
  text?: string;
  mediaUrl?: string;
  mediaUrls?: string[];
  interactive?: InteractiveReply;
  btw?: {
    question: string;
  };
  replyToId?: string;
  replyToTag?: boolean;
  /** True when [[reply_to_current]] was present but not yet mapped to a message id. */
  replyToCurrent?: boolean;
  /** Send audio as voice message (bubble) instead of audio file. Defaults to false. */
  audioAsVoice?: boolean;
  isError?: boolean;
  /** Marks this payload as a reasoning/thinking block. Channels that do not
   * have a dedicated reasoning lane (e.g. WhatsApp, web) should suppress it. */
  isReasoning?: boolean;
  /** Marks this payload as a compaction status notice (start/end).
   * Should be excluded from TTS transcript accumulation so compaction
   * status lines are not synthesised into the spoken assistant reply. */
  isCompactionNotice?: boolean;
  /** Channel-specific payload data (per-channel envelope). */
  channelData?: Record<string, unknown>;
};
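
The `isReasoning` and `isCompactionNotice` flags documented above suggest how a consumer might filter payloads before speech synthesis. The following is a minimal sketch under stated assumptions: `SpokenPayload` redeclares a subset of the `ReplyPayload` fields so the example is self-contained, and `collectSpokenText` is a hypothetical helper, not part of this module.

```typescript
// Subset of the ReplyPayload fields, redeclared so the sketch stands alone.
type SpokenPayload = {
  text?: string;
  isReasoning?: boolean;
  isCompactionNotice?: boolean;
};

// Hypothetical helper: joins the text of payloads that should be spoken,
// skipping reasoning blocks and compaction status notices.
function collectSpokenText(payloads: readonly SpokenPayload[]): string {
  const parts: string[] = [];
  for (const payload of payloads) {
    if (payload.isReasoning || payload.isCompactionNotice) {
      continue;
    }
    const text = payload.text?.trim();
    if (text) {
      parts.push(text);
    }
  }
  return parts.join(" ");
}
```

Payloads without text (for example, media-only replies) are skipped as well, since they contribute nothing to a spoken transcript.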