* fix(browser): discover CDP websocket from bare ws:// URL before attach
When browser.cdpUrl is set to a bare ws://host:port (no /devtools/ path), ensureBrowserAvailable would call isChromeReachable -> canOpenWebSocket against the URL verbatim. Chrome only accepts WebSocket upgrades at the specific path returned by /json/version, so the handshake failed immediately with HTTP 400. With attachOnly: true, that surfaced as:
Browser attachOnly is enabled and profile "openclaw" is not running.
even though the CDP endpoint was reachable and the profile was healthy. Reproduced by the new tests in chrome.test.ts and cdp.test.ts (#68027).
Fix: introduce isDirectCdpWebSocketEndpoint(url) — true only when a ws/wss URL has a /devtools/<kind>/<id> handshake path. Route any other ws/wss cdpUrl (including the bare ws://host:port shape) through HTTP /json/version discovery by normalising the scheme via the existing normalizeCdpHttpBaseForJsonEndpoints helper. Apply this in isChromeReachable, getChromeWebSocketUrl, and createTargetViaCdp. Direct WS endpoints with a /devtools/ path are still opened without an extra discovery round-trip.
Fixes#68027
* test(browser): add seeded fuzz coverage for CDP URL helpers
Adds property-based / seeded-fuzz tests for the URL helpers the
attachOnly CDP fix depends on (#68027):
- isWebSocketUrl
- isDirectCdpWebSocketEndpoint
- normalizeCdpHttpBaseForJsonEndpoints
- parseBrowserHttpUrl
- redactCdpUrl
- appendCdpPath
- getHeadersWithAuth
Follows the existing repo convention (see
src/gateway/http-common.fuzz.test.ts): no fast-check dep, small
mulberry32 PRNG + hand-rolled generators, deterministic per-describe
seeds so failures are reproducible.
Lifts cdp.helpers.ts coverage from 77.77% -> 89.54% statements,
67.9% -> 80.24% branches, 78% -> 90% lines. Remaining uncovered
lines are inside the WS sender internals (createCdpSender,
withCdpSocket, fetchCdpChecked rate-limit branch), which require
integration-style mocks and are unrelated to the attachOnly fix.
* test(browser): drive cdp.helpers/cdp/chrome to 100% coverage
Lifts the three files touched by the #68027 attachOnly fix to 100% statements/branches/functions/lines across the extensions test suite. Adds cdp.helpers.internal.test.ts, cdp.internal.test.ts, and chrome.internal.test.ts covering error paths, branch matrices, CDP session helpers, Chrome spawn/launch/stop flows, and canRunCdpHealthCommand. Defensively unreachable guards are annotated with c8 ignore + inline justifications.
* fix(browser): restore WS fallback for non-/devtools ws:// CDP URLs
When /json/version discovery is unavailable (or returns no
webSocketDebuggerUrl), fall back to treating the original bare ws/wss
URL as a direct WebSocket endpoint. This preserves the #68027 fix for
Chrome's debug port while restoring compatibility with Browserless/
Browserbase-style providers that expose a direct WebSocket root without
a /json/version endpoint.
Priority order for bare ws/wss cdpUrl inputs:
1. /devtools/<kind>/<id> URL \u2192 direct handshake, no discovery (unchanged)
2. bare ws/wss root \u2192 try HTTP discovery first; if discovery returns a
webSocketDebuggerUrl use it; otherwise fall back to the original URL
as a direct WS endpoint
3. HTTP/HTTPS URL \u2192 HTTP discovery only, no fallback (unchanged)
Affected call sites: isChromeReachable, getChromeWebSocketUrl,
createTargetViaCdp.
Also renames a misleading test ('still enforces SSRF policy for direct
WebSocket URLs') to accurately describe what it tests: SSRF enforcement
on the navigation target URL, not on the CDP endpoint.
New tests added for all three fallback paths. Coverage remains 100% on
all three touched files (238 tests).
* fix: browser attachOnly bare ws CDP follow-ups (#68715) (thanks @visionik)
* video_generate: add providerOptions, inputAudios, and imageRoles
- VideoGenerationSourceAsset gains an optional `role` field (e.g.
"first_frame", "last_frame"); core treats it as opaque and forwards it
to the provider unchanged.
- VideoGenerationRequest gains `inputAudios` (reference audio assets,
e.g. background music) and `providerOptions` (arbitrary
provider-specific key/value pairs forwarded as-is).
- VideoGenerationProviderCapabilities gains `maxInputAudios`.
- video_generate tool schema adds:
- `imageRoles` array (parallel to `images`, sets role per asset)
- `audioRef` / `audioRefs` (single/multi reference audio inputs)
- `providerOptions` (JSON object passed through to the provider)
- `MAX_INPUT_IMAGES` bumped 5 → 9; `MAX_INPUT_AUDIOS` = 3
- Capability validation extended to gate on `maxInputAudios`.
- runtime.ts threads `inputAudios` and `providerOptions` through to
`provider.generateVideo`.
- Docs and runtime tests updated.
Made-with: Cursor
* docs: fix BytePlus Seedance capability table — split 1.5 and 2.0 rows
1.5 Pro supports at most 2 input images (first_frame + last_frame);
2.0 supports up to 9 reference images, 3 videos, and 3 audios.
Provider notes section updated accordingly.
Made-with: Cursor
* docs: list all Seedance 1.0 models in video-generation provider table
- Default model updated to seedance-1-0-pro-250528 (was the T2V lite)
- Provider notes now enumerate all five 1.0 model IDs with T2V/I2V capability notes
Made-with: Cursor
* video_generate: address review feedback (P1/P2)
P1: Add "adaptive" to SUPPORTED_ASPECT_RATIOS so provider-specific ratio
passthrough (used by Seedance 1.5/2.0) is accepted instead of throwing.
Update error message to include "adaptive" in the allowed list.
P1: Fix audio input capability default — when a provider does not declare
maxInputAudios, default to 0 (no audio support) instead of MAX_INPUT_AUDIOS.
Providers must explicitly opt in via maxInputAudios to accept audio inputs.
P2: Remove unnecessary type cast in imageRoles assignment; VideoGenerationSourceAsset
already declares role?: string so a non-null assertion suffices.
P2: Add videoRoles and audioRoles tool parameters, parallel to imageRoles,
so callers can assign semantic role hints to reference video and audio assets
(e.g. "reference_video", "reference_audio" for Seedance 2.0).
Made-with: Cursor
* video_generate: fix check-docs formatting and snake_case param reading
Made-with: Cursor
* video_generate: clarify *Roles are parallel to combined input list (P2)
Made-with: Cursor
* video_generate: add missing duration import; fix corrupted docs section
Made-with: Cursor
* video_generate: pass mode inputs to duration resolver; note plugin requirement (P2)
Made-with: Cursor
* plugin-sdk: sync new video-gen fields — role, inputAudios, providerOptions, maxInputAudios
Add fields introduced by core in the PR1 batch to the public plugin-sdk
mirror so TypeScript provider plugins can declare and consume them
without type assertions:
- VideoGenerationSourceAsset.role?: string
- VideoGenerationRequest.inputAudios and .providerOptions
- VideoGenerationModeCapabilities.maxInputAudios
The AssertAssignable bidirectional checks still pass because all new
fields are optional; this change makes the SDK surface complete.
Made-with: Cursor
* video-gen runtime: skip failover candidates lacking audio capability
Made-with: Cursor
* video-gen: fall back to flat capabilities.maxInputAudios in failover and tool validation
Made-with: Cursor
* video-gen: defer audio-count check to runtime, enabling fallback for audio-capable candidates
Made-with: Cursor
* video-gen: defer maxDurationSeconds check to runtime, enabling fallback for higher-cap candidates
Made-with: Cursor
* video-gen: add VideoGenerationAssetRole union and typed providerOptions capability
Introduces a canonical VideoGenerationAssetRole union (first_frame,
last_frame, reference_image, reference_video, reference_audio) for the
source-asset role hint, and a VideoGenerationProviderOptionType tag
('number' | 'boolean' | 'string') plus a new capabilities.providerOptions
schema that providers use to declare which opaque providerOptions keys
they accept and with what primitive type.
Types are additive and backwards compatible. The role field accepts both
canonical union values and arbitrary provider-specific strings via a
`VideoGenerationAssetRole | (string & {})` union, so autocomplete works
for the common case without blocking provider-specific extensions.
Runtime enforcement of providerOptions (skip-in-fallback, unknown key
and type mismatch) lands in a follow-up commit.
Co-authored-by: yongliang.xie <yongliang.xie@bytedance.com>
* video-gen: enforce typed providerOptions schema via skip-in-fallback
Adds `validateProviderOptionsAgainstDeclaration` in the video-generation
runtime and wires it into the `generateVideo` candidate loop alongside
the existing audio-count and duration-cap skip guards.
Behavior:
- Candidates with no declared `capabilities.providerOptions` skip any
non-empty providerOptions payload with a clear skip reason, so a
provider that would ignore `{seed: 42}` and succeed without the
caller's intent never gets reached.
- Candidates that declare a schema reject unknown keys with the list
of accepted keys in the error.
- Candidates that declare a schema reject type mismatches (expected
number/boolean/string) with the declared type in the error.
- All skip reasons push into `attempts` so the aggregated failure
message at the end of the fallback chain explains exactly why each
candidate was rejected.
Also hardens the tool boundary: `providerOptions` that is not a plain
JSON object (including bogus arrays like `["seed", 42]`) now throws a
`ToolInputError` up front instead of being cast to `Record` and
forwarded with numeric-string keys.
Consistent with the audio/duration skip-in-fallback pattern introduced
by yongliang.xie in earlier commits on this branch.
Co-authored-by: yongliang.xie <yongliang.xie@bytedance.com>
* video-gen: harden *Roles parity + document canonical role values
Replaces the inline `parseRolesArg` lambda with a dedicated
`parseRoleArray` helper that throws a ToolInputError when the caller
supplies more roles than assets. Off-by-one alignment mistakes in
`imageRoles` / `videoRoles` / `audioRoles` now fail loudly at the tool
boundary instead of silently dropping trailing roles.
Also tightens the schema descriptions to document the canonical
VideoGenerationAssetRole values (first_frame, last_frame, reference_*)
and the skip-in-fallback contract on providerOptions, and rejects
non-array inputs to any `*Roles` field early rather than coercing them
to an empty list.
Co-authored-by: yongliang.xie <yongliang.xie@bytedance.com>
* video-gen: surface dropped aspectRatio sentinels in ignoredOverrides
"adaptive" and other provider-specific sentinel aspect ratios are
unparseable as numeric ratios, so when the active provider does not
declare the sentinel in caps.aspectRatios, `resolveClosestAspectRatio`
returns undefined and the previous code silently nulled out
`aspectRatio` without surfacing a warning.
Push the dropped value into `ignoredOverrides` so the tool result
warning path ("Ignored unsupported overrides for …") picks it up, and
the caller gets visible feedback that the request was dropped instead
of a silent no-op. Also corrects the tool-side comment on
SUPPORTED_ASPECT_RATIOS to describe actual behavior.
Co-authored-by: yongliang.xie <yongliang.xie@bytedance.com>
* video-gen: surface declared providerOptions + maxInputAudios in action=list
`video_generate action=list` now includes the declared providerOptions
schema (key:type) per provider, so agents can discover which opaque
keys each provider accepts without trial and error. Both mode-level and
flat-provider providerOptions declarations are merged, matching the
runtime lookup order in `generateVideo`.
Also surfaces `maxInputAudios` alongside the other max-input counts for
completeness — previously the list output did not expose the audio cap
at all, even though the tool validates against it.
Co-authored-by: yongliang.xie <yongliang.xie@bytedance.com>
* video-gen: warn once per request when runtime skips a fallback candidate
The skip-in-fallback guards (audio cap, duration cap, providerOptions)
all logged at debug level, which meant operators had no visible signal
when the primary provider was silently passed over in favor of a
fallback. Add a first-skip log.warn in the runtime loop so the reason
for the first rejection is surfaced once per request, and leave the
rest of the skip events at debug to avoid flooding on long chains.
Co-authored-by: yongliang.xie <yongliang.xie@bytedance.com>
* video-gen: cover new tool-level behavior with regression tests
Adds regression tests for:
- providerOptions shape rejection (arrays, strings)
- providerOptions happy-path forwarding to runtime
- imageRoles length-parity guard
- *Roles non-array rejection
- positional role attachment to loaded reference images
- audio data: URL templated rejection branch
- aspectRatio='adaptive' acceptance and forwarding
- unsupported aspectRatio rejection (mentions 'adaptive' in the error)
All eight new cases run in the existing video-generate-tool suite and
use the same provider-mock pattern already established in the file.
Co-authored-by: yongliang.xie <yongliang.xie@bytedance.com>
* video-gen: cover runtime providerOptions skip-in-fallback branches
Adds runtime regression tests for the new typed-providerOptions guard:
- candidates without a declared providerOptions schema are skipped
when any providerOptions is supplied (prevents silent drop)
- candidates that declare a schema skip on unknown keys with the
accepted-key list surfaced in the error
- candidates that declare a schema skip on type mismatches with the
declared type surfaced in the error
- end-to-end fallback: openai (no providerOptions) is skipped and
byteplus (declared schema) accepts the same request, with an
attempt entry recording the first skip reason
Also updates the existing 'forwards providerOptions to the provider
unchanged' case so the destination provider declares the matching
typed schema, and wires a `warn` stub into the hoisted logger mock
so the new first-skip log.warn call path does not blow up.
Co-authored-by: yongliang.xie <yongliang.xie@bytedance.com>
* changelog: note video_generate providerOptions / inputAudios / role hints
Adds an Unreleased Changes entry describing the user-visible surface
expansion for video_generate: typed providerOptions capability,
inputAudios reference audio, per-asset role hints via the canonical
VideoGenerationAssetRole union, the 'adaptive' aspect-ratio sentinel,
maxInputAudios capability, and the relaxed 9-image cap.
Credits the original PR author.
Co-authored-by: yongliang.xie <yongliang.xie@bytedance.com>
* byteplus: declare providerOptions schema (seed, draft, camerafixed) and forward to API
Made-with: Cursor
* byteplus: fix camera_fixed body field (API uses underscore, not camerafixed)
Made-with: Cursor
* fix(byteplus): normalize resolution to lowercase before API call
The Seedance API rejects resolution values with uppercase letters —
"480P", "720P" etc return InvalidParameter, while "480p", "720p"
are accepted. This was breaking the video generation live test
(resolveLiveVideoResolution returns "480P").
Normalize req.resolution to lowercase at the provider layer before
setting body.resolution, so any caller-supplied casing is corrected
without requiring changes to the VideoGenerationResolution type or
live-test helpers.
Verified via direct API call:
body.resolution = "480P" → HTTP 400 InvalidParameter
body.resolution = "480p" → task created successfully
body.resolution = "720p" → task created successfully (t2v, i2v, 1.5-pro)
body.resolution = "1080p" → task created successfully
Made-with: Cursor
* video-gen/byteplus: auto-select i2v model when input images provided with t2v model
Seedance 1.0 uses separate model IDs for T2V (seedance-1-0-lite-t2v-250428)
and I2V (seedance-1-0-lite-i2v-250428). When the caller requests a T2V model
but also provides inputImages, the API rejects with task_type i2v not supported
on t2v model.
Fix: when inputImages are present and the requested model contains "-t2v-",
auto-substitute "-i2v-" so the API receives the correct model. Seedance 1.5 Pro
uses a single model ID for both modes and is unaffected by this substitution.
Verified via live test: both mode=generate and mode=imageToVideo pass for
byteplus/seedance-1-0-lite-t2v-250428 with no failures.
Co-authored-by: odysseus0 <odysseus0@example.com>
Made-with: Cursor
* video-gen: fix duration rounding + align BytePlus (1.0) docs (P2)
Made-with: Cursor
* video-gen: relax providerOptions gate for undeclared-schema providers (P1)
Distinguish undefined (not declared = backward-compat pass-through) from
{} (explicitly declared empty = no options accepted) in
validateProviderOptionsAgainstDeclaration. Providers without a declared
schema receive providerOptions as-is; providers with an explicit empty
schema still skip. Typed schemas continue to validate key names and types.
Also: restore camera_fixed (underscore) in BytePlus provider schema and
body key (regression from earlier rebase), remove duplicate local
readBooleanToolParam definition now imported from media-tool-shared,
update tests and docs accordingly.
Made-with: Cursor
* video_generate: add landing follow-up coverage
* video_generate: finalize plugin-sdk baseline (#61987) (thanks @xieyongliang)
---------
Co-authored-by: yongliang.xie <yongliang.xie@bytedance.com>
Co-authored-by: George Zhang <georgezhangtj97@gmail.com>
Co-authored-by: odysseus0 <odysseus0@example.com>