feat: declare explicit media provider capabilities

2026-04-12 09:41:11 +00:00 · 2026-04-06 15:24:16 +01:00
parent 29df67c491
commit cd5b1653f6
46 changed files with 1623 additions and 393 deletions
--- a/docs/plugins/sdk-provider-plugins.md
+++ b/docs/plugins/sdk-provider-plugins.md
@@ -643,10 +643,15 @@ API key auth, and dynamic model resolution.
    [Internals: Capability Ownership](/plugins/architecture#capability-ownership-model).

    For video generation, prefer the mode-aware capability shape shown above:
-    `generate`, `imageToVideo`, and `videoToVideo`. The older flat fields such
-    as `maxInputImages`, `maxInputVideos`, and `maxDurationSeconds` still work
-    as aggregate fallback caps, but they cannot describe per-mode limits or
-    disabled transform modes as cleanly.
+    `generate`, `imageToVideo`, and `videoToVideo`. Flat aggregate fields such
+    as `maxInputImages`, `maxInputVideos`, and `maxDurationSeconds` are not
+    enough to advertise transform-mode support or disabled modes cleanly.
+
+    Music-generation providers should follow the same pattern:
+    `generate` for prompt-only generation and `edit` for reference-image-based
+    generation. Flat aggregate fields such as `maxInputImages`,
+    `supportsLyrics`, and `supportsFormat` are not enough to advertise edit
+    support; explicit `generate` / `edit` blocks are the expected contract.

  </Step>
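
Note on the hunk above: a minimal TypeScript sketch of what explicit per-mode capability blocks could look like, and why flat aggregate caps cannot express a disabled mode. Only the mode names (`generate`, `imageToVideo`, `videoToVideo`, `edit`) and the flat field names (`maxInputImages`, `maxInputVideos`, `maxDurationSeconds`, `supportsLyrics`) come from the diff. The type names, the `enabled` flag, the `supportsMode` helper, and all numeric limits are hypothetical and are not taken from the SDK.

```typescript
// Hypothetical per-mode block. `enabled` is an assumed field, not confirmed by the diff.
type ModeCaps = {
  enabled: boolean;
  maxInputImages?: number;
  maxInputVideos?: number;
  maxDurationSeconds?: number;
  supportsLyrics?: boolean;
};

// Hypothetical container shapes using the mode names from the docs change.
type VideoCaps = {
  generate?: ModeCaps;
  imageToVideo?: ModeCaps;
  videoToVideo?: ModeCaps;
};

type MusicCaps = {
  generate?: ModeCaps;
  edit?: ModeCaps;
};

// Generic view over any set of named mode blocks.
type CapabilityBlocks = { [mode: string]: ModeCaps | undefined };

// A mode counts as supported only when its block exists and is explicitly enabled.
function supportsMode(caps: CapabilityBlocks, mode: string): boolean {
  return caps[mode]?.enabled === true;
}

// Example video provider: text and image input are supported, video-to-video is disabled.
// Each mode carries its own limits (illustrative values only).
const videoCaps: VideoCaps = {
  generate: { enabled: true, maxDurationSeconds: 10 },
  imageToVideo: { enabled: true, maxInputImages: 1, maxDurationSeconds: 5 },
  videoToVideo: { enabled: false },
};

// Example music provider: prompt-only generation is declared, no `edit` block at all.
const musicCaps: MusicCaps = {
  generate: { enabled: true, supportsLyrics: true },
};

console.log(supportsMode(videoCaps, "imageToVideo"));
console.log(supportsMode(videoCaps, "videoToVideo"));
console.log(supportsMode(musicCaps, "edit"));
```

With a single flat `maxInputImages` value, a caller could not tell whether image input applies to `imageToVideo`, to `edit`, or to neither; per-mode blocks make that explicit.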