diff --git a/docs/plugins/architecture.md b/docs/plugins/architecture.md index e600c0c6960..0ae11418fb8 100644 --- a/docs/plugins/architecture.md +++ b/docs/plugins/architecture.md @@ -796,7 +796,9 @@ api.registerProvider({ - Kilocode uses `catalog`, `capabilities`, `wrapStreamFn`, and `isCacheTtlEligible` because it needs provider-owned request headers, reasoning payload normalization, Gemini transcript hints, and Anthropic - cache-TTL gating. + cache-TTL gating; the `kilocode-thinking` stream family keeps Kilo thinking + injection on the shared proxy stream path while skipping `kilo/auto` and + other proxy model ids that do not support explicit reasoning payloads. - Z.AI uses `resolveDynamicModel`, `prepareExtraParams`, `wrapStreamFn`, `isCacheTtlEligible`, `isBinaryThinking`, `isModernModelRef`, `resolveUsageAuth`, and `fetchUsageSnapshot` because it owns GLM-5 fallback, diff --git a/docs/plugins/sdk-provider-plugins.md b/docs/plugins/sdk-provider-plugins.md index a1a511eaa92..c9db48732d8 100644 --- a/docs/plugins/sdk-provider-plugins.md +++ b/docs/plugins/sdk-provider-plugins.md @@ -294,6 +294,7 @@ API key auth, and dynamic model resolution. | Family | What it wires in | | --- | --- | | `google-thinking` | Gemini thinking payload normalization on the shared stream path | + | `kilocode-thinking` | Kilo reasoning wrapper on the shared proxy stream path, with `kilo/auto` and unsupported proxy reasoning ids skipping injected thinking | | `moonshot-thinking` | Moonshot binary native-thinking payload mapping from config + `/think` level | | `minimax-fast-mode` | MiniMax fast-mode model rewrite on the shared stream path | | `openai-responses-defaults` | Shared native OpenAI/Codex Responses wrappers: attribution headers, `/fast`/`serviceTier`, text verbosity, native Codex web search, reasoning-compat payload shaping, and Responses context management | @@ -303,6 +304,7 @@ API key auth, and dynamic model resolution. Real bundled examples: - `google` and `google-gemini-cli`: `google-thinking` + - `kilocode`: `kilocode-thinking` - `moonshot`: `moonshot-thinking` - `minimax` and `minimax-portal`: `minimax-fast-mode` - `openai` and `openai-codex`: `openai-responses-defaults`