refactor(vllm): own qwen thinking payloads

Peter Steinberger
2026-04-27 11:47:54 +01:00
parent 4f7038ae33
commit 836d4b4105
20 changed files with 467 additions and 129 deletions


@@ -169,6 +169,13 @@ Availability can still vary by endpoint and billing plan even when a model is
present in the bundled catalog.
</Note>
## Thinking Controls
For reasoning-enabled Qwen Cloud models, the bundled provider maps OpenClaw
thinking levels to DashScope's top-level `enable_thinking` request flag. Disabled
thinking sends `enable_thinking: false`; other thinking levels send
`enable_thinking: true`.
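For illustration, a disabled-thinking request body with the top-level flag would look roughly like this; the model name and message are placeholders, and only `enable_thinking` is the relevant field:
```json
{
  "model": "qwen-plus",
  "messages": [
    { "role": "user", "content": "Summarize the release notes." }
  ],
  "enable_thinking": false
}
```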
## Multimodal add-ons
The `qwen` plugin also exposes multimodal capabilities on the **Standard**


@@ -131,7 +131,7 @@ Use explicit config when:
<Accordion title="Qwen thinking controls">
For Qwen models served through vLLM, set
-`compat.thinkingFormat: "qwen-chat-template"` on the model entry when the
+`params.qwenThinkingFormat: "chat-template"` on the model entry when the
server expects Qwen chat-template kwargs. OpenClaw maps `/think off` to:
```json
@@ -145,8 +145,8 @@ Use explicit config when:
Non-`off` thinking levels send `enable_thinking: true`. If your endpoint
expects DashScope-style top-level flags instead, use
-`compat.thinkingFormat: "qwen"` to send `enable_thinking` at the request
-root.
+`params.qwenThinkingFormat: "top-level"` to send `enable_thinking` at the
+request root. Snake-case `params.qwen_thinking_format` is also accepted.
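As a rough sketch, a model entry that opts into the top-level format could look like the fragment below; the surrounding entry shape is illustrative, and only the `params.qwenThinkingFormat` key comes from this section:
```jsonc
{
  // Illustrative model entry; every field except params.qwenThinkingFormat is an assumption.
  "id": "qwen3-32b",
  "params": {
    "qwenThinkingFormat": "top-level"
  }
}
```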
</Accordion>