fix(agents): honor qwen chat-template thinking compat

Author: Peter Steinberger
Date: 2026-04-27 11:26:23 +01:00
Parent: 3db407da40
Commit: 75c8c1bebe
6 changed files with 128 additions and 3 deletions

@@ -129,6 +129,27 @@ Use explicit config when:
</Accordion>
<Accordion title="Qwen thinking controls">
For Qwen models served through vLLM, set
`compat.thinkingFormat: "qwen-chat-template"` on the model entry when the
server expects Qwen chat-template kwargs. OpenClaw maps `/think off` to:
```json
{
"chat_template_kwargs": {
"enable_thinking": false,
"preserve_thinking": true
}
}
```
All other thinking levels send `enable_thinking: true`. If your endpoint
instead expects DashScope-style top-level flags, use
`compat.thinkingFormat: "qwen"` to place `enable_thinking` at the request
root rather than inside `chat_template_kwargs`.
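For comparison, with `compat.thinkingFormat: "qwen"` a `/think off` request would carry the flag at the top level of the payload. This is a sketch of the assumed request shape; the model name and message content are placeholders, and the surrounding fields follow the standard OpenAI-style chat payload:
```json
{
  "model": "qwen3",
  "messages": [{ "role": "user", "content": "Hello" }],
  "enable_thinking": false
}
```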
</Accordion>
<Accordion title="Nemotron 3 thinking controls">
vLLM/Nemotron 3 can use chat-template kwargs to control whether reasoning is
returned as hidden reasoning or visible answer text. When an OpenClaw session