From f9b78fb08ecefbb72c4640789b882caa4d09cd1a Mon Sep 17 00:00:00 2001
From: Peter Steinberger
Date: Mon, 27 Apr 2026 10:37:48 +0100
Subject: [PATCH] docs(models): clarify local tool call workaround

---
 docs/gateway/local-models.md | 21 ++++++++++++++---
 docs/providers/vllm.md       | 45 ++++++++++++++++++++++++++++++------
 2 files changed, 56 insertions(+), 10 deletions(-)

diff --git a/docs/gateway/local-models.md b/docs/gateway/local-models.md
index efc54d843af..149f87f7eb7 100644
--- a/docs/gateway/local-models.md
+++ b/docs/gateway/local-models.md
@@ -174,9 +174,12 @@ Compatibility notes for stricter OpenAI-compatible backends:
   text and logs a warning with the run id, provider/model, detected pattern,
   and tool name when available. Treat that as provider/model tool-call
   incompatibility, not a completed tool run.
-- For OpenAI-compatible Chat Completions backends whose tool parser works only
-  when tool use is forced, set a per-model request override instead of relying
-  on text parsing:
+- If tools appear as assistant text instead of running, for example raw JSON,
+  XML, ReAct syntax, or an empty `tool_calls` array in the provider response,
+  first verify the server is using a tool-call-capable chat template/parser. For
+  OpenAI-compatible Chat Completions backends whose parser works only when tool
+  use is forced, set a per-model request override instead of relying on text
+  parsing:
 
   ```json5
   {
@@ -198,6 +201,12 @@
   Use this only for models/sessions where every normal turn should call a
   tool. It overrides OpenClaw's default proxy value of `tool_choice: "auto"`.
+  Replace `local/my-local-model` with the exact provider/model ref shown by
+  `openclaw models list`.
+
+  ```bash
+  openclaw config set agents.defaults.models '{"local/my-local-model":{"params":{"extra_body":{"tool_choice":"required"}}}}' --strict-json --merge
+  ```
 
 - Some smaller or stricter local backends are unstable with OpenClaw's full
   agent-runtime prompt shape, especially when tool schemas are included. If the
@@ -229,6 +238,12 @@
   fails on Gemma or another local model? Disable tool schemas first with
   `compat.supportsTools: false`, then retest. If the server still crashes only
   on larger OpenClaw prompts, treat it as an upstream server/model limitation.
+- Tool calls show up as raw JSON/XML/ReAct text, or the provider returns an
+  empty `tool_calls` array? Do not add a proxy that blindly converts assistant
+  text into tool execution. Fix the server chat template/parser first. If the
+  model only works when tool use is forced, add the per-model
+  `params.extra_body.tool_choice: "required"` override above and use that model
+  entry only for sessions where a tool call is expected on every turn.
 - Safety: local models skip provider-side filters; keep agents narrow and
   compaction on to limit prompt injection blast radius.
 
 ## Related

diff --git a/docs/providers/vllm.md b/docs/providers/vllm.md
index 7b05f35e5a7..48dc533a676 100644
--- a/docs/providers/vllm.md
+++ b/docs/providers/vllm.md
@@ -168,16 +168,21 @@ Use explicit config when:
-
+  First make sure vLLM was started with the right tool-call parser and chat
+  template for the model. For example, vLLM documents `hermes` for Qwen2.5
+  models and `qwen3_xml` for Qwen3-Coder models.
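+
+  A typical launch for a Qwen2.5 coder model looks roughly like the sketch
+  below. This is a sketch, not a drop-in command: the model id is an example,
+  and the tool-call flags follow vLLM's tool-calling docs, so confirm them
+  against your vLLM version:
+
+  ```bash
+  # Serve the model with tool calling enabled and the Hermes-style parser
+  vllm serve Qwen/Qwen2.5-Coder-32B-Instruct \
+    --enable-auto-tool-choice \
+    --tool-call-parser hermes
+  ```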
-  Some Qwen/vLLM combinations still return raw tool-call text or an empty
-  `tool_calls` array when the request uses `tool_choice: "auto"`, but return
-  structured tool calls when the request uses `tool_choice: "required"`. For
-  those model entries, force the OpenAI-compatible request field with
-  `params.extra_body`:
+  Symptoms:
+
+  - skills or tools never run
+  - the assistant prints raw JSON/XML such as `{"name":"read","arguments":...}`
+  - vLLM returns an empty `tool_calls` array when OpenClaw sends
+    `tool_choice: "auto"`
+
+  Some Qwen/vLLM combinations return structured tool calls only when the
+  request uses `tool_choice: "required"`. For those model entries, force the
+  OpenAI-compatible request field with `params.extra_body`:
 
   ```json5
   {
@@ -197,9 +202,23 @@ Use explicit config when:
   }
   ```
 
+  Replace `Qwen-Qwen2.5-Coder-32B-Instruct` with the exact id returned by:
+
+  ```bash
+  openclaw models list --provider vllm
+  ```
+
+  You can apply the same override from the CLI:
+
+  ```bash
+  openclaw config set agents.defaults.models '{"vllm/Qwen-Qwen2.5-Coder-32B-Instruct":{"params":{"extra_body":{"tool_choice":"required"}}}}' --strict-json --merge
+  ```
+
   This is an opt-in compatibility workaround. It makes every model turn with
   tools require a tool call, so use it only for a dedicated local model entry
-  where that behavior is acceptable.
+  where that behavior is acceptable. Do not use it as a global default for all
+  vLLM models, and do not use a proxy that blindly converts arbitrary
+  assistant text into executable tool calls.
@@ -293,6 +312,18 @@ Use explicit config when:
   Auto-discovery requires `VLLM_API_KEY` to be set **and** no explicit
   `models.providers.vllm` config entry. If you have defined the provider
   manually, OpenClaw skips discovery and uses only your declared models.
+
+  If a Qwen model prints JSON/XML tool syntax instead of executing a skill,
+  check the Qwen guidance in Advanced configuration above. The usual fix is:
+
+  - start vLLM with the correct parser/template for that model
+  - confirm the exact model id with `openclaw models list --provider vllm`
+  - add a dedicated per-model `params.extra_body.tool_choice: "required"`
+    override only if `tool_choice: "auto"` still returns empty or text-only
+    tool calls (the request sketch below shows one way to check)
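+
+  To check the server's behavior directly, send one tool-enabled request and
+  inspect `choices[0].message.tool_calls`. This is a minimal sketch assuming a
+  local server on port 8000, `VLLM_API_KEY` in the environment, and a
+  hypothetical `read` tool; substitute your own model id:
+
+  ```bash
+  # Ask for a tool call explicitly; a healthy setup returns structured entries
+  curl -s http://localhost:8000/v1/chat/completions \
+    -H "Authorization: Bearer $VLLM_API_KEY" \
+    -H "Content-Type: application/json" \
+    -d '{
+      "model": "Qwen-Qwen2.5-Coder-32B-Instruct",
+      "messages": [{"role": "user", "content": "Read the file README.md"}],
+      "tools": [{
+        "type": "function",
+        "function": {
+          "name": "read",
+          "description": "Read a file from disk",
+          "parameters": {
+            "type": "object",
+            "properties": {"path": {"type": "string"}},
+            "required": ["path"]
+          }
+        }
+      }],
+      "tool_choice": "required"
+    }' | jq '.choices[0].message.tool_calls'
+  ```
+
+  Structured entries with `tool_choice: "required"` but `null` or text-only
+  output with `"auto"` confirms the works-only-when-forced behavior described
+  above.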