mirror of https://github.com/openclaw/openclaw.git
synced 2026-05-06 15:40:44 +00:00

fix: gate max thinking by model support
@@ -625,7 +625,7 @@ The most important fields are:
 | `config.model` | `string` | Optional blocking memory sub-agent model ref; when unset, active memory uses the current session model |
 | `config.queryMode` | `"message" \| "recent" \| "full"` | Controls how much conversation the blocking memory sub-agent sees |
 | `config.promptStyle` | `"balanced" \| "strict" \| "contextual" \| "recall-heavy" \| "precision-heavy" \| "preference-only"` | Controls how eager or strict the blocking memory sub-agent is when deciding whether to return memory |
-| `config.thinking` | `"off" \| "minimal" \| "low" \| "medium" \| "high" \| "xhigh" \| "adaptive"` | Advanced thinking override for the blocking memory sub-agent; default `off` for speed |
+| `config.thinking` | `"off" \| "minimal" \| "low" \| "medium" \| "high" \| "xhigh" \| "adaptive" \| "max"` | Advanced thinking override for the blocking memory sub-agent; default `off` for speed |
 | `config.promptOverride` | `string` | Advanced full prompt replacement; not recommended for normal use |
 | `config.promptAppend` | `string` | Advanced extra instructions appended to the default or overridden prompt |
 | `config.timeoutMs` | `number` | Hard timeout for the blocking memory sub-agent, capped at 120000 ms |
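The table above implies a small config surface for the blocking memory sub-agent. A sketch of how those fields might be typed and normalized, assuming the documented `off` default and 120000 ms cap (the `MemoryAgentConfig` shape and `normalizeMemoryConfig` helper are hypothetical, not openclaw's actual code):

```typescript
// Thinking levels the blocking memory sub-agent accepts after this change.
type MemoryThinking =
  | "off" | "minimal" | "low" | "medium" | "high" | "xhigh" | "adaptive" | "max";

// Hypothetical shape matching the documented fields.
interface MemoryAgentConfig {
  model?: string;                            // sub-agent model ref; session model when unset
  queryMode?: "message" | "recent" | "full"; // how much conversation the sub-agent sees
  thinking?: MemoryThinking;                 // default "off" for speed
  timeoutMs?: number;                        // hard timeout, capped at 120000 ms
}

const MAX_TIMEOUT_MS = 120_000;

// Apply the documented default and cap.
function normalizeMemoryConfig(cfg: MemoryAgentConfig) {
  return {
    thinking: cfg.thinking ?? "off",
    timeoutMs: Math.min(cfg.timeoutMs ?? MAX_TIMEOUT_MS, MAX_TIMEOUT_MS),
  };
}
```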
@@ -43,7 +43,7 @@ For model selection rules, see [/concepts/models](/concepts/models).
 `matchesContextOverflowError`, `classifyFailoverReason`,
 `isCacheTtlEligible`, `buildMissingAuthMessage`, `suppressBuiltInModel`,
 `augmentModelCatalog`, `isBinaryThinking`, `supportsXHighThinking`,
-`supportsAdaptiveThinking`,
+`supportsAdaptiveThinking`, `supportsMaxThinking`,
 `resolveDefaultThinkingLevel`, `applyConfigDefaults`, `isModernModelRef`,
 `prepareRuntimeAuth`, `resolveUsageAuth`, `fetchUsageSnapshot`, and
 `onModelSelected`.
@@ -135,6 +135,7 @@ Typical split:
 - `isBinaryThinking`: provider owns binary on/off thinking UX
 - `supportsXHighThinking`: provider opts selected models into `xhigh`
 - `supportsAdaptiveThinking`: provider opts selected models into `adaptive`
+- `supportsMaxThinking`: provider opts selected models into `max`
 - `resolveDefaultThinkingLevel`: provider owns default `/think` policy for a
   model family
 - `applyConfigDefaults`: provider applies provider-specific global defaults
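The per-model gating hooks in that split might look like this in a provider plugin (a sketch: the hook names come from the diff, but the plugin fragment shape and the model-id patterns are illustrative assumptions, not Anthropic's actual matchers):

```typescript
// Hypothetical fragment of a provider plugin's thinking hooks.
interface ThinkingHooks {
  supportsXHighThinking?(modelId: string): boolean;
  supportsAdaptiveThinking?(modelId: string): boolean;
  supportsMaxThinking?(modelId: string): boolean;
}

// Illustrative gating: adaptive for 4.6/Opus-4.7-style refs, max only for
// an Opus-4.7-style ref. Real matchers live in the provider plugin.
const exampleAnthropicHooks: ThinkingHooks = {
  supportsAdaptiveThinking: (id) => /claude-(4-6|opus-4-7)/.test(id),
  supportsMaxThinking: (id) => /claude-opus-4-7/.test(id),
};
```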
@@ -1791,7 +1791,7 @@ scripts/sandbox-browser-setup.sh # optional browser image
 - `model`: string form overrides `primary` only; object form `{ primary, fallbacks }` overrides both (`[]` disables global fallbacks). Cron jobs that only override `primary` still inherit default fallbacks unless you set `fallbacks: []`.
 - `params`: per-agent stream params merged over the selected model entry in `agents.defaults.models`. Use this for agent-specific overrides like `cacheRetention`, `temperature`, or `maxTokens` without duplicating the whole model catalog.
 - `skills`: optional per-agent skill allowlist. If omitted, the agent inherits `agents.defaults.skills` when set; an explicit list replaces defaults instead of merging, and `[]` means no skills.
-- `thinkingDefault`: optional per-agent default thinking level (`off | minimal | low | medium | high | xhigh | adaptive`). Overrides `agents.defaults.thinkingDefault` for this agent when no per-message or session override is set.
+- `thinkingDefault`: optional per-agent default thinking level (`off | minimal | low | medium | high | xhigh | adaptive | max`). Overrides `agents.defaults.thinkingDefault` for this agent when no per-message or session override is set.
 - `reasoningDefault`: optional per-agent default reasoning visibility (`on | off | stream`). Applies when no per-message or session reasoning override is set.
 - `fastModeDefault`: optional per-agent default for fast mode (`true | false`). Applies when no per-message or session fast-mode override is set.
 - `embeddedHarness`: optional per-agent low-level harness policy override. Use `{ runtime: "codex", fallback: "none" }` to make one agent Codex-only while other agents keep the default PI fallback.
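The `thinkingDefault` precedence described above (per-message override, then session override, then the per-agent default, then `agents.defaults.thinkingDefault`) can be sketched as follows; the function name, argument names, and the final `"off"` fallback are assumptions for illustration:

```typescript
type ThinkingLevel =
  | "off" | "minimal" | "low" | "medium" | "high" | "xhigh" | "adaptive" | "max";

// First defined value wins: message > session > per-agent > global default.
function effectiveThinking(
  message?: ThinkingLevel,
  session?: ThinkingLevel,
  agentDefault?: ThinkingLevel,
  globalDefault?: ThinkingLevel,
): ThinkingLevel {
  return message ?? session ?? agentDefault ?? globalDefault ?? "off";
}
```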
@@ -659,6 +659,7 @@ Provider plugins now have two layers:
 `classifyFailoverReason`, `isCacheTtlEligible`,
 `buildMissingAuthMessage`, `suppressBuiltInModel`, `augmentModelCatalog`,
 `isBinaryThinking`, `supportsXHighThinking`, `supportsAdaptiveThinking`,
+`supportsMaxThinking`,
 `resolveDefaultThinkingLevel`, `isModernModelRef`, `prepareRuntimeAuth`,
 `resolveUsageAuth`, `fetchUsageSnapshot`, `createEmbeddingProvider`,
 `buildReplayPolicy`,
@@ -725,16 +726,17 @@ The "When to use" column is the quick decision guide.
 | 33 | `isBinaryThinking` | On/off reasoning toggle for binary-thinking providers | Provider exposes only binary thinking on/off |
 | 34 | `supportsXHighThinking` | `xhigh` reasoning support for selected models | Provider wants `xhigh` on only a subset of models |
 | 35 | `supportsAdaptiveThinking` | `adaptive` thinking support for selected models | Provider wants `adaptive` shown only for models with provider-managed adaptive thinking |
-| 36 | `resolveDefaultThinkingLevel` | Default `/think` level for a specific model family | Provider owns default `/think` policy for a model family |
-| 37 | `isModernModelRef` | Modern-model matcher for live profile filters and smoke selection | Provider owns live/smoke preferred-model matching |
-| 38 | `prepareRuntimeAuth` | Exchange a configured credential into the actual runtime token/key just before inference | Provider needs a token exchange or short-lived request credential |
-| 39 | `resolveUsageAuth` | Resolve usage/billing credentials for `/usage` and related status surfaces | Provider needs custom usage/quota token parsing or a different usage credential |
-| 40 | `fetchUsageSnapshot` | Fetch and normalize provider-specific usage/quota snapshots after auth is resolved | Provider needs a provider-specific usage endpoint or payload parser |
-| 41 | `createEmbeddingProvider` | Build a provider-owned embedding adapter for memory/search | Memory embedding behavior belongs with the provider plugin |
-| 42 | `buildReplayPolicy` | Return a replay policy controlling transcript handling for the provider | Provider needs custom transcript policy (for example, thinking-block stripping) |
-| 43 | `sanitizeReplayHistory` | Rewrite replay history after generic transcript cleanup | Provider needs provider-specific replay rewrites beyond shared compaction helpers |
-| 44 | `validateReplayTurns` | Final replay-turn validation or reshaping before the embedded runner | Provider transport needs stricter turn validation after generic sanitation |
-| 45 | `onModelSelected` | Run provider-owned post-selection side effects | Provider needs telemetry or provider-owned state when a model becomes active |
+| 36 | `supportsMaxThinking` | `max` reasoning support for selected models | Provider wants `max` shown only for models with provider max thinking |
+| 37 | `resolveDefaultThinkingLevel` | Default `/think` level for a specific model family | Provider owns default `/think` policy for a model family |
+| 38 | `isModernModelRef` | Modern-model matcher for live profile filters and smoke selection | Provider owns live/smoke preferred-model matching |
+| 39 | `prepareRuntimeAuth` | Exchange a configured credential into the actual runtime token/key just before inference | Provider needs a token exchange or short-lived request credential |
+| 40 | `resolveUsageAuth` | Resolve usage/billing credentials for `/usage` and related status surfaces | Provider needs custom usage/quota token parsing or a different usage credential |
+| 41 | `fetchUsageSnapshot` | Fetch and normalize provider-specific usage/quota snapshots after auth is resolved | Provider needs a provider-specific usage endpoint or payload parser |
+| 42 | `createEmbeddingProvider` | Build a provider-owned embedding adapter for memory/search | Memory embedding behavior belongs with the provider plugin |
+| 43 | `buildReplayPolicy` | Return a replay policy controlling transcript handling for the provider | Provider needs custom transcript policy (for example, thinking-block stripping) |
+| 44 | `sanitizeReplayHistory` | Rewrite replay history after generic transcript cleanup | Provider needs provider-specific replay rewrites beyond shared compaction helpers |
+| 45 | `validateReplayTurns` | Final replay-turn validation or reshaping before the embedded runner | Provider transport needs stricter turn validation after generic sanitation |
+| 46 | `onModelSelected` | Run provider-owned post-selection side effects | Provider needs telemetry or provider-owned state when a model becomes active |

 `normalizeModelId`, `normalizeTransport`, and `normalizeConfig` first check the
 matched provider plugin, then fall through other hook-capable provider plugins
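The check-the-matched-plugin-first, then-fall-through behavior that the docs describe for `normalizeModelId` and friends can be sketched like this (only the lookup order comes from the text; the plugin interface and helper are hypothetical):

```typescript
// Hypothetical minimal plugin shape: a hook may be absent, or may decline
// by returning undefined.
interface HookPlugin {
  id: string;
  normalizeModelId?(raw: string): string | undefined;
}

// Check the matched provider plugin first, then any other plugin that
// exposes the hook; keep the original ref if nobody claims it.
function normalizeModelId(
  raw: string,
  matched: HookPlugin,
  others: HookPlugin[],
): string {
  for (const plugin of [matched, ...others]) {
    const normalized = plugin.normalizeModelId?.(raw);
    if (normalized !== undefined) return normalized;
  }
  return raw;
}
```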
@@ -806,7 +808,7 @@ api.registerProvider({

 - Anthropic uses `resolveDynamicModel`, `capabilities`, `buildAuthDoctorHint`,
 `resolveUsageAuth`, `fetchUsageSnapshot`, `isCacheTtlEligible`,
-`supportsAdaptiveThinking`, `resolveDefaultThinkingLevel`, `applyConfigDefaults`, `isModernModelRef`,
+`supportsAdaptiveThinking`, `supportsMaxThinking`, `resolveDefaultThinkingLevel`, `applyConfigDefaults`, `isModernModelRef`,
 and `wrapStreamFn` because it owns Claude 4.6 forward-compat,
 provider-family hints, auth repair guidance, usage endpoint integration,
 prompt-cache eligibility, auth-aware config defaults, Claude
@@ -536,16 +536,17 @@ API key auth, and dynamic model resolution.
 | 32 | `isBinaryThinking` | Binary thinking on/off |
 | 33 | `supportsXHighThinking` | `xhigh` reasoning support |
 | 34 | `supportsAdaptiveThinking` | Adaptive thinking support |
-| 35 | `resolveDefaultThinkingLevel` | Default `/think` policy |
-| 36 | `isModernModelRef` | Live/smoke model matching |
-| 37 | `prepareRuntimeAuth` | Token exchange before inference |
-| 38 | `resolveUsageAuth` | Custom usage credential parsing |
-| 39 | `fetchUsageSnapshot` | Custom usage endpoint |
-| 40 | `createEmbeddingProvider` | Provider-owned embedding adapter for memory/search |
-| 41 | `buildReplayPolicy` | Custom transcript replay/compaction policy |
-| 42 | `sanitizeReplayHistory` | Provider-specific replay rewrites after generic cleanup |
-| 43 | `validateReplayTurns` | Strict replay-turn validation before the embedded runner |
-| 44 | `onModelSelected` | Post-selection callback (e.g. telemetry) |
+| 35 | `supportsMaxThinking` | `max` reasoning support |
+| 36 | `resolveDefaultThinkingLevel` | Default `/think` policy |
+| 37 | `isModernModelRef` | Live/smoke model matching |
+| 38 | `prepareRuntimeAuth` | Token exchange before inference |
+| 39 | `resolveUsageAuth` | Custom usage credential parsing |
+| 40 | `fetchUsageSnapshot` | Custom usage endpoint |
+| 41 | `createEmbeddingProvider` | Provider-owned embedding adapter for memory/search |
+| 42 | `buildReplayPolicy` | Custom transcript replay/compaction policy |
+| 43 | `sanitizeReplayHistory` | Provider-specific replay rewrites after generic cleanup |
+| 44 | `validateReplayTurns` | Strict replay-turn validation before the embedded runner |
+| 45 | `onModelSelected` | Post-selection callback (e.g. telemetry) |

 Prompt tuning note:

@@ -95,7 +95,7 @@ The media transcription path uses `/v1/audio/transcriptions`. The default audio
 | OpenClaw thinking level | Mistral `reasoning_effort` |
 | ------------------------------------------------ | -------------------------- |
 | **off** / **minimal** | `none` |
-| **low** / **medium** / **high** / **xhigh** / **adaptive** | `high` |
+| **low** / **medium** / **high** / **xhigh** / **adaptive** / **max** | `high` |

 <Note>
 Other bundled Mistral catalog models do not use this parameter. Keep using `magistral-*` models when you want Mistral's native reasoning-first behavior.
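The mapping table above collapses every non-`off`/`minimal` level to Mistral's `high`. As a sketch (the function name is hypothetical; the two output values come from the table):

```typescript
type ThinkingLevel =
  | "off" | "minimal" | "low" | "medium" | "high" | "xhigh" | "adaptive" | "max";

// Collapse OpenClaw's many levels onto Mistral's two-valued
// reasoning_effort, per the mapping table.
function toMistralReasoningEffort(level: ThinkingLevel): "none" | "high" {
  return level === "off" || level === "minimal" ? "none" : "high";
}
```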
@@ -10,20 +10,23 @@ title: "Thinking Levels"
 ## What it does

 - Inline directive in any inbound body: `/t <level>`, `/think:<level>`, or `/thinking <level>`.
-- Levels (aliases): `off | minimal | low | medium | high | xhigh | adaptive`
+- Levels (aliases): `off | minimal | low | medium | high | xhigh | adaptive | max`
   - minimal → “think”
   - low → “think hard”
   - medium → “think harder”
   - high → “ultrathink” (max budget)
   - xhigh → “ultrathink+” (GPT-5.2 + Codex models and Anthropic Claude Opus 4.7 effort)
   - adaptive → provider-managed adaptive thinking (supported for Claude 4.6 on Anthropic/Bedrock and Anthropic Claude Opus 4.7)
+  - max → provider max reasoning (currently Anthropic Claude Opus 4.7)
   - `x-high`, `x_high`, `extra-high`, `extra high`, and `extra_high` map to `xhigh`.
-  - `highest`, `max` map to `high`.
+  - `highest` maps to `high`.
 - Provider notes:
   - `adaptive` is only advertised in native command menus and pickers for providers/models that declare adaptive thinking support. It remains accepted as a typed directive for compatibility with existing configs and aliases.
+  - `max` is only advertised in native command menus and pickers for providers/models that declare max thinking support. Existing stored `max` settings are remapped to the largest supported level for the selected model when the model does not support `max`.
   - Anthropic Claude 4.6 models default to `adaptive` when no explicit thinking level is set.
   - Anthropic Claude Opus 4.7 does not default to adaptive thinking. Its API effort default remains provider-owned unless you explicitly set a thinking level.
   - Anthropic Claude Opus 4.7 maps `/think xhigh` to adaptive thinking plus `output_config.effort: "xhigh"`, because `/think` is a thinking directive and `xhigh` is the Opus 4.7 effort setting.
+  - Anthropic Claude Opus 4.7 also exposes `/think max`; it maps to the same provider-owned max effort path.
   - OpenAI GPT models map `/think` through model-specific Responses API effort support. `/think off` sends `reasoning.effort: "none"` only when the target model supports it; otherwise OpenClaw omits the disabled reasoning payload instead of sending an unsupported value.
   - MiniMax (`minimax/*`) on the Anthropic-compatible streaming path defaults to `thinking: { type: "disabled" }` unless you explicitly set thinking in model params or request params. This avoids leaked `reasoning_content` deltas from MiniMax's non-native Anthropic stream format.
   - Z.AI (`zai/*`) only supports binary thinking (`on`/`off`). Any non-`off` level is treated as `on` (mapped to `low`).
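The alias rules and the stored-`max` remapping described above can be sketched together. The alias table comes from the bullets; the "largest supported level" ladder is an assumption that the levels rank in the order the docs list them, and all names here are illustrative:

```typescript
type ThinkingLevel =
  | "off" | "minimal" | "low" | "medium" | "high" | "xhigh" | "adaptive" | "max";

// Aliases from the docs: xhigh spellings normalize, `highest` maps to
// `high`, and `max` is now a real level rather than an alias for `high`.
const ALIASES: Record<string, ThinkingLevel> = {
  "x-high": "xhigh", "x_high": "xhigh", "extra-high": "xhigh",
  "extra high": "xhigh", "extra_high": "xhigh",
  "highest": "high",
};

function parseLevel(raw: string): ThinkingLevel {
  const key = raw.trim().toLowerCase();
  return ALIASES[key] ?? (key as ThinkingLevel);
}

// Assumed ranking, following the documented listing order.
const ORDER: ThinkingLevel[] = [
  "off", "minimal", "low", "medium", "high", "xhigh", "adaptive", "max",
];

// When a stored level (e.g. `max`) is not supported by the selected model,
// fall back to the largest level that model does support.
function remapForModel(level: ThinkingLevel, supported: Set<ThinkingLevel>): ThinkingLevel {
  if (supported.has(level)) return level;
  for (let i = ORDER.indexOf(level) - 1; i >= 0; i--) {
    if (supported.has(ORDER[i])) return ORDER[i];
  }
  return "off";
}
```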
@@ -112,6 +115,6 @@ title: "Thinking Levels"
 - The picker stays provider-aware:
   - most providers show `off | minimal | low | medium | high`
   - Anthropic/Bedrock Claude 4.6 shows `off | minimal | low | medium | high | adaptive`
-  - Anthropic Claude Opus 4.7 shows `off | minimal | low | medium | high | xhigh | adaptive`
+  - Anthropic Claude Opus 4.7 shows `off | minimal | low | medium | high | xhigh | adaptive | max`
   - Z.AI shows binary `off | on`
 - `/think:<level>` still works and updates the same stored session level, so chat directives and the picker stay in sync.