fix: derive dynamic context-window guard thresholds

Derive context-window guard thresholds from the effective model window, keeping 10% hard-min and 20% warning ratios with 4k/8k floors. Stop the embedded runner from forcing old fixed guard overrides so runtime admission uses the dynamic resolver.

Validation:
- CI run 25151866833 passed, including build-artifacts and checks-node-channels.
- Parity gate 25151866868 passed.
- Testbox pnpm test:channels passed: 54 files / 433 tests.

Fixes #42999.

Prepared head SHA: 9c80383639
2026-05-06 16:20:43 +00:00 · 2026-04-30 02:33:43 -05:00
parent f0721452a8
commit 13e917e292
8 changed files with 147 additions and 41 deletions
--- a/docs/gateway/local-models.md
+++ b/docs/gateway/local-models.md
@@ -319,7 +319,7 @@ Compatibility notes for stricter OpenAI-compatible backends:
  OpenClaw process RSS/heap snapshot in diagnostics. For LM Studio/Ollama
  memory pressure, match that timestamp against the server log or macOS crash /
  jetsam log to confirm whether the model server was killed.
- OpenClaw warns when the detected context window is below **32k** and blocks below **16k**. If you hit that preflight, raise the server/model context limit or choose a larger model.
+- OpenClaw derives context-window preflight thresholds from the detected model window, or from the uncapped model window when `agents.defaults.contextTokens` lowers the effective window. It warns below 20% with an **8k** floor. Hard blocks use the 10% threshold with a **4k** floor, capped to the effective context window so oversized model metadata cannot reject an otherwise valid user cap. If you hit that preflight, raise the server/model context limit or choose a larger model.
 - Context errors? Lower `contextWindow` or raise your server limit.
 - OpenAI-compatible server returns `messages[].content ... expected a string`?
  Add `compat.requiresStringContent: true` on that model entry.
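
The threshold rule described in the updated doc line can be sketched roughly as follows. This is an illustrative reading, not the resolver shipped in this commit. The function and type names are hypothetical, "4k" and "8k" are assumed to mean 4,000 and 8,000 tokens, and the order of the floor and the cap is an assumption: here the 10% term is capped to the effective window first and the 4k floor is applied afterwards, so very small windows still block.

```typescript
// Hypothetical sketch of the dynamic guard thresholds; names and exact
// constants are assumptions, not OpenClaw's actual implementation.
const HARD_MIN_DIVISOR = 10; // 10% of the model window
const WARN_DIVISOR = 5; // 20% of the model window
const HARD_MIN_FLOOR_TOKENS = 4_000; // assumed meaning of "4k"
const WARN_FLOOR_TOKENS = 8_000; // assumed meaning of "8k"

interface GuardThresholds {
  effectiveWindow: number;
  hardMinTokens: number;
  warnBelowTokens: number;
}

type GuardStatus = "ok" | "warn" | "block";

// modelWindow: detected (uncapped) model context window.
// contextTokensCap: optional user cap, e.g. agents.defaults.contextTokens.
function deriveGuardThresholds(
  modelWindow: number,
  contextTokensCap?: number,
): GuardThresholds {
  const effectiveWindow =
    contextTokensCap !== undefined
      ? Math.min(modelWindow, contextTokensCap)
      : modelWindow;

  // Ratios are taken from the uncapped model window.
  const warnBelowTokens = Math.max(
    WARN_FLOOR_TOKENS,
    Math.floor(modelWindow / WARN_DIVISOR),
  );

  // Cap the 10% term to the effective window so a large model window
  // cannot reject a smaller user cap, then apply the floor (assumed order).
  const cappedRatio = Math.min(
    Math.floor(modelWindow / HARD_MIN_DIVISOR),
    effectiveWindow,
  );
  const hardMinTokens = Math.max(HARD_MIN_FLOOR_TOKENS, cappedRatio);

  return { effectiveWindow, hardMinTokens, warnBelowTokens };
}

// Assumed comparison: block strictly below the hard minimum,
// warn strictly below the warning threshold.
function evaluateGuard(t: GuardThresholds): GuardStatus {
  if (t.effectiveWindow < t.hardMinTokens) return "block";
  if (t.effectiveWindow < t.warnBelowTokens) return "warn";
  return "ok";
}
```

Under these assumptions, a 128,000-token model with no cap gets a 12,800-token hard minimum and a 25,600-token warning threshold, while a 1,000,000-token model capped to 50,000 tokens warns but is not blocked.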