Cloudflare challenge pages from chatgpt.com/backend-api can arrive as raw HTML without an HTTP status prefix. The transport sanitizer scanned for generic "dns" substrings before HTML detection, so these pages could surface as DNS lookup failures instead of the existing HTML/CDN block message.
Constraint: Must preserve DNS transport classification for real ENOTFOUND/getaddrinfo failures
Rejected: Treat every bare HTML document as an upstream HTML error | too broad for arbitrary model text/errors
Confidence: high
Scope-risk: narrow
Directive: Keep standalone HTML challenge detection ahead of generic transport keyword matching so CDN block pages do not regress into DNS copy
Tested: oxfmt --check on changed files; targeted node --import tsx verification for standalone Cloudflare HTML classification and DNS control case
Not-tested: Full Vitest shard run in this environment
Addresses review feedback: localeCompare without a fixed locale uses the
runtime default, which varies across servers. Pinning 'en' ensures
byte-identical prompts for cache stability. Applied at all three sort
points in workspace.ts.
Sort the merged skill entries by name before rendering into the
available_skills prompt block. Previously the order depended on
Map insertion order, which varies with the skills.load.extraDirs
config, causing identical deployments to produce different prompts
and bypass LLM prompt caching.
Two sort points added:
1. loadSkillEntries — canonical ordering at the source
2. resolveWorkspaceSkillPromptState — ensures prompt stability even
when callers pass pre-built entry arrays
Fixes #64167
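The deterministic ordering described above can be sketched as follows. This is a minimal illustration, not the code in workspace.ts; the SkillEntry shape and function name are assumptions.

```typescript
// Hypothetical SkillEntry shape; the real type lives in workspace.ts.
interface SkillEntry {
  name: string;
}

// Pinning the locale to "en" makes the comparator deterministic across
// runtimes, so identical deployments render identical prompt blocks
// regardless of the server's default locale.
function sortSkillEntries(entries: SkillEntry[]): SkillEntry[] {
  return [...entries].sort((a, b) => a.name.localeCompare(b.name, "en"));
}
```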
* fix(agents): preserve native Anthropic tool IDs for hybrid providers
Fixes #66892
MiniMax and other hybrid providers use api.minimaxi.com/anthropic
(modelApi: anthropic-messages), which generates and expects native
Anthropic tool_call_ids in toolu_* format. The hybrid replay policy
(buildHybridAnthropicOrOpenAIReplayPolicy) applied strict
sanitization that stripped underscores from these IDs, causing
MiniMax to reject them with error 2013.
The native Anthropic provider already preserved these IDs via
preserveNativeAnthropicToolUseIds (added in 4613f121ad). This
commit enables the same flag for the hybrid anthropic-messages
branch, so toolu_* IDs pass through unsanitized while other
synthetic IDs still get strict cleanup.
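The pass-through behavior can be sketched like this. The helper name, flag name, and the exact toolu_* pattern are assumptions modeled on the commit text, not the actual replay-policy code.

```typescript
// Native Anthropic tool IDs look like "toolu_01AbC..."; strict
// sanitization strips underscores, which hybrid providers such as
// MiniMax then reject (error 2013).
const NATIVE_ANTHROPIC_TOOL_ID = /^toolu_[A-Za-z0-9]+$/;

function sanitizeToolCallId(
  id: string,
  preserveNativeAnthropicToolUseIds: boolean,
): string {
  if (preserveNativeAnthropicToolUseIds && NATIVE_ANTHROPIC_TOOL_ID.test(id)) {
    return id; // pass native IDs through untouched
  }
  // Strict cleanup for synthetic IDs: keep only alphanumerics.
  return id.replace(/[^A-Za-z0-9]/g, "");
}
```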
* fix(agents): repair sanitized replay tool results before send
* fix: repair sanitized replay tool results before send (#67620) (thanks @stainlu)
* fix: preserve aborted-span tool results during replay sanitize (#67620) (thanks @stainlu)
---------
Co-authored-by: Ayaan Zaidi <hi@obviy.us>
* fix(agents): classify Cloudflare/CDN HTML error pages as transport failures
Fixes #67517
When a provider endpoint returns an HTML error page (e.g. Cloudflare
502/503/520-524), the pattern-based message classifiers would scan
the HTML body and misinterpret embedded text like "Rate limit
exceeded" as a structured rate_limit API error. This caused
incorrect failover behavior (profile rotation instead of clean
retry/fallback) and left the TUI stuck.
Two fixes:
1. classifyFailoverSignal now short-circuits on HTML responses
before running pattern matchers, returning "timeout" (transport
failure) so retry/fallback handles them correctly.
2. classifyProviderRuntimeFailureKind now detects HTML errors at
any status (not just 403), returning "upstream_html" for
non-403 statuses with a clear user-facing message about
CDN/gateway errors.
Adds regression tests covering Cloudflare 502/503 HTML with
embedded rate-limit text, 403 HTML (still classified as auth),
and JSON rate-limit responses (still classified correctly).
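The short-circuit ordering in fix 1 can be sketched as below. The detection heuristic and return values are assumptions for illustration; the real classifier lives in classifyFailoverSignal.

```typescript
// Detecting an HTML body first prevents pattern matchers from
// misreading embedded phrases like "Rate limit exceeded" inside a
// Cloudflare error page as a structured rate_limit API error.
function looksLikeHtmlErrorPage(body: string): boolean {
  const head = body.trimStart().slice(0, 256).toLowerCase();
  return head.startsWith("<!doctype html") || head.startsWith("<html");
}

function classifyFailoverSignalSketch(
  body: string,
): "timeout" | "rate_limit" | "unknown" {
  if (looksLikeHtmlErrorPage(body)) {
    // Transport failure: clean retry/fallback, not profile rotation.
    return "timeout";
  }
  if (/rate limit/i.test(body)) return "rate_limit";
  return "unknown";
}
```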
* fix: preserve auth and proxy HTML classification
* fix: classify HTML provider error pages correctly (#67642) (thanks @stainlu)
---------
Co-authored-by: Ayaan Zaidi <hi@obviy.us>
* fix(tools): expand tilde in host edit/write paths (non-workspace mode)
* test: use it.runIf for visible skip when tmpdir is not under home
* fix(tools): address Codex P2 review on tilde host edit/write
Responds to two P2 findings from chatgpt-codex-connector on #62804:
1. Tests never ran in CI. The it.runIf(tmpdirUnderHome) guard always
skipped on Linux runners where os.tmpdir() is /tmp, outside $HOME, so
the regression tests reported green without executing. Tmpdirs now use
the test-isolated HOME (process.env.HOME from test/test-env.ts) so
tests run in every environment and match what expandHomePrefix
resolves, keeping them hermetic.
2. Edit recovery path resolution was inconsistent. resolveEditPath
inlined os.homedir() for tilde expansion, bypassing OPENCLAW_HOME,
while the write/edit operations use expandHomePrefix. Under a custom
OPENCLAW_HOME, wrapEditToolWithRecovery's readback targeted a
different file than the edit actually touched, so successful edits
could be reported as failures. resolveEditPath now uses the same
expandHomePrefix helper.
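The shared expansion helper both paths now use can be sketched as follows. The OPENCLAW_HOME-first resolution matches the commit text, but the exact signature and env handling are assumptions.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Minimal sketch of a shared tilde expander: a custom OPENCLAW_HOME
// overrides the OS home directory, so edit/write operations and the
// recovery readback resolve "~/..." to the same file.
function expandHomePrefix(p: string): string {
  const home = process.env.OPENCLAW_HOME ?? os.homedir();
  if (p === "~") return home;
  if (p.startsWith("~/")) return path.join(home, p.slice(2));
  return p;
}
```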
* test(tools): verify tilde expansion honors OPENCLAW_HOME override
The prior tests covered tilde expansion but only under the default test
home, which matches os.homedir(). That passed whether the production code
used expandHomePrefix() or inlined os.homedir() — the behaviors only
diverge when OPENCLAW_HOME is set to a path outside $HOME.
Adds four tests that set OPENCLAW_HOME to a temp dir explicitly outside
$HOME and verify that write/mkdir/read/access tilde operations resolve
against OPENCLAW_HOME, not os.homedir(). These would fail if
pi-tools.read.ts or pi-tools.host-edit.ts reverted to os.homedir(),
directly covering the Codex P2 feedback about OPENCLAW_HOME consistency.
Uses the same env snapshot/restore pattern as test/helpers/temp-home.ts.
* Agents: resolve host tilde paths against OS home
* fix: align host tilde paths with OS home (#62804) (thanks @stainlu)
* fix: keep the changelog entry in the active block (#62804) (thanks @stainlu)
---------
Co-authored-by: Ayaan Zaidi <hi@obviy.us>
* fix: tighten trusted tool media passthrough
* changelog: tighten trusted tool media passthrough (#67303)
* address review: thread rawToolName into emitToolResultOutput and keep plugin-tool media passthrough
- Pass rawToolName through emitToolResultOutput params so the emit and
collect calls no longer reference an out-of-scope identifier
(ReferenceError on any verbose tool-output path).
- Widen builtinToolNames to all effective tool raw names for this run
(core + bundled/trusted plugin tools), so plugin tools on the trusted
media list still receive local MEDIA: passthrough. Admission-time
client-tool conflict check keeps using the core-only set so unrelated
plugin names do not spuriously reject client definitions; MEDIA
passthrough is still gated by the raw-name set, so a client tool that
normalize-collides with a plugin name cannot inherit its media trust.
- Add unit coverage for bundled-plugin raw-name passthrough and for
case-variant plugin-name collisions.
* drop redundant String() casts flagged by oxlint no-useless-cast
The names from effectiveTools, client tool function names, and the
existingToolNames iterable are already typed as string, so wrapping them
in String(...) adds nothing and trips oxlint's no-useless-cast rule.
* fix(openrouter): handle reasoning_details field in Qwen3 stream parsing
Add support for the reasoning_details field returned by OpenRouter/Qwen3
models. Previously this field was not recognized, causing payloads=0 and
incomplete turn errors.
- Add reasoning_details handling in processOpenAICompletionsStream
- Extract text from reasoning_details array items with type reasoning.text
- Treat as thinking content, similar to other reasoning fields
- Add test case for reasoning_details handling
Fixes #66833
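The extraction step can be sketched like this. The reasoning_details item shape (items with type "reasoning.text" carrying a text field) is inferred from the commit text and should be treated as an assumption.

```typescript
// Hypothetical shape of an OpenRouter/Qwen3 reasoning_details item.
interface ReasoningDetail {
  type: string;
  text?: string;
}

// Concatenate the text of "reasoning.text" items so the stream parser
// can treat them as thinking content, like other reasoning fields.
function extractReasoningText(details: ReasoningDetail[] | undefined): string {
  if (!details) return "";
  return details
    .filter((d) => d.type === "reasoning.text" && typeof d.text === "string")
    .map((d) => d.text as string)
    .join("");
}
```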
* fix(openrouter): keep tool calls with reasoning_details
* fix: handle OpenRouter Qwen3 reasoning_details streams (#66905) (thanks @bladin)
* fix: preserve streamed tool calls with reasoning deltas (#66905) (thanks @bladin)
---------
Co-authored-by: bladin <bladin@users.noreply.github.com>
Co-authored-by: Ayaan Zaidi <hi@obviy.us>
* fix(audio): restore allowPrivateNetwork for self-hosted STT endpoints
resolveProviderExecutionContext built the request object passed to
transcribeAudio using only sanitizeConfiguredProviderRequest on the
tool-level config and entry — which strips allowPrivateNetwork. The
provider-level request config (models.providers.*.request) was never
included in the merge, so allowPrivateNetwork:true was silently dropped.
Additionally, resolveProviderRequestPolicyConfig only read
allowPrivateNetwork from params.allowPrivateNetwork (a direct
parameter) and ignored params.request?.allowPrivateNetwork even when
it was present.
Fix both gaps:
- runner.entries.ts: use mergeModelProviderRequestOverrides with
sanitizeConfiguredModelProviderRequest(providerConfig?.request) so
models.providers.*.request.allowPrivateNetwork flows through to the
media execution context
- provider-request-config.ts: fall back to
  params.request?.allowPrivateNetwork when params.allowPrivateNetwork
  is undefined
Fixes #66691. Regression introduced in v2026.4.14.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
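The fallback in provider-request-config.ts can be sketched as below. Field and function names are assumptions modeled on the commit text.

```typescript
// Hypothetical params shape: a direct flag plus a nested request config.
interface RequestPolicyParams {
  allowPrivateNetwork?: boolean;
  request?: { allowPrivateNetwork?: boolean };
}

// The direct parameter wins when set; otherwise fall back to the
// nested request config, so a provider-level request override is no
// longer silently dropped.
function resolveAllowPrivateNetwork(params: RequestPolicyParams): boolean {
  return params.allowPrivateNetwork ?? params.request?.allowPrivateNetwork ?? false;
}
```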
* test(media-understanding): assert allowPrivateNetwork flows through resolveProviderExecutionContext
Regression test for the bug where providerConfig.request.allowPrivateNetwork
was dropped when building the AudioTranscriptionRequest passed to media
providers. Verifies that setting allowPrivateNetwork in the provider config
reaches the provider's request object after the fix to use
mergeModelProviderRequestOverrides + sanitizeConfiguredModelProviderRequest.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* test(media-understanding): tighten allowPrivateNetwork regression types
* fix: restore allowPrivateNetwork for self-hosted STT endpoints (#66692) (thanks @jhsmith409)
---------
Co-authored-by: Jim Smith <jhsmith0@me.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Ayaan Zaidi <hi@obviy.us>
* fix(context-engine): pass deferred maintenance token budget
Thread tokenBudget through the after-turn runtime context so background context-engine maintenance reuses the real model context window instead of falling back to 128k. Also pass through a best-effort currentTokenCount from the latest call total and make the runtime context type explicit about both fields.
Regeneration-Prompt: |
OpenClaw already passed the real context token budget into direct context-engine calls like afterTurn and assemble, but deferred maintain() reused only the runtimeContext object and that object did not carry tokenBudget. Lossless Claw therefore fell back to 128k during background maintenance, which made budget-trigger fire much more aggressively than the live model context warranted. Thread the real contextTokenBudget into buildAfterTurnRuntimeContext so deferred maintenance receives the same budget, and pass a straightforward best-effort currentTokenCount from the latest call total while the relevant data is already in scope. Keep the change additive, update the runtime-context type, and cover the background maintenance/runtime-context behavior with focused tests.
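The threading described above can be sketched as follows. The context shape, function name, and the 128k fallback constant are assumptions based on the commit text.

```typescript
// Hypothetical runtime-context shape for deferred maintenance.
interface AfterTurnRuntimeContext {
  tokenBudget: number;
  currentTokenCount?: number;
}

// The fallback the fix avoids: without an explicit budget, background
// maintenance previously assumed 128k regardless of the live model.
const DEFAULT_TOKEN_BUDGET = 128_000;

function buildAfterTurnRuntimeContext(
  contextTokenBudget: number | undefined,
  latestCallTotalTokens: number | undefined,
): AfterTurnRuntimeContext {
  return {
    tokenBudget: contextTokenBudget ?? DEFAULT_TOKEN_BUDGET,
    // Best-effort count from the latest call total, when in scope.
    currentTokenCount: latestCallTotalTokens,
  };
}
```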
* fix(context-engine): use prompt usage for deferred maintenance
* Docs: add Anthropic max_tokens investigation memo
Regeneration-Prompt: |
Investigate the reported OpenClaw cron isolated-agent failure where an
Anthropic Haiku run returned "max_tokens: must be greater than or equal to 1".
Do not implement a fix yet. Inspect the cron isolated-agent execution path,
the embedded runner, extra param plumbing, Anthropic transport code, and any
model-selection or token-budget logic that could synthesize maxTokens = 0.
Produce a concise maintainer memo with concrete file references, explain why
cron itself is not the component setting maxTokens, identify the most likely
root cause, describe the smallest repro shape, and recommend the cleanest fix.
* openclaw-e82: guard Anthropic Messages maxTokens
Regeneration-Prompt: |
Fix the Anthropic Messages path so OpenClaw never sends max_tokens <= 0
to Anthropic. Match the positive-number guard already used by the
Anthropic Vertex transport, but keep the change scoped: validate token
limits in src/agents/anthropic-transport-stream.ts where transport
options are resolved and where the final payload is assembled, fall back
to the model limit when a runtime override is zero, fail locally when no
positive token budget exists, and drop non-positive maxTokens from
src/agents/pi-embedded-runner/extra-params.ts so hidden config params do
not leak through. Add focused regression coverage for both the transport
and extra-param forwarding path, and remove the earlier investigation memo
from the branch so the PR diff only contains the fix.
* fix: scope Anthropic max token guard
* fix: document Anthropic max token guard
* fix: floor Anthropic max token overrides
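The guard, fallback, and floor behavior across these three commits can be sketched together. Function and parameter names are assumptions; the real validation lives in anthropic-transport-stream.ts.

```typescript
// Never send max_tokens <= 0 to the Anthropic Messages API: floor
// fractional overrides, treat non-positive values as absent, fall back
// to the model limit, and fail locally when no positive budget exists.
function resolveMaxTokens(
  override: number | undefined,
  modelLimit: number | undefined,
): number {
  const floored = override !== undefined ? Math.floor(override) : undefined;
  if (floored !== undefined && floored > 0) return floored;
  if (modelLimit !== undefined && modelLimit > 0) return modelLimit;
  throw new Error("No positive max_tokens budget available");
}
```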