* fix: harden package URL downloads
Guard package acceptance URL downloads with HTTPS-only validation, no embedded credentials, private/special-use DNS and IP rejection, manual redirect checks, bounded timeout/size limits, pinned lookup, and atomic temp-file writes. Add tooling tests for unsafe URLs, redirect validation, size limits, and successful writes.
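A minimal sketch of the static URL checks listed above (HTTPS-only, no embedded credentials, literal private/special-use IPv4 rejection). The helper names `isPrivateIPv4` and `checkPackageUrl` are hypothetical and not taken from the repository; DNS resolution, IPv6, redirects, and size/timeout limits are deliberately out of scope here.

```typescript
// Hypothetical helpers for illustration only; not the repository's implementation.

// Returns true for a dotted-quad IPv4 literal in a private or special-use range.
function isPrivateIPv4(host: string): boolean {
  const parts = host.split(".");
  if (parts.length !== 4) return false;
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  if (octets.some((o) => Number.isNaN(o) || o > 255)) return false;
  const [a, b] = octets;
  return (
    a === 0 ||
    a === 10 ||
    a === 127 ||
    (a === 169 && b === 254) ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168)
  );
}

// Returns a rejection reason, or null when the URL passes these static checks.
function checkPackageUrl(raw: string): string | null {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return "invalid URL";
  }
  if (url.protocol !== "https:") return "only https is allowed";
  if (url.username !== "" || url.password !== "") {
    return "embedded credentials are not allowed";
  }
  if (url.hostname === "localhost" || isPrivateIPv4(url.hostname)) {
    return "private or special-use host";
  }
  return null;
}
```

A hostname that passes these checks can still resolve to a private address, which is why the change also validates DNS results and pins the lookup.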
* fix: cancel redirect response bodies before closing dispatcher
ClawSweeper P2: the redirect branch in openPackageDownloadResponse cleared
the timeout and awaited dispatcher.close() without first cancelling
response.body. Undici's close() is graceful — it waits for in-flight
requests to complete — so a malicious redirect with a slow/never-ending
body could hang the hardened downloader.
Fix: call response.body?.cancel() before dispatcher.close() to abort the
redirect body immediately.
Test: add a regression test that uses a ReadableStream with an indefinite
interval to simulate a hanging body, and asserts cancel() was called.
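The ordering described above can be sketched with small stand-in types. `ResponseLike`, `DispatcherLike`, and `releaseResponse` are hypothetical names; the real code works with a fetch response and an Undici dispatcher, and may or may not await the cancel call.

```typescript
// Stand-in shapes for illustration; the real objects are a fetch Response
// and an Undici dispatcher.
type BodyLike = { cancel(): Promise<void> };
type ResponseLike = { body: BodyLike | null };
type DispatcherLike = { close(): Promise<void> };

async function releaseResponse(
  response: ResponseLike,
  dispatcher: DispatcherLike,
): Promise<void> {
  // Abort the body first so a graceful close() has no in-flight request to wait on.
  await response.body?.cancel();
  await dispatcher.close();
}

// Fakes that record call order.
const calls: string[] = [];
const fakeResponse: ResponseLike = {
  body: { cancel: async () => { calls.push("cancel"); } },
};
const fakeDispatcher: DispatcherLike = {
  close: async () => { calls.push("close"); },
};
const released = releaseResponse(fakeResponse, fakeDispatcher);
```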
Refs: clawsweeper review on PR #85512
* test: harden redirect body cancellation race in regression test
Guard the ReadableStream controller.enqueue() call with a cancelled
flag and try/catch to prevent ERR_INVALID_STATE when the interval
fires after cancel() closes the controller.
* fix: cancel final response body before closing dispatcher in downloadUrl
ClawSweeper P2: the HTTP-error and declared-oversize early-exit paths
in downloadUrl threw before consuming or canceling response.body. The
finally block then cleared the timeout and awaited graceful
dispatcher.close() with the body still open, allowing a slow/never-ending
response to hang release tooling.
Fix: add response.body?.cancel() in the finally block before
dispatcher.close().
Tests: add two regressions:
- HTTP 500 with slow body: asserts cancel() called before dispatcher close
- Declared content-length oversize with slow body: same assertion
* fix: add trusted package URL source policy
* fix: keep package URL resolver dependency-free
* test: cover encoded IPv6 package URL bypasses
* docs: sync package acceptance source overview
* docs: restore release doc formatting
* docs: sync package acceptance trusted-url source
* test: cover dotted IPv4 embedded IPv6 package URLs
* fix: parse dotted IPv4 embedded in IPv6 package URLs
* test: isolate anthropic pruning defaults
* test: move anthropic dated model coverage
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
* fix(exec-approvals): add .catch() to expiry delivery fire-and-forget
When exec-approval expiry fires, deliverToTargets is called as a
fire-and-forget promise with no .catch(). If delivery fails, the
rejection goes unhandled, the error is never logged, and the notification is lost.
Add .catch() with log.warn to match the ackDelivery error handling
pattern. Keep pending.delete() before the await (the entry is expired
regardless of delivery success).
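A sketch of the pattern, assuming a simplified signature: `expireApproval`, the `Logger` shape, and the log message are hypothetical. The sketch returns the chained promise only so the behavior can be observed; a fire-and-forget caller would discard it.

```typescript
type Logger = { warn(message: string): void };

function expireApproval(
  pending: Map<string, unknown>,
  id: string,
  deliverToTargets: () => Promise<void>,
  log: Logger,
): Promise<void> {
  // The entry is expired regardless of whether delivery succeeds.
  pending.delete(id);
  // Attach a rejection handler so a failed delivery is logged rather than unhandled.
  return deliverToTargets().catch((err: unknown) => {
    log.warn(`expiry delivery failed: ${String(err)}`);
  });
}
```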
Closes #83113
Signed-off-by: Sebastien Tardif <sebtardif@ncf.ca>
* fix(approvals): label expiry delivery errors by kind
---------
Signed-off-by: Sebastien Tardif <sebtardif@ncf.ca>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
* fix(doctor): skip empty entries and memoize routes in plugin session repairs
runPluginSessionStateDoctorRepairs called resolveConfiguredDoctorSessionStateRoute
once per session-store key, even for entries that carry no plugin route state
fields. On stores with many CLI sessions (observed ~800 entries), each call
takes ~1.5s due to resolveAgentHarnessPolicy walking config and provider
metadata, so the doctor's state-integrity contribution hangs for minutes
and the surrounding 'openclaw doctor' run effectively never completes.
scanEntryForOwner can only produce repair/manual-review findings when the
entry exposes one of the fields covered by entryMayContainPluginSessionRouteState
(providerOverride/modelOverride/agentHarnessId/cliSessionBindings/etc.), so
the route resolution for empty entries was pure waste. The route itself is
also a function of agentId (sessionKey is only used to derive agentId), so
sessions sharing an agent can reuse one resolved route.
Filter the store by entryMayContainPluginSessionRouteState before resolving,
and memoize resolveConfiguredDoctorSessionStateRoute by agentId within the
remaining entries. On the repro store this drops the contribution from
'never completes' to <100ms.
Adds a guard test that builds a 200-entry store with 2 route-state-carrying
entries and asserts (a) the repair fires exactly once on the codex owner
and (b) the run completes in under 2s (pre-fix would take >5 minutes).
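The filter-then-memoize shape can be sketched as follows. The entry fields are a subset of those named above; `mayContainRouteState` stands in for entryMayContainPluginSessionRouteState, and the `agent:<agentId>:<rest>` session-key layout is an assumption made for illustration.

```typescript
type SessionEntry = {
  providerOverride?: string;
  modelOverride?: string;
  agentHarnessId?: string;
};
type Route = { agentId: string };

// Stand-in for entryMayContainPluginSessionRouteState (checks a subset of fields).
function mayContainRouteState(entry: SessionEntry): boolean {
  return (
    entry.providerOverride !== undefined ||
    entry.modelOverride !== undefined ||
    entry.agentHarnessId !== undefined
  );
}

// Assumed key layout: "agent:<agentId>:<rest>".
function agentIdFromSessionKey(sessionKey: string): string {
  return sessionKey.split(":")[1] ?? "default";
}

function resolveRoutes(
  store: Record<string, SessionEntry>,
  resolveRoute: (agentId: string) => Route,
): Map<string, Route> {
  const byAgent = new Map<string, Route>();
  const bySession = new Map<string, Route>();
  for (const [sessionKey, entry] of Object.entries(store)) {
    // Skip entries that cannot produce a finding: no route resolution needed.
    if (!mayContainRouteState(entry)) continue;
    const agentId = agentIdFromSessionKey(sessionKey);
    let route = byAgent.get(agentId);
    if (route === undefined) {
      // The expensive call runs at most once per agent.
      route = resolveRoute(agentId);
      byAgent.set(agentId, route);
    }
    bySession.set(sessionKey, route);
  }
  return bySession;
}
```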
* fix(doctor): skip manifest model-id normalization in plugin session repairs
After the previous filter+memoize fix, runPluginSessionStateDoctorRepairs was
still ~38s on a 230-entry store because every scanned entry calls parseModelRef
on its runtime model. That implicitly enters manifest-driven model-id
normalization via normalizeStaticProviderModelId, which calls
loadPluginMetadataSnapshot when no current snapshot is bound to process state.
loadPluginMetadataSnapshot is filesystem-heavy and is only memoized when a
'current' snapshot is bound (it is not, during doctor), so each parseModelRef
call paid ~40ms of fresh plugin-metadata loading. 672 calls × ~40ms = ~27s
of doctor wall-clock, all of it useless for doctor's purposes: the scan only
needs the normalized provider id of the configured runtime/route to compare
against an owner's providerIds, never the manifest-normalized model id.
Pass allowManifestNormalization: false alongside the existing
allowPluginNormalization: false on all three parseModelRef call sites in
this file. normalizeStaticProviderModelId short-circuits to
normalizeBuiltInProviderModelId when allowManifestNormalization is false,
which is what doctor wants here.
On the same 230-entry store doctor:state-integrity drops from ~38s to ~2.4s
and total openclaw doctor wall-clock drops from ~91s to ~56s.
Consume the existing { text, changed } signal from
stripInlineDirectiveTagsForDisplay so unchanged text-parts keep their
references and the original message is returned when nothing was
stripped. Avoids spurious downstream rerenders/diff churn for consumers
relying on reference equality, and keeps the public SDK helper's text
output and message shape stable.
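A sketch of the reference-preserving behavior described above. `stripTags` is a hypothetical stand-in for stripInlineDirectiveTagsForDisplay, and both the `[[...]]` tag syntax and the message shape are assumptions.

```typescript
type TextPart = { type: "text"; text: string };
type Message = { role: string; content: TextPart[] };
type StripResult = { text: string; changed: boolean };

// Hypothetical stand-in: removes "[[...]]" directive tags and reports whether
// anything was removed.
function stripTags(text: string): StripResult {
  const stripped = text.replace(/\[\[[^\]]*\]\]/g, "");
  return { text: stripped, changed: stripped !== text };
}

function stripMessageForDisplay(message: Message): Message {
  const content: TextPart[] = [];
  let anyChanged = false;
  for (const part of message.content) {
    const result = stripTags(part.text);
    if (result.changed) {
      anyChanged = true;
      content.push({ ...part, text: result.text });
    } else {
      // Unchanged parts keep their original reference.
      content.push(part);
    }
  }
  // Return the original message object when nothing was stripped.
  return anyChanged ? { ...message, content } : message;
}
```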
Fixes #37589.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
openai-codex-responses can return turns where usage.output > 0 but
assistantTexts is empty (hidden reasoning tokens only). The empty
response retry guard only covered openai-completions, anthropic-messages,
and Ollama, so these turns passed through as successful completions
with no content delivered to the user.
Add the full openai-responses API family (openai-responses,
openai-codex-responses, azure-openai-responses, and their transport
variants) to RETRY_GUARD_MODEL_APIS so the empty response and
reasoning-only retry paths can fire for these providers.
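A sketch of how such a guard might be keyed on the API family. The set below lists only the API ids spelled out above; the Ollama id and the transport variants are omitted because their exact names are not given here. `Turn` and `shouldRetryEmptyResponse` are hypothetical.

```typescript
// Only the ids named in the text above; Ollama and transport variants omitted.
const RETRY_GUARD_MODEL_APIS: ReadonlySet<string> = new Set([
  "openai-completions",
  "anthropic-messages",
  "openai-responses",
  "openai-codex-responses",
  "azure-openai-responses",
]);

type Turn = { api: string; outputTokens: number; assistantTexts: string[] };

// True when a guarded API produced no visible assistant text, including turns
// that spent output tokens only on hidden reasoning.
function shouldRetryEmptyResponse(turn: Turn): boolean {
  if (!RETRY_GUARD_MODEL_APIS.has(turn.api)) return false;
  const hasVisibleText = turn.assistantTexts.some((t) => t.trim() !== "");
  return !hasVisibleText;
}
```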
Closes #85364
Signed-off-by: Sebastien Tardif <sebtardif@ncf.ca>
* fix(status): show configured cost for aws-sdk models
Decouple status cost display from provider auth mode so explicit model pricing is used for Bedrock and other non-api-key providers. Include cache read/write tokens in the status cost estimate and cover the behavior with regression tests.
* fix: show configured response usage costs
* docs: align configured cost visibility
* fix(status): keep usage tokens mode cost-free
---------
Co-authored-by: ItsOtherMauridian <165866613+ItsOtherMauridian@users.noreply.github.com>
Co-authored-by: ItsOtherMauridian <itsothermauridian@users.noreply.github.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
Defer Gateway channel startup until after readiness, remove startup model prewarm, and move model catalog data onto manifest/static paths so startup no longer loads broad provider runtimes.
Verification:
- focused gateway/catalog/auth/QA Vitest runs
- autoreview clean
- Blacksmith Testbox-through-Crabbox tbx_01ksahn65rsrsqz3q1qyxwf929: pnpm check:changed, exit 0
- PR CI green on ee2b631c72
* fix(gateway): normalize explicit state dir overrides at startup
* test(gateway): simplify state-dir startup coverage
* test: fix state dir startup coverage
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
Route cron announce topic target parsing through channel plugin target parsers instead of Telegram-specific cron core code. Keep supported Telegram topic forms in the Telegram plugin and document the channel-owned shorthand.
* fix(bootstrap): guard bootstrap name checks against undefined names
Add optional chaining to isAgentsBootstrapFile and isAgentsBootstrapName
to prevent TypeError: Cannot read properties of undefined (reading 'toLowerCase')
when bootstrap file entries have undefined name properties.
This crash was observed in 2026.5.20 where a workspace bootstrap file entry
with an undefined name caused every incoming message to fail during bootstrap
context building, completely blocking all agent replies.
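A sketch of the guard, assuming the check is a case-insensitive comparison against an `AGENTS.md` file name (the actual name matching in the repository may differ):

```typescript
// Assumption: the bootstrap file name is compared case-insensitively.
function isAgentsBootstrapName(name: string | undefined): boolean {
  // Optional chaining yields undefined for a missing name, so the comparison
  // is simply false instead of throwing a TypeError.
  return name?.toLowerCase() === "agents.md";
}
```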
Fixes #85523
* test(agents): cover unnamed bootstrap truncation entries
* test(agents): keep bootstrap truncation fixture typed
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
`waitForever()` is a public library export used by long-running embeds to
block until the host process is asked to exit. It called `interval.unref()`
on the keep-alive timer, which removes the timer from Node's active-handle
set. With no other ref'd handles, `await waitForever()` exits the process
in ~3ms with exit code 13 ("unsettled top-level await") instead of waiting.
Drop the `.unref()` so the interval actually keeps the loop alive, and
update the existing unit test (and comment) to lock in the new contract.
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
* fix(cli-output): ignore cumulative usage from result events in stream-json parser
Claude-cli's stream-json result event reports cumulative cache_read across
all tool sub-calls, not the per-call value. The parser was overwriting the
last assistant-event usage with this inflated sum, causing sessionEntry.totalTokens
to climb 6-13x on tool-heavy turns and trip the preemptive-compaction gate.
Fix: skip reading usage from result events in createCliJsonlStreamingParser,
keeping the last per-call usage from assistant events instead.
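A sketch of the selection rule, including the result-usage fallback added in the follow-up commit. The event and usage shapes are assumptions for illustration, and `pickTurnUsage` is a hypothetical name rather than the parser's actual API.

```typescript
type Usage = {
  input_tokens: number;
  output_tokens: number;
  cache_read_input_tokens?: number;
};
// Assumed event shapes for the stream-json output.
type StreamEvent = {
  type: string;
  message?: { usage?: Usage };
  usage?: Usage;
};

function pickTurnUsage(jsonl: string): Usage | undefined {
  let assistantUsage: Usage | undefined;
  let resultUsage: Usage | undefined;
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue;
    const event = JSON.parse(line) as StreamEvent;
    if (event.type === "assistant" && event.message?.usage) {
      // Per-call usage: the last assistant event wins.
      assistantUsage = event.message.usage;
    } else if (event.type === "result" && event.usage) {
      // Cumulative usage: remembered only as a fallback.
      resultUsage = event.usage;
    }
  }
  return assistantUsage ?? resultUsage;
}
```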
Fixes #85573
* fix(agents): keep Claude result usage as fallback
* fix(agents): read Claude assistant stream usage
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
Fixes #83883.
In `secrets configure`, the one-way-migration irreversibility warning was
computed from `opts.apply` (the original --apply flag) rather than
`shouldApply`. On the interactive path the user confirms "Apply this plan
now?", which sets shouldApply=true while opts.apply stays false, so the
warning was silently skipped and the irreversible plaintext migration was
applied without the second confirmation.
Derive the guard from shouldApply so the irreversibility warning fires on
both the --apply path and the interactive-confirm path. Adds regression
tests covering the interactive path (warning shown; declining it cancels
the apply).
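The corrected guard can be sketched as below. `needsIrreversibleWarning` and its parameters are hypothetical; only `opts.apply` and `shouldApply` come from the description above.

```typescript
type ConfigureOptions = { apply: boolean };

function needsIrreversibleWarning(
  opts: ConfigureOptions,
  confirmedInteractively: boolean,
  planHasOneWayMigration: boolean,
): boolean {
  // shouldApply covers both the --apply flag and the interactive confirmation.
  const shouldApply = opts.apply || confirmedInteractively;
  // Deriving the guard from opts.apply alone would miss the interactive path.
  return shouldApply && planHasOneWayMigration;
}
```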
* fix(agents/harness): pass CLI runtime aliases through to PI in selectAgentHarnessDecision
When a model defines `agentRuntime.id` as a CLI runtime alias
(`claude-cli`, `google-gemini-cli`) or a configured `cliBackends` id, the
explicit-non-`auto` branch of `selectAgentHarnessDecision` previously
threw `MissingAgentHarnessError` because the alias has no agent harness
plugin counterpart. Model dispatch is unaffected (the CLI-runtime
short-circuit in `assertModelFallbackCandidateHarnessAvailable` runs
first), but every non-dispatch caller — delivery-mirror metadata
lookups, lane preflight, channel projection — surfaces the throw. On
Slack `[[reply_to:]]` deliveries the warning text gets substituted into
the assistant message synthesized as `provider: openclaw,
model: gateway-injected`, poisoning the thread.
Mirror the existing implicit-codex escape hatch in the same function:
when the runtime is a CLI alias (`isCliRuntimeAlias`) or a configured
CLI backend (`isCliProvider`), return PI with the new
`selectedReason: "cli_runtime_passthrough_pi"`. Actual CLI dispatch is
already routed by callers that consult model runtime policy, so PI here
is just a transcript-composition placeholder — non-CLI typos still
throw as before.
Refs #85582.
* fix(agents): validate CLI harness aliases by provider
* fix(agents): keep custom CLI harness ids fail-closed
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
* docs(auth): document named OAuth profile logins
* feat(auth): support --profile-id in models auth login
* docs: note named model login profiles
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
Restores WebChat image uploads to the media-understanding flow without one-turn model overrides.
- removes image-model override plumbing from the reply run
- stages WebChat images as MediaPaths for enrichment
- avoids replaying already-understood images to text-only reply models while preserving undescribed images
Co-authored-by: NianJiuZst <3235467914@qq.com>
* feat(anthropic): migrate 1M context from beta to GA
Anthropic has graduated the 1M context window from beta to GA.
This commit:
- Stops injecting the context-1m-2025-08-07 beta header when
context1m: true is configured
- Removes the OAuth token skip logic that was needed because
Anthropic previously rejected the context-1m beta with OAuth auth
(OAuth now supports 1M natively)
- Strips the legacy beta header from user-configured anthropicBeta
arrays to prevent sending a stale header
- Removes the now-unused isAnthropic1MModel helper,
ANTHROPIC_1M_MODEL_PREFIXES constant, and logger import from
the stream wrappers
The context1m config param continues to be respected for context
window sizing in context.ts — only the beta header injection is
removed.
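A sketch of the header-stripping step, assuming `anthropicBeta` is a plain string array. `stripLegacyContext1mBeta` is a hypothetical name, and the second beta value in the usage note is a made-up placeholder.

```typescript
const LEGACY_CONTEXT_1M_BETA = "context-1m-2025-08-07";

// Drops the legacy 1M-context beta flag and leaves other configured betas intact.
function stripLegacyContext1mBeta(anthropicBeta: readonly string[]): string[] {
  return anthropicBeta.filter((beta) => beta.trim() !== LEGACY_CONTEXT_1M_BETA);
}
```

For example, `stripLegacyContext1mBeta(["context-1m-2025-08-07", "example-beta"])` keeps only the placeholder `"example-beta"` entry.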
Closes #45550 (Phase 1)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat(anthropic): migrate 1M context handling to GA
* fix(clownfish): address review for ghcrawl-156721-autonomous-smoke (1)
* fix(anthropic): restrict ga 1m context models
* docs(anthropic): align ga 1m context guidance
* fix(anthropic): normalize ga 1m model metadata
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: vincentkoc <25068+vincentkoc@users.noreply.github.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
* fix(twitch): preserve newer message handler during cleanup
Fixes #83888.
`TwitchClientManager.onMessage` returns a cleanup closure that called
`messageHandlers.delete(key)` unconditionally. When a second onMessage()
for the same account replaced the handler, running the earlier cleanup
deleted the newer handler, leaving the account with no handler and
silently dropping all inbound messages.
Guard the delete with a referential check so the cleanup only removes
the handler it registered. Adds regression tests covering both the
stale-cleanup case (newer handler must survive) and the normal case
(current handler is still removed).
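A sketch of the referential guard, using a hypothetical `HandlerRegistry` class in place of `TwitchClientManager`. It compares by function identity, so it would not distinguish the same function registered twice.

```typescript
type MessageHandler = (text: string) => void;

class HandlerRegistry {
  private readonly messageHandlers = new Map<string, MessageHandler>();

  onMessage(key: string, handler: MessageHandler): () => void {
    this.messageHandlers.set(key, handler);
    return () => {
      // Remove only the handler this registration installed; a newer
      // handler under the same key must survive a stale cleanup.
      if (this.messageHandlers.get(key) === handler) {
        this.messageHandlers.delete(key);
      }
    };
  }

  has(key: string): boolean {
    return this.messageHandlers.has(key);
  }
}
```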
* fix(twitch): distinguish handler registrations
* fix(signal): avoid dangling test export name
* test(meeting-notes): use public sdk imports
* test(sdk): classify meeting-notes subpath
* fix(discord): keep channel entrypoint imports narrow
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
Use the passive backend Gateway client for implicit local logs reads, and route Linux follow-mode local RPC failures to a bounded/redacted active systemd journal fallback instead of stale configured-file logs.
Fixes #83656
Fixes #66841
Summary:
- The branch adds a config-aware tool auth helper, routes image/PDF/media generation preflight and list selection through it, threads `workspaceDir`, and adds focused regression tests plus a changelog entry.
- Reproducibility: yes, by source inspection. Current main gates affected media/PDF/generation preflight paths on env/profile auth while the runtime auth contract already accepts usable `models.providers.*.apiKey`.
Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(tools): fall back to config apiKey in capability preflight
- PR branch already contained follow-up commit before automerge: fix(tools): honor config apiKey in media tool preflight
- PR branch already contained follow-up commit before automerge: fix(clawsweeper): address review for automerge-openclaw-openclaw-8557…
Validation:
- ClawSweeper review passed for head b8c9242d77.
- Required merge gates passed before the squash merge.
Prepared head SHA: b8c9242d77
Review: https://github.com/openclaw/openclaw/pull/85570#issuecomment-4523770355
Co-authored-by: Mason Huang <masonxhuang@tencent.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: hxy91819
Co-authored-by: hxy91819 <8814856+hxy91819@users.noreply.github.com>
Summary:
- Adds an optional archive-error callback for session transcript archiving, wires `/new` reset rotation to log previous-transcript archive failures, adds regression coverage, and updates the changelog.
- Reproducibility: yes, source-reproducible. Current main catches and ignores `archiveFileOnDisk` failures ins ... and the source PR proof exercises the same rename failure boundary with a real filesystem permission error.
Automerge notes:
- No ClawSweeper repair was needed after automerge opt-in.
Validation:
- ClawSweeper review passed for head 9d5f4c0c70.
- Required merge gates passed before the squash merge.
Prepared head SHA: 9d5f4c0c70
Review: https://github.com/openclaw/openclaw/pull/85586#issuecomment-4523917139
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: takhoffman
Co-authored-by: takhoffman <781889+takhoffman@users.noreply.github.com>