Commit Graph

4590 Commits

Author SHA1 Message Date
Vincent Koc
a231ab8acf fix(e2e): resolve macOS Parallels VM 2026-06-15 08:27:48 +08:00
Vincent Koc
a7e0822a1a fix(e2e): resume restored Parallels snapshots 2026-06-15 08:05:39 +08:00
Vincent Koc
47ec5be9ef fix(docker): seed prune store from lockfile 2026-06-15 07:35:19 +08:00
Ayaan Zaidi
d498b1cce4 fix(plugin-sdk): expose delivery hints without utility imports 2026-06-14 18:18:20 +05:30
Dallin Romney
1affe4fcdf Fold Telegram RTT sampling into live QA evidence (#92550)
* refactor(qa): fold telegram rtt into live evidence

* test: default package telegram rtt samples

* refactor(qa-lab): fold telegram rtt into live evidence

* fix(qa-lab): keep package telegram rtt optional for focused runs

* fix(qa-lab): avoid stale rtt evidence on failed samples

* fix(qa-lab): pass telegram live env into credential leasing

* fix(qa-lab): update telegram canary remediation artifacts

* docs(qa): remove stale telegram observed artifact guidance

* fix(qa-lab): clarify telegram empty-reply remediation

* fix(qa-lab): honor telegram rtt timeout

* ci(qa): drop stale telegram capture env

* refactor: align telegram evidence coverage fields

* fix: ignore stale telegram observed artifacts

* fix: preserve telegram rtt coverage mapping

* fix: omit unused telegram rtt catch binding

* docs: document telegram rtt check selector
2026-06-14 17:02:33 +08:00
Peter Steinberger
19130e0dc2 fix(providers): quarantine unreadable Anthropic payload tools (#92908)
Quarantine unreadable and invalid Anthropic-family tool schemas before OpenAI-compatible serialization, keep tool choices aligned with surviving tools, and preserve provider metadata.

Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
2026-06-14 00:27:48 -07:00
Dallin Romney
a3e9dfee0e Simplify QA scorecard mapping shape (#92558)
* test(qa): simplify scorecard mapping shape

* test(qa): use typed scorecard evidence refs

* test(qa): map scorecard categories by coverage id

* feat: align qa coverage with taxonomy features

* refactor: keep qa coverage ids canonical

* refactor: minimize qa coverage id churn

* test: align qa coverage id assertions

* test: update qa evidence coverage expectations

* refactor qa taxonomy coverage ids

* style qa taxonomy coverage ids

* test qa coverage lint fix

* test qa coverage type fix
2026-06-14 00:16:33 -07:00
Jason (Json)
8ae1adfdcc ci: gate stable releases on Windows companion assets (#92555)
* ci: gate stable releases on Windows companion assets

* fix(release): reject malformed Windows checksum manifests

* fix(release): make Windows recovery fail closed

* fix(release): tighten Windows asset identity checks

* fix(release): validate prepared candidate tarballs

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-06-13 19:33:33 -07:00
Vincent Koc
e4313bac97 fix(ui): localize workspace file rail labels 2026-06-14 06:01:30 +08:00
Peter Steinberger
735f59af73 feat(providers): add GLM-5.2 support (#92796)
* feat(providers): add GLM-5.2 support

* ci(live): add GLM-5.2 provider shard
2026-06-13 14:33:28 -07:00
Ayaan Zaidi
e8b142feb1 refactor(telegram): remove native draft previews 2026-06-13 21:45:22 +05:30
Vincent Koc
45056a463a fix(test): extend watchdog for gateway core shard 2026-06-13 23:01:11 +08:00
Vincent Koc
27e24ca683 fix(test): extend watchdog for slow vitest shards 2026-06-13 21:37:57 +08:00
Mason Huang
eaeedbf1f9 fix(docs): finalize i18n postprocess before skip (#92668)
Summary:
- Merged fix(docs): finalize i18n postprocess before skip after ClawSweeper review.

Automerge notes:
- No ClawSweeper repair was needed after automerge opt-in.

Validation:
- ClawSweeper review passed for head ad79445835.
- Required merge gates passed before the squash merge.

Prepared head SHA: ad79445835
Review: https://github.com/openclaw/openclaw/pull/92668#issuecomment-4698629026

Co-authored-by: Mason Huang <masonxhuang@tencent.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: hxy91819
Co-authored-by: hxy91819 <8814856+hxy91819@users.noreply.github.com>
2026-06-13 13:17:03 +00:00
Vincent Koc
73aabcceda fix(test): split local full-suite shards when throttled 2026-06-13 19:36:35 +08:00
BunsDev
7bd533a80e Revert "chore(maint): make PR changelog edits release-only (#92607)"
This reverts commit 4640baa299.
2026-06-13 03:10:10 -05:00
Val Alexander
4640baa299 chore(maint): make PR changelog edits release-only (#92607) 2026-06-13 02:57:08 -05:00
Patrick Erichsen
6cf06e8e7e ci: split plugin ClawHub publishing paths
* feat: partition clawhub plugin release candidates

* fix: read clawhub trusted publisher config endpoint

* feat: split clawhub plugin bootstrap workflow

* ci: split plugin clawhub publish paths

* ci: pin clawhub package publish workflow

* ci: keep clawhub bootstrap token out of builds

* ci: fix clawhub release dry-run gating

* ci: align clawhub oidc publish refs

* ci: make clawhub bootstrap recovery idempotent

* ci: route clawhub repair candidates through bootstrap

* ci: preserve tideclaw alpha clawhub guards

* ci: simplify clawhub release ref handling

* ci: extract clawhub release routing plan

* ci: extract clawhub release runtime state

* test: guard clawhub release helper executability

* ci: pin ClawHub CLI for plugin publishing

* ci: allow historical ClawHub dry-run validation

* ci: fix ClawHub bootstrap token handoff
2026-06-12 20:16:06 -07:00
Dallin Romney
ded3a93058 fix(e2e): keep lifecycle timeout cleanup alive (#92566) 2026-06-12 18:52:34 -07:00
Peter Steinberger
8c7e5c6918 feat(moonshot): add Kimi K2.7 Code support (#92554)
* feat(moonshot): add Kimi K2.7 Code support

* test(moonshot): surface K2.7 live provider errors

* ci(live): accept Kimi key for Moonshot sweeps

* test(moonshot): verify K2.7 across API regions
2026-06-12 17:37:28 -07:00
Dallin Romney
4809ac70fa Add QA evidence artifact output (#91484)
* feat: add qa evidence summary normalization

* chore: rename qa evidence target environment

* chore: align qa evidence profile terminology

* chore: align qa evidence summary fields

* chore: add qa evidence taxonomy ref

* test: remove stale multipass evidence example

* test(qa): normalize vitest and playwright evidence

* test(qa): slim evidence summary metadata

* test(qa): clarify evidence summary inputs

* test(qa): rename scenario specs in evidence flow

* test(qa): treat evidence profiles as mapping strings

* test(qa): use neutral evidence test identity

* test(qa): nest evidence summary joins

* refactor(qa): normalize live evidence summaries

* fix(qa): accept normalized telegram rtt summaries

* fix(qa): normalize evidence lane summaries

* fix(qa): align evidence summaries with requirements

* refactor(qa): tighten evidence summary builders

* refactor(qa): restore standard evidence ids

* fix(qa): keep legacy summaries out of rtt evidence

* refactor(qa): make package evidence provenance explicit

* test(qa): keep script tests out of qa lab internals

* refactor(qa): rename scenario evidence definitions

* refactor(qa): clean evidence summary wording

* test(qa): fix evidence summary test inputs

* refactor(qa): simplify evidence identity fields

* refactor(qa): tighten evidence summary inputs

* refactor(qa): rename evidence artifact
2026-06-12 16:12:58 -07:00
Josh Avant
f3eb8e9714 Fix OTLP log trace correlation (#92276)
* fix diagnostics otel log trace correlation

* test diagnostics trace provenance contract
2026-06-12 10:54:21 -05:00
Shakker
81c553e2fb fix: stop docker build commands by pid and group 2026-06-12 15:16:00 +01:00
Galin Iliev
301213a05f test(sqlite): add state perf query plan harness
Adds a SQLite state query-plan regression test and smoke benchmark, wires the smoke artifact into source performance evidence, validates SQLite smoke output in the performance summary, and removes a retired ClawHub nav entry that broke docs link checks.

Fixes #91616
2026-06-11 14:49:26 -07:00
Patrick Erichsen
9827490f5f fix: rely on ClawHub plugin publish checks 2026-06-11 11:51:57 -07:00
mushuiyu_xydt
777f7409d8 fix(installer): stop after failed Node package installs
Linux Node package-manager setup/install failures now fail the installer immediately instead of falling through to a misleading success path. Adds regression coverage for NodeSource setup and apt nodejs install failures under conditional shell invocation.\n\nFixes #73837\n\nProof: bash -n scripts/install.sh; node scripts/run-vitest.mjs test/scripts/install-sh.test.ts; node scripts/run-oxlint.mjs test/scripts/install-sh.test.ts; git diff --check origin/main...HEAD; autoreview clean; Azure Crabbox check:changed cbx_6286dc1e287b passed.
2026-06-11 22:58:43 +09:00
clawsweeper[bot]
2bec2caf0c fix(channel): harden local setup trust (#92175)
Summary:
- The PR extends channel setup trust enforcement and trusted catalog fallback from workspace-origin plugins to ... nfigured load paths into catalog discovery, and adds focused regression plus Docker/package proof coverage.
- PR surface: Source +190, Tests +892, Other +324. Total +1406 across 13 files.
- Reproducibility: yes. The source PR provides a concrete clean-main Docker/package path where an explicitly t ... ns unresolved, while the patched package resolves it and still blocks untrusted module and setup execution.

Automerge notes:
- PR branch already contained follow-up commit before automerge: fix(channel): stabilize trusted catalog dts typing
- PR branch already contained follow-up commit before automerge: fix(channel): repair trusted catalog exclusions typing
- PR branch already contained follow-up commit before automerge: test(channel): cover local channel plugin trust
- PR branch already contained follow-up commit before automerge: chore(deps): refresh plugin shrinkwraps
- PR branch already contained follow-up commit before automerge: test(channel): route trust regression in command shard
- PR branch already contained follow-up commit before automerge: test(channel): remove e2e-named trust regression

Validation:
- ClawSweeper review passed for head eabee04d54.
- Required merge gates passed before the squash merge.

Prepared head SHA: eabee04d54
Review: https://github.com/openclaw/openclaw/pull/92175#issuecomment-4680798117

Co-authored-by: Mason Huang <masonxhuang@tencent.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: clawsweeper <274271284+clawsweeper[bot]@users.noreply.github.com>
Co-authored-by: clawsweeper[bot] <274271284+clawsweeper[bot]@users.noreply.github.com>
Approved-by: hxy91819
Co-authored-by: hxy91819 <8814856+hxy91819@users.noreply.github.com>
2026-06-11 13:48:41 +00:00
Shakker
f7ee25291a chore: remove redundant proof scripts 2026-06-11 14:29:12 +01:00
Vincent Koc
6fb0c940fa fix(release): gate beta publish on plugin verification
Delay public GitHub release publication until postpublish verification, dependency evidence upload, proof append, and required plugin publish gates pass.

Also updates release-maintainer instructions so newly publishable plugins are minted/prepublished through an owner-approved path without consuming the next auto-bumped beta version unless that path is the actual release publish.
2026-06-11 20:42:58 +09:00
Lu Wang
43b4e27699 fix(thinking): apply Claude profile to anthropic-messages catalog rows (#92053)
* fix(thinking): apply Claude profile to anthropic-messages catalog rows

When a custom provider (e.g. `jdcloud-anthropic`) fronted Claude Opus over
the native anthropic-messages adapter, `--thinking xhigh` was silently
clamped to `off`. The thinking-profile dispatcher resolves bundled plugin
policy surfaces by exact provider id, so a renamed Anthropic-compatible
provider never reached the anthropic plugin's policy and `xhigh` was not
in the resulting profile.

`auto-reply/thinking.ts` already had a fallback keyed on
`context.api === "anthropic-messages"` that attached
`CLAUDE_FABLE_5_THINKING_PROFILE` for Fable models. Generalize it to use
`resolveClaudeThinkingProfile(modelId, params)` instead — the same
canonical helper the anthropic plugin uses — which still returns the Fable
profile for Fable models and now returns the correct Opus 4.7/4.8 profile
(with `xhigh`/`adaptive`/`max`) for Claude Opus regardless of provider id.

Non-Claude models on anthropic-messages routes still get the base
profile, and a Claude id on a non-Anthropic transport (e.g. an
openai-completions catalog row) is unaffected.

Fixes #91975

* fix(thinking): match native Anthropic includeNativeMax in fallback

Address ClawSweeper P2 review on #92053. The anthropic-messages fallback
in `resolveThinkingProfile` calls `resolveClaudeThinkingProfile` but
omits the `{ includeNativeMax: true }` option that the bundled anthropic
plugin uses (extensions/anthropic/provider-policy-api.ts:38,45).

For native-xhigh Claude families (Opus 4.7/4.8) this had no effect since
the native-xhigh branch already exposes `max`. But adaptive Claude
families that take the adaptive-default branch (e.g. claude-sonnet-4-6,
claude-opus-4-6) silently lost `max` parity on custom anthropic-messages
providers compared to native Anthropic policy.

Also add a regression test on `claude-sonnet-4-6` that verifies the
adaptive-branch path keeps `max` for custom providers.

* docs(thinking): document deliberate compat.xhigh bypass on anthropic-messages

Self-review surfaced a subtle behavior change worth documenting: when the
anthropic-messages fallback was generalized, non-Claude models on this
transport stop honoring catalog `compat.supportedReasoningEfforts: ["xhigh"]`
because they take the Claude base profile instead of falling through to the
later `catalogSupportsXHigh` upgrade path.

This is intentional — anthropic-messages does not carry a generic xhigh
contract; xhigh on this protocol is a Claude-family capability. Add an
inline comment at the resolver site and a regression test that locks the
suppression so the next reader (or a future patch) doesn't accidentally
restore the upgrade path.

* fix(thinking): extract Claude profile to leaf to break import cycle

The previous commits added a `resolveClaudeThinkingProfile` import from
`auto-reply/thinking.ts` to `plugin-sdk/provider-model-shared.ts`. The
shared barrel re-exports `provider-replay-helpers` and `plugins/types`,
which transitively reach back into `auto-reply` via the gateway server
methods chain — creating the madge cycle reported by
`check:madge-import-cycles`:

    auto-reply/thinking.ts
      -> ... -> plugin-sdk/provider-model-shared.ts
      -> plugins/{config-schema, host-hooks, ...} -> plugins/types.ts

Move `BASE_CLAUDE_THINKING_LEVELS`, `isClaudeAdaptiveThinkingDefaultModelId`,
and `resolveClaudeThinkingProfile` to a new leaf module
`src/plugins/provider-claude-thinking.ts` whose only imports are
`@openclaw/llm-core` and the existing leaf `provider-thinking.types`.

`provider-model-shared.ts` continues to re-export both helpers so existing
consumers (`extensions/anthropic/*`, the public test surface) are
unaffected. `auto-reply/thinking.ts` now imports the leaf directly,
breaking the cycle.

* test(thinking): add live proof harness for #91975 anthropic-messages clamp

---------

Co-authored-by: wanglu241 <wanglu241@jd.com>
2026-06-11 19:24:46 +09:00
Lu Wang
43b1088962 fix(cli-runner): scope claude-cli queue to live-session owner identity (#91946) (#91974)
* fix(cli-runner): scope claude-cli queue to live-session owner identity

Fresh claude-cli runs without a stored cliSessionId previously collapsed
onto a single workspace-scoped queue key, serializing all fan-out within
one workspace regardless of subagent lane configuration.

Replace the workspace fallback with the same owner identity that
claude-live-session.ts already uses for its live-session map
(agentAccountId + agentId + authProfileId + sessionId + sessionKey),
keeping per-session resume safety while letting independent OpenClaw
sessions in the same workspace run concurrently.

Refactor buildClaudeLiveKey() to share the new buildClaudeOwnerKey()
helper so the queue key and the live-session key cannot drift.

Refs: #91946

* test(cli-runner): pin owner-key hash + document buildClaudeOwnerKey contract

Add a golden-hash regression test for buildClaudeOwnerKey using the
exact legacy fixture, so a future refactor that reorders fields or
flips the JSON encoding can't silently orphan every deployed Claude
live session at upgrade. Hash verified empirically against the prior
inline sha256(JSON.stringify(...)) in buildClaudeLiveKey.

Add a JSDoc on buildClaudeOwnerKey explaining the cross-module contract
between the CLI run queue and the live-session map.

Refs: #91946

* docs(cli-runner): tighten buildClaudeOwnerKey contract comment

The previous comment claimed an encoding mismatch would orphan deployed
live sessions across upgrades. The Claude live-session registry is
process-local, so any restart already discards every entry — the real
invariant is that the queue path and live-session path produce
byte-identical owner keys *within a single process*, so a fresh queued
turn picks up the same live session the registry already holds. Update
the helper docstring and the golden-hash test description accordingly;
the pinned hash and behavior are unchanged.

* test(cli-runner): add owner-key concurrency demo script

A pure-Node, no-test-runner demo that reproduces the PR-head queue
behavior end-to-end: BEFORE-PR collapse (workspace lane), distinct-owner
overlap, and identical-owner serialization, all in one run with
millisecond-stamped event ordering. Useful as a low-overhead regression
check for the owner-key contract and as a maintainer-runnable proof
artifact for #91946.

* test(cli-runner): satisfy oxlint curly + no-promise-executor-return

Wrap single-statement if/for-of bodies in braces and rewrite the
sleep helper so its Promise executor is a void block instead of an
arrow with an implicit return. No behavior change; demo output and
the byte-equivalent slice fingerprints are unchanged.

---------

Co-authored-by: wanglu241 <wanglu241@jd.com>
2026-06-11 19:24:39 +09:00
Vincent Koc
f1401b2cac perf(ci): isolate Docker tooling tests 2026-06-11 18:13:36 +09:00
Dallin Romney
c7ed990769 fix: preserve non-oneOf schema array order (#91891) 2026-06-10 15:58:30 -07:00
Shakker
a450ff036a fix: repair origin main CI failures 2026-06-10 23:49:41 +01:00
Vincent Koc
a7b0d325af fix(sessions): repair shipped stale transcript paths 2026-06-11 07:30:34 +09:00
Vincent Koc
0923ee251e fix(sessions): rewrite migrated transcript paths 2026-06-11 06:41:37 +09:00
Vincent Koc
2d404f1b86 fix(release): align survivor session migration assertion 2026-06-11 05:57:55 +09:00
anagnorisis2peripeteia
f1f00cbf1d fix: capture cron wake origin session
Capture the originating sessionKey and agentId for cron wake tool calls so non-main session and multi-agent wakes return to the conversation lane that requested them.

Carry stored delivery context through queued wake events so topic/thread replies route correctly, while preserving the default no-origin wake behavior and explicit target:none opt-out.

Refs #46886.
Refs #64556.
Thanks @anagnorisis2peripeteia.

Co-authored-by: Cameron Beeley <cameron.beeley@gmail.com>
2026-06-10 20:52:40 +01:00
Vincent Koc
e6b0a22f36 test(update): align corrupt plugin repair guidance 2026-06-11 04:33:25 +09:00
Shakker
d7f211b2b4 test: clarify usage cache refresh proof 2026-06-10 19:25:35 +01:00
Edward Abrams
e6e27c354c perf(usage-cost-cache): throttle full-cache rewrites during refresh
refreshCostUsageCache rewrote the entire .usage-cost-cache.json after every
single scanned session. With ~8190 stale session files and a 108MB cache, that
was O(N * cacheSize) work — sustained CPU burn from repeated 100MB+
JSON.stringify + atomic-replace cycles every refresh.

Checkpoint policy now batches durable writes:
  - At most one rewrite per 256 scanned files
  - Or one rewrite per 5s of wall time
  - Final write only when something actually changed (no-op refresh on a
    fully-fresh cache no longer rewrites the file)

Crash safety is preserved: an interrupted refresh still has a recent
checkpoint on disk, and the next run rescans only the unfinished tail
(file size + mtime + pricingFingerprint match).

Validation:
  - pnpm vitest run src/infra/session-cost-usage.test.ts (39/39 pass)
  - New test 'throttles cache writes during a large stale refresh' confirms
    cache renames stay below sessionCount/4 (was ~sessionCount+1) and that
    a no-op refresh issues zero cache writes.
  - pnpm check:changed (clean)

Beads: openclaw-0zr
2026-06-10 19:25:35 +01:00
Vincent Koc
fbb1eab4c7 fix(release): raise package scan headroom 2026-06-11 03:07:13 +09:00
Vincent Koc
43bbde4830 fix(build): respect PATH-less pnpm environments 2026-06-10 18:06:31 +09:00
Vincent Koc
3b13d6ae38 fix(build): fall back to Corepack for pnpm 2026-06-10 18:03:15 +09:00
Vincent Koc
0948bd648a test(e2e): widen kitchen sink RPC coverage 2026-06-10 16:42:47 +09:00
Vincent Koc
7f1d82ab25 revert(sessions): defer session metadata sqlite
Reverts 538d36eaaa while preserving subsequent main changes. The beta-only SQLite downgrade rescue and reverse migration remain excluded.
2026-06-10 16:34:06 +09:00
Vincent Koc
a3d5e5bc72 fix(test): support macOS Bash 3 script suites 2026-06-10 15:37:15 +09:00
brokemac79
de4b8d8ebf feat(plugins): allow installed trusted policy contracts
Allow explicitly enabled installed plugins to register declared trusted tool policies and agent tool result middleware, with trusted policy ids scoped by plugin owner.\n\nVerification covered targeted plugin/agent tests, typecheck, build, lint, local autoreview, and a Blacksmith Testbox runtime proof (tbx_01ktr1nq0rhq47fjkwrepm7fd3).
2026-06-10 16:18:23 +10:00
Vincent Koc
52bc2a12bc fix(ci): disable memory slot in release smoke config 2026-06-10 14:56:21 +09:00
Vincent Koc
b4cdd92119 fix(codex): avoid guardian review for local models (#88630)
* fix(codex): avoid guardian review for local models

* fix(codex): route app-server auto exec review

* fix(codex): make guardian requirements provider-aware

* fix(codex): block unrouted bound approvals

* fix(channels): satisfy ingress queue lint

* fix(codex): use local-model policy for side forks

* fix(extensions): satisfy ingress lint

* fix(codex): require trusted exec reviewer model

* fix(exec): share control command approval guards

* fix(codex): fail closed for unknown guardian model provider

* fix(codex): reject custom exec reviewer endpoints

* fix(codex): preserve bound providers on app-server reuse

* fix(codex): prefer qualified app-server model providers

* fix(codex): preserve guardian on model control switches

* fix(codex): retain local providers across model switches

* fix(codex): distrust aliased reviewer model refs

* fix(codex): preserve providers after thread rotation

* fix(codex): clear stale providers on qualified model switches

* fix(codex): prefer qualified models over legacy providers

* fix(codex): validate reviewer trust before auto approvals

* fix(codex): recompute reviewer policy after binding rotation

* fix(codex): normalize reviewer aliases before trust checks

* fix(codex): retain bound providers for slashed local models

* fix(codex): normalize provider trust checks for exec review

* fix(codex): ignore stale bindings for explicit providers

* fix(codex): share trusted reviewer endpoint policy

* fix(codex): keep network approvals on plugin path

* fix(codex): route provider-qualified model refs

* fix(codex): reject blank masked OpenAI base overrides

* fix(codex): scope exec reviewer alias trust

* fix(codex): distrust exec reviewer transport overrides
2026-06-09 21:38:22 -07:00