Commit Graph

863 Commits

Author SHA1 Message Date
Vincent Koc
d5d6576e06 fix(docs): refresh qa lab plugin inventory 2026-06-18 07:57:49 +02:00
Vincent Koc
f8f2006c8b fix(plugin-sdk): refresh api baseline hash 2026-06-18 06:46:44 +02:00
Vincent Koc
af026b383d fix(plugin-sdk): refresh api baseline hash 2026-06-18 05:34:19 +02:00
Vincent Koc
b35b1f2b7c fix(sdk): refresh plugin api baseline 2026-06-17 19:11:18 +02:00
Vincent Koc
1c0b38f960 fix(sdk): refresh plugin surface baselines 2026-06-17 12:25:42 +02:00
Vincent Koc
bed5bf339e fix(sdk): refresh plugin api baseline 2026-06-17 10:00:29 +02:00
Vincent Koc
922aea7d28 fix(sdk): refresh plugin api baseline 2026-06-17 05:47:07 +02:00
Vincent Koc
93216e1ca1 fix(sdk): refresh plugin api baseline hash 2026-06-17 03:07:56 +02:00
Vincent Koc
cfb27e6437 fix(ci): align plugin SDK surface budget 2026-06-17 07:28:26 +08:00
Shakker
920e6a8eec chore: set version 2026.6.9 2026-06-16 19:54:07 +01:00
Vincent Koc
617f97d4b9 fix(plugin-sdk): refresh API baseline hash 2026-06-16 18:34:45 +02:00
Vincent Koc
fa33f5bbb8 fix(plugin-sdk): refresh API baseline hash 2026-06-16 12:32:39 +02:00
Vincent Koc
983e0f2ba0 docs: refresh generated API baselines 2026-06-16 07:26:19 +02:00
Vincent Koc
1c2363def6 fix(plugin-sdk): refresh QA self-check API baseline 2026-06-16 02:56:41 +02:00
Vincent Koc
767e8280ac fix(cli): harden official plugin recovery (#93325)
* fix(cli): harden official plugin recovery

* fix(config): preserve include write context

* fix(config): reject external include mutations

* fix(config): bind snapshots to config paths

* fix(config): preserve write ownership

* fix(cli): preflight plugin config mutations

* chore(plugin-sdk): refresh api baseline

* test(config): prove install env policy mutations

* fix(cli): preflight plugin updates

* fix(cli): preflight non-npm id migrations

* chore(plugin-sdk): refresh api baseline

* fix(cli): satisfy plugin recovery checks
2026-06-15 23:07:29 +08:00
sandieman2
c67dc59b02 fix(reply): deliver final reply when queued follow-up claims session; scope dedupe to routed thread (#90943)
* fix(reply): deliver final reply when queued follow-up claims session; scope dedupe to routed thread

Two core bugs caused composed replies to be silently dropped (no delivery,
no error) when a second message arrived in the same thread mid-run:

1. dispatch-from-config: ensureDispatchReplyOperation only kept the
   dispatch-owned operation authoritative while it had no result. Once
   runReplyAgent completed the operation to drain queued follow-ups, a
   second same-thread inbound could claim the session and the first final
   reply would try to re-acquire the lane instead of finishing delivery,
   deadlocking behind the queued work. Keep the dispatch-owned operation
   authoritative through final delivery.

2. reply-payloads-dedupe: messaging-tool reply dedupe compared only the
   channel target, not the routed thread, so a send in one thread could
   suppress a later reply in a different thread. Thread the routed thread
   id through buildReplyPayloads + follow-up delivery and only fall back to
   channel-only matching for providers without a thread-aware suppression
   matcher when neither side carries thread evidence.

Adds regression tests; existing Telegram topic-suppression behavior is
preserved by gating the thread guard to providers lacking a plugin matcher.

* fix(reply): preserve threaded message delivery evidence

* fix(reply): dedupe final payloads by delivery route

* fix(slack): preserve native send thread evidence

* fix(reply): preserve explicit reply thread evidence

* fix(reply): align explicit reply route dedupe

* fix(reply): preserve delivery lane through final dispatch

* fix(mattermost): preserve threaded tool send routes

* chore(plugin-sdk): refresh API baseline

* fix(reply): align final delivery route dedupe

* fix(reply): gate followups on final delivery

* fix(reply): keep send receipts private

* fix(reply): infer implicit message provider

* fix(reply): align routed threading policy

* fix(reply): preserve queued delivery context

* fix(reply): hydrate queued system event routes

* fix(reply): hydrate queued execution routes

* fix(reply): scope final delivery barriers

* fix(slack): preserve DM target aliases

* fix(reply): mirror resolved source thread routes

* fix(mattermost): retain delayed delivery barrier

* fix(codex): separate message routing from tool policy

* fix(reply): consume normalized Slack DM targets once

* fix(slack): remove stale target alias

* style(reply): satisfy changed lint gates

* fix(mattermost): preserve explicit reply targets

* test: align Slack reply branch checks

* fix(reply): persist overflow summaries to admitted session

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-06-14 09:11:05 -07:00
Ayaan Zaidi
d498b1cce4 fix(plugin-sdk): expose delivery hints without utility imports 2026-06-14 18:18:20 +05:30
David
23d74dad12 fix(lmstudio): honor thinking off for binary reasoning models (#92002)
Scope disabled-thinking payload repair to LM Studio's lightweight provider stream hook. Preserve official OpenAI and Anthropic tool-calling paths.

Co-authored-by: David <32288+nxmxbbd@users.noreply.github.com>
2026-06-14 05:41:49 -07:00
brokemac79
d1299658ac fix(active-memory): preserve verbose recall summaries (#90739)
* fix(active-memory): preserve verbose recall summaries

* fix(active-memory): require recall evidence for recovery

* fix(active-memory): recognize capped recall results

* fix(active-memory): preserve grounded recall state

* refactor(active-memory): limit recovery to completed recalls

* fix(active-memory): ground terminal recall recovery

* fix(active-memory): limit unavailable recovery to completed replies

* fix(active-memory): harden recall evidence recovery

* fix(active-memory): preserve timeout recovery contract

* fix(active-memory): preserve capped failure evidence

* fix(active-memory): reject content-only recall failures

* fix(active-memory): ground completed recall summaries

* fix(active-memory): separate hook and recall timeouts

* fix(active-memory): classify custom tool failures

* fix(active-memory): preserve harness tool evidence

* fix(active-memory): reject explicit empty results

* fix(active-memory): wait for fallback recall evidence

* fix(codex): report dynamic tool results

* fix(active-memory): separate preflight recall deadline

* fix(active-memory): normalize recall tool names

* fix(agents): classify unavailable approvals

* docs(active-memory): clarify hook timeout phases

* test(active-memory): stabilize timeout abort proof

* fix(agents): preserve successful cancellation outcomes

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-06-13 23:38:58 -07:00
Jason (Json)
965fa05df3 feat: add tool search directory mode
Add an experimental directory mode that keeps large authorized tool schemas deferred while exposing bounded discovery, exact deferred hydration, and normal OpenClaw policy/hook execution. Client tools remain directly visible; ambiguous hidden names fail closed.
2026-06-13 20:08:39 -07:00
Josh Avant
689ebc815b feat: support /btw in CLI-backed sessions (#92669)
* feat: support CLI btw side questions

* test: fix CLI prepare test fixture types

* fix: lazy load local btw runner
2026-06-13 19:36:53 +02:00
Dallin Romney
4809ac70fa Add QA evidence artifact output (#91484)
* feat: add qa evidence summary normalization

* chore: rename qa evidence target environment

* chore: align qa evidence profile terminology

* chore: align qa evidence summary fields

* chore: add qa evidence taxonomy ref

* test: remove stale multipass evidence example

* test(qa): normalize vitest and playwright evidence

* test(qa): slim evidence summary metadata

* test(qa): clarify evidence summary inputs

* test(qa): rename scenario specs in evidence flow

* test(qa): treat evidence profiles as mapping strings

* test(qa): use neutral evidence test identity

* test(qa): nest evidence summary joins

* refactor(qa): normalize live evidence summaries

* fix(qa): accept normalized telegram rtt summaries

* fix(qa): normalize evidence lane summaries

* fix(qa): align evidence summaries with requirements

* refactor(qa): tighten evidence summary builders

* refactor(qa): restore standard evidence ids

* fix(qa): keep legacy summaries out of rtt evidence

* refactor(qa): make package evidence provenance explicit

* test(qa): keep script tests out of qa lab internals

* refactor(qa): rename scenario evidence definitions

* refactor(qa): clean evidence summary wording

* test(qa): fix evidence summary test inputs

* refactor(qa): simplify evidence identity fields

* refactor(qa): tighten evidence summary inputs

* refactor(qa): rename evidence artifact
2026-06-12 16:12:58 -07:00
Josh Avant
9921825e17 Fix Telegram spooled buffered replay (#92281)
* fix telegram spooled buffered replay

* fix telegram replay type checks

* fix telegram replay lint

* test telegram replay visible output retry guard

* fix telegram rollback failure retry
2026-06-12 11:51:46 -05:00
Peter Steinberger
0e7b5c3429 feat(anthropic): support Claude Fable 5 adaptive thinking (#91882)
* feat(anthropic): support Claude Fable 5

* test(anthropic): tighten Fable stream fixtures

* fix(anthropic): preserve Vertex input types

* test(anthropic): use provider-ready Vertex effort

* fix(anthropic): support Fable deployment aliases

* fix(anthropic): discard incomplete Fable output

* feat(anthropic): support Fable on Bedrock

* fix(anthropic): preserve Fable reasoning contracts

* refactor(anthropic): unify canonical Claude model policy

* fix(anthropic): satisfy extension thinking types

* test(anthropic): complete canonical alias fixture

* fix(bedrock): scope thinking case declarations
2026-06-10 08:08:35 -07:00
openclaw-clownfish[bot]
54c400a975 fix(plugin-sdk): refresh API baseline hash 2026-06-10 14:12:38 +09:00
Agustin Rivera
f0d8048aa3 fix(search): enforce native web search tool policy (#91750)
* fix(search): enforce native web search tool policy

* fix(search): apply session policy to native web search

* fix(search): gate direct OpenAI native search

* fix(search): redact native web search provider context
2026-06-09 16:25:15 -07:00
Alex Knight
bf95883812 feat(diagnostics-otel): capture tool input/output content via trusted channel (#91256)
diagnostics.otel.captureContent.{toolInputs,toolOutputs} were documented
and config-wired but never produced any span content. Emit tool args and
results over the trusted private-data diagnostic channel (mirroring the
model-content path), and have the OTel exporter bound/redact/truncate them
before span export. Raw tool content never rides the public event bus.

Scope: core embedded-runner tool path (canonical producer). Codex
(async-batched) and Claude CLI remain follow-ups tracked by the issue.

Refs #77391
2026-06-10 05:52:52 +10:00
openclaw-clownfish[bot]
e949809f6e chore(plugin-sdk): refresh API baseline hash (#91661)
Co-authored-by: openclaw-clownfish[bot] <280122609+openclaw-clownfish[bot]@users.noreply.github.com>
2026-06-09 18:20:35 +09:00
Ayaan Zaidi
2858c629bd build(plugin-sdk): refresh api baseline for cli commentary bridge 2026-06-08 18:06:18 +05:30
Omar Shahine
fc6400ede3 fix(imessage): always-on inbound recovery and dedupe (#91335)
* feat(imessage): always-on inbound recovery, deprecate catchup

Replaces the opt-in catchup subsystem with always-on inbound replay
protection that brings iMessage in line with the other channels, and
fixes #89237 (stale backlog dispatched as fresh after bridge recovery).

- New inbound-dedupe.ts: persistent claimable GUID dedupe (claim/commit/
  release) plus a stale-backlog age fence that suppresses live rows whose
  send date is materially older than arrival (logged, never silent).
- monitor-provider: claim at ingestion, carry the exact claimed key on the
  debouncer entry, commit on successful flush / release on dispatch failure
  (per-unit so a coalesced bucket cannot strand a sibling claim). Keeps the
  local startup since_rowid watermark so startup-window rows are not skipped.
- Deprecate catchup: delete catchup.ts + catchup-bridge.ts, remove the
  channels.imessage.catchup schema, cursor migration, and config-guard nag.
  Back-compat: strip the retired key before validation; new imessage doctor
  contract reports + removes it on doctor --fix.
- Docs updated for the new recovery model.

Net -947 prod LOC.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(imessage): recover downtime messages via since_rowid replay

Builds downtime recovery on the new inbound dedupe instead of restoring the
old catchup subsystem. On startup the monitor passes the last dispatched rowid
(a persisted per-account cursor) to imsg watch.subscribe as since_rowid, so imsg
replays the messages that landed while the gateway was down, then tails live.
The GUID dedupe drops anything already handled, so no cursor/retry bookkeeping
is needed.

- recovery-cursor.ts: minimal persisted per-account lastDispatchedRowid.
- monitor-provider: since_rowid = cursor (capped to the most recent
  IMESSAGE_RECOVERY_MAX_ROWS); split the age fence on the startup rowid boundary
  so replayed rows (<= boundary) use the wider recovery window and live rows
  (> boundary) keep the tight #89237 fence; advance the cursor on commit.
- Local only: remote SSH cliPath cannot read chat.db, so it tails from the
  current rowid (suppress-and-move-on) as before.

Restores missed-message recovery that the catchup removal dropped, with no
config and a fraction of the old LOC.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(imessage): make recovery cursor advance failure- and suppression-safe

Addresses two cursor-state regressions in the downtime-recovery path:

- Failed replay rows could be skipped forever: a released (failed) row keeps
  its dedupe claim for retry, but a later successful row in the same flush
  advanced the cursor past it, so the next startup's since_rowid skipped it.
  Hold a per-session floor at the lowest released rowid and never advance the
  cursor past it.
- Suppressed live backlog could be re-delivered after a restart: a live row
  suppressed under the tight live fence was not recorded, so after a restart it
  fell under the wider recovery window (its rowid now below the new boundary)
  and was delivered. Commit its dedupe key on suppression so the recovery
  replay treats it as already handled.

Both caught by Codex autoreview. Adds regression tests for the floor and the
suppression record.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(imessage): bound the GUID-less replay key length

Hash the composite fallback key's variable parts (conversation, sender,
created_at, text) so the key is length-bounded regardless of message text.
The persistent dedupe store already hashes keys internally, so this was not a
live overflow, but the bounded key removes the dependency on that and keeps the
fallback fail-open. Flagged by autoreview.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(imessage): recover downtime messages on remote cliPath setups too

The since_rowid replay runs over the imsg RPC client, so driving it from the
persisted recovery cursor (not the local chat.db boundary) makes downtime
recovery work for remote SSH cliPath gateways — the topology the old RPC-based
catchup served and that the rowid-boundary-only version regressed. Local setups
keep the wider, capped recovery window via the chat.db boundary; remote uses the
live age-fence window. Flagged by autoreview.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(imessage): seed recovery cursor from retired catchup cursor on upgrade

A one-time, self-cleaning migration: when the recovery cursor is empty on the
first startup after upgrade, seed it from the retired imessage.catchup-cursors
lastSeenRowid and consume the legacy entry. Without this a user who had catchup
enabled would not replay messages missed across the upgrade restart. Flagged by
autoreview.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(imessage): preserve catchup recovery on upgrade

---------

Co-authored-by: Omar Shahine <10343873+omarshahine@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
2026-06-08 16:54:10 +09:00
Peter Steinberger
538d36eaaa refactor: move session metadata to SQLite (#91322)
* refactor: move session metadata to sqlite

* test: seed session stores with sqlite fixtures

* test: seed remaining session stores with sqlite fixtures

* fix: stabilize sqlite session cache freshness

* test: seed cli transcript metadata in sqlite
2026-06-07 23:17:35 -07:00
Jason (Json)
57e0bdaabe feat: add live provider model catalog helper
Summary:
- Add a shared live provider catalog runtime for SDK-backed providers.
- Route OpenAI, xAI, OpenCode Go, Chutes, DeepInfra, Venice, NVIDIA, and Vercel AI Gateway live model discovery through the shared helper.
- Remove duplicated provider-local live catalog caching and harden auth marker stripping, empty live-result retries, and OpenAI custom-base-url handling.

Verification:
- node scripts/run-vitest.mjs extensions/openai/openai-provider.test.ts src/plugin-sdk/provider-catalog-live-runtime.test.ts src/commands/models/list.source-plan.test.ts extensions/opencode-go/index.test.ts extensions/nvidia/provider-catalog.test.ts
- pnpm plugin-sdk:api:check
- pnpm lint --threads=8
- pnpm run lint:extensions:bundled
- pnpm run test:extensions:package-boundary:compile
- pnpm check:import-cycles
- pnpm exec oxfmt --check extensions/openai/openai-provider.ts extensions/openai/openai-provider.test.ts
- git diff --check origin/main...HEAD
- autoreview clean: no accepted/actionable findings reported
- AWS Crabbox focused remote proof: run_364680d1bff8 / cbx_2456fffafe01
- Earlier same-PR AWS Crabbox live proof: run_1f05ccab368e / cbx_7375c79fcf9b

Known proof gap:
- Final current-code true live-provider smoke was blocked by Crabbox secret hydration, documented in the PR proof comment.
2026-06-07 14:16:00 -07:00
Peter Steinberger
08ae0e6d29 refactor: store Zalo hosted media in plugin state
Move Zalo hosted outbound media metadata and expiry into plugin state, add SDK chunked hosted media storage, and keep CI/type/lint gates green after rebase.
2026-06-06 22:56:48 -07:00
wsyjh8
a1f1895b1b fix(config): allow thinkingLevelMap in persisted model schema
Allow persisted provider model entries to carry strict thinkingLevelMap values so Microsoft Foundry Entra onboarding can save generated reasoning model config. Closes #91011.
2026-06-06 20:44:15 -07:00
Vincent Koc
13078d24ab chore(release): refresh plugin sdk api baseline 2026-06-04 20:50:17 -07:00
Onur Solmaz
0dbf17471b feat(memory): support qmd query rerank toggle
Add memory.qmd.rerank as an opt-out for QMD query reranking when searchMode is query.

When set to false, direct QMD query calls pass --no-rerank and the mcporter unified query tool receives rerank:false. Search and vsearch modes keep their existing behavior.

Refs #61834.
2026-06-05 11:18:57 +08:00
Peter Steinberger
1878ca0820 chore(release): prepare 2026.6.2 beta 2026-06-04 00:06:52 +01:00
Peter Steinberger
e254346bc2 chore(release): prepare 2026.6.3 beta 2026-06-03 23:42:34 +01:00
Vincent Koc
2b31ad2ee5 docs(plugin-sdk): refresh API baseline hash 2026-06-03 14:48:00 -07:00
Vincent Koc
0f05aff312 docs(config): refresh channel config baseline hash 2026-06-03 14:27:38 -07:00
Ayaan Zaidi
a4b09d72b9 refactor(channels): share progress draft compositor 2026-06-03 10:54:19 +05:30
Bek
bce3d5bf92 trace: Correlate channel diagnostics into one trace
Correlates channel receive, agent lifecycle, model attempt diagnostics, and outbound delivery diagnostics into one trace waterfall so channel message runs can be inspected end-to-end.

Maintainer follow-up removed the internal `AgentHarnessV2` adapter surface and kept the harness path canonical through `src/agents/harness/lifecycle.ts`.

Proof:
- PR checks passed on `04e9189c15480d53663d533a04c9883164b4dd54`.
- `node scripts/run-vitest.mjs src/agents/harness/lifecycle.test.ts src/agents/harness/selection.test.ts src/channels/turn/kernel.test.ts`
- `pnpm check:changed` Testbox `tbx_01kt3xtrm70qc7nb90cqv5rah1`

Thanks @bek91.

Co-authored-by: Bek <bek.akhmedov@gmail.com>
2026-06-02 06:38:00 -04:00
NianJiu
5a55135146 fix(memory): retry transient FileProvider-backed reads (#85351) 2026-06-01 12:40:20 -07:00
Vincent Koc
d10d71cdb6 fix(codex): stabilize app-server cleanup tests 2026-06-01 13:15:05 +02:00
Peter Steinberger
1d4c1ba56d fix: harden memory envelope sanitization
Co-authored-by: amittell <mittell@me.com>
2026-06-01 09:30:08 +01:00
Peter Steinberger
6173a4babb docs(plugin-sdk): refresh API baseline 2026-06-01 06:29:51 +01:00
Peter Steinberger
d925249ac0 docs(plugin-sdk): refresh API baseline hash 2026-06-01 06:05:37 +01:00
Peter Steinberger
3b802a7fbc docs(plugin-sdk): refresh API baseline hash 2026-06-01 04:59:39 +01:00
Peter Steinberger
592b6e2916 docs(config): refresh config baseline hash 2026-06-01 04:20:57 +01:00
Peter Steinberger
f879e3d6a0 docs(plugin-sdk): refresh API baseline hash 2026-06-01 04:01:25 +01:00