Vincent Koc
c6aa8e7423
refactor(qa): remove unused parity helpers
2026-06-18 14:55:46 +08:00
Vincent Koc
d5d6576e06
fix(docs): refresh qa lab plugin inventory
2026-06-18 07:57:49 +02:00
Dallin Romney
e17d111990
fix: require all taxonomy coverage ids (#94296)
2026-06-17 16:38:14 -07:00
Colin Johnson
591313e80a
qa-lab: support script-backed evidence scenarios (#94276)
...
* qa: add script scenario execution kind
* fix(qa-lab): carry suite profile into script producer evidence and simplify artifact path resolution
* fix(qa-lab): keep out-of-repo producer artifacts absolute to avoid ../ traversal refs
---------
Co-authored-by: Dallin Romney <dallinromney@gmail.com>
2026-06-17 15:09:25 -07:00
Vincent Koc
8288b4d4c9
fix(qa-lab): stabilize web search parity fixture
2026-06-18 00:07:36 +02:00
Dallin Romney
f7e5132ffd
test: fold gateway smoke into qa e2e (#93178)
2026-06-17 14:55:28 -07:00
Dallin Romney
fae4a01d0d
test: fold otel smoke into qa e2e (#93181)
...
* test: fold otel smoke into qa e2e
* test: eliminate otel smoke script
2026-06-17 14:54:58 -07:00
Dallin Romney
0a6736af09
test: fold lifecycle and package proof into QA Lab (#93114)
...
* test: fold script coverage into qa scenarios
* test: migrate script checks into qa e2e
* test: point qa code refs at migrated e2e
* test: fold plugin lifecycle probe into qa e2e
* test: use shared temp dirs in plugin lifecycle probe
* test: fold plugin lifecycle sweep into qa lab
* test: trim lifecycle docker text assertions
* test: keep followup script conversions split
* test: make lifecycle docker runner script-safe
* test: update changed helper routing expectation
2026-06-17 14:22:04 -07:00
Vincent Koc
5061a7a741
fix(qa-lab): kill qa cli process groups
2026-06-17 21:59:53 +02:00
Vincent Koc
aa3ed8f7ac
test(qa-lab): use temp dir harness in model catalog test
2026-06-17 21:57:00 +02:00
Vincent Koc
d988851fe0
fix(qa-lab): wait for model catalog process groups
2026-06-17 21:42:00 +02:00
Vincent Koc
aa498cfe11
fix(qa-lab): wait for gateway child process groups
2026-06-17 18:51:49 +02:00
Vincent Koc
1ee2733b2f
fix(qa-lab): harden live cleanup and readiness
2026-06-17 18:40:21 +02:00
Vincent Koc
2195b446d4
test(qa): align Docker inspect expectations
2026-06-17 15:03:49 +02:00
Vincent Koc
c05acc7a14
fix(testing): bind QA docker port probes to loopback
2026-06-17 14:18:03 +02:00
Vincent Koc
e2b6753b87
fix(qa-lab): bound credential payload reads
2026-06-17 11:59:55 +02:00
Vincent Koc
6774e7f259
chore(release): sync main to 2026.6.8
2026-06-17 07:25:30 +08:00
Shakker
920e6a8eec
chore: set version 2026.6.9
2026-06-16 19:54:07 +01:00
Vincent Koc
0c657190ec
fix(qa): fail runtime parity on cell failures
2026-06-16 05:53:19 +02:00
Vincent Koc
ffb67d2d2e
fix(qa): suppress empty WhatsApp debug artifacts
...
Suppress empty WhatsApp gateway-debug artifact publication and keep the public QA run view redacted and consistent across report/evidence output.
Verification:
- Testbox focused WhatsApp QA runtime format/lint/test run passed: https://github.com/openclaw/openclaw/actions/runs/27589031659
- Testbox changed gate passed: https://github.com/openclaw/openclaw/actions/runs/27589128132
- PR CI passed on final head: https://github.com/openclaw/openclaw/actions/runs/27589903708
- git diff --check passed locally
2026-06-16 10:32:53 +08:00
Vincent Koc
3c65127827
fix(qa): preserve WhatsApp live failure diagnostics
2026-06-16 03:42:26 +02:00
Dallin Romney
450060d7a2
test(qa): expand smoke-ci and release categories and coverage (#93175)
...
* test(qa): add smoke ci primary coverage evidence
* test(qa): remove overstated primary coverage claims
* test(qa): make release profile include smoke ci
* test(qa): trim taxonomy formatting churn
* test(qa): avoid hardcoded profile names in coverage test
* test(qa): make release profile cover taxonomy
* test(qa): type profile fixture all category flag
* test(qa): include channel delivery in smoke ci profile
2026-06-15 18:05:52 -07:00
Vincent Koc
c32ba171db
fix(qa): fail unsuccessful self-checks
2026-06-16 01:32:47 +02:00
Dallin Romney
e32929e12c
Add slim evidence mode for QA profile evidence (#93179)
...
* test(qa): compact profile evidence execution metadata
* docs(qa): document compact profile evidence
* test(qa): support compact evidence mode
* test(qa): rename compact evidence mode to slim
* docs(qa): trim slim evidence wording
* fix(qa): avoid commander runtime import
2026-06-15 14:50:40 -07:00
Dallin Romney
1f8c4d3958
simplify QA evidence profile and mappings/coverage shape (#93153)
...
* test(qa): simplify evidence coverage shape
* test(qa): collapse evidence scorecard metadata
* test(qa): document evidence schema version
2026-06-14 22:26:58 -07:00
Dallin Romney
3d38c9a633
test(qa): embed profile scorecard evidence (#93109)
...
* test(qa): embed profile scorecard evidence
* test(qa): fix profile runner return lint
* test(qa): satisfy suite command lint return
2026-06-14 20:51:38 -07:00
Dallin Romney
e8db9c3bc0
test(qa): add qa run --profile and unified output summary/evidence (#91587)
...
* test(qa): add mapped qa run profiles
* test(qa): document mapped profile runner
* test(qa): validate run profiles from mapping
* test(qa): preserve root profile parsing
* test(qa): simplify taxonomy profile dispatch
* test(qa): align tool coverage CLI expectation
* test(qa): fix profile dispatch fixture type
* test(qa): share profile runner option types
* test(qa): split shared cli runner options
* test(qa): unify profile suite artifacts
* fix(qa): filter profile scenarios by provider lane
* test(qa): drop native scenario subreports
* fix(qa): keep native log refs repo-relative
* fix(cli): preserve qa run root profile parsing
* fix(qa): avoid qa profile flag collision
* fix(qa): reject profile flags without qa profile
2026-06-14 18:08:42 -07:00
Dallin Romney
fef8394079
Convert QA scenarios to YAML files (#92915)
...
* refactor: load QA scenarios from YAML
* docs: update personal QA scenario docs
* test: keep QA scenarios YAML-only
2026-06-14 17:31:18 -07:00
NVIDIAN
ecaebfc51b
fix(agents): retry thinking-only errored turns (#92191)
...
Retry replay-safe reasoning-only provider errors before assistant failover while preserving classified fallback and terminal-output ownership. Adds deterministic Anthropic gateway fault-injection coverage and focused regression tests.
Co-authored-by: ai-hpc <mail.speedy.hpc@hotmail.com>
2026-06-14 09:39:27 -07:00
Dallin Romney
1affe4fcdf
Fold Telegram RTT sampling into live QA evidence (#92550)
...
* refactor(qa): fold telegram rtt into live evidence
* test: default package telegram rtt samples
* refactor(qa-lab): fold telegram rtt into live evidence
* fix(qa-lab): keep package telegram rtt optional for focused runs
* fix(qa-lab): avoid stale rtt evidence on failed samples
* fix(qa-lab): pass telegram live env into credential leasing
* fix(qa-lab): update telegram canary remediation artifacts
* docs(qa): remove stale telegram observed artifact guidance
* fix(qa-lab): clarify telegram empty-reply remediation
* fix(qa-lab): honor telegram rtt timeout
* ci(qa): drop stale telegram capture env
* refactor: align telegram evidence coverage fields
* fix: ignore stale telegram observed artifacts
* fix: preserve telegram rtt coverage mapping
* fix: omit unused telegram rtt catch binding
* docs: document telegram rtt check selector
2026-06-14 17:02:33 +08:00
Vincent Koc
b1caba5906
test(qa): align tool coverage CLI expectation
2026-06-14 16:11:00 +08:00
Dallin Romney
a3e9dfee0e
Simplify QA scorecard mapping shape (#92558)
...
* test(qa): simplify scorecard mapping shape
* test(qa): use typed scorecard evidence refs
* test(qa): map scorecard categories by coverage id
* feat: align qa coverage with taxonomy features
* refactor: keep qa coverage ids canonical
* refactor: minimize qa coverage id churn
* test: align qa coverage id assertions
* test: update qa evidence coverage expectations
* refactor qa taxonomy coverage ids
* style qa taxonomy coverage ids
* test qa coverage lint fix
* test qa coverage type fix
2026-06-14 00:16:33 -07:00
Vincent Koc
47759c3506
fix(qa): accept rich Telegram canary presence
...
(cherry picked from commit e86eb7567a)
2026-06-14 07:20:16 +08:00
Vincent Koc
924f4c1964
fix(qa): read rich Telegram replies in live checks
...
(cherry picked from commit 9c8b880353)
2026-06-14 06:35:59 +08:00
Vincent Koc
4208c89ec4
test(qa-lab): align bootstrap selection assertion
2026-06-13 18:11:46 +08:00
Dallin Romney
561b293c7a
Run Vitest and Playwright scenarios from qa suite (#92606)
...
* test(qa): run vitest and playwright scenarios from qa suite
* fix(qa): harden scenario suite dispatch
* refactor(qa): share scenario path utilities
* refactor(qa): share test file scenario runner
* refactor(qa): route test file scenarios through suite runtime
* refactor(qa): use explicit suite runtime result kind
* test(qa): write suite evidence artifact
* refactor(qa): clarify suite execution dispatch
* fix(qa): keep test-file scenarios out of flow-only runners
* refactor(qa): export mixed scenario suite runner
2026-06-13 01:06:10 -07:00
Dallin Romney
d8b3e523ff
Add QA scorecard taxonomy validation (#91500)
...
Merged via squash.
Prepared head SHA: a9aec907d4
Co-authored-by: RomneyDa <6581799+RomneyDa@users.noreply.github.com>
Reviewed-by: @RomneyDa
2026-06-12 17:07:51 -07:00
Dallin Romney
4809ac70fa
Add QA evidence artifact output (#91484)
...
* feat: add qa evidence summary normalization
* chore: rename qa evidence target environment
* chore: align qa evidence profile terminology
* chore: align qa evidence summary fields
* chore: add qa evidence taxonomy ref
* test: remove stale multipass evidence example
* test(qa): normalize vitest and playwright evidence
* test(qa): slim evidence summary metadata
* test(qa): clarify evidence summary inputs
* test(qa): rename scenario specs in evidence flow
* test(qa): treat evidence profiles as mapping strings
* test(qa): use neutral evidence test identity
* test(qa): nest evidence summary joins
* refactor(qa): normalize live evidence summaries
* fix(qa): accept normalized telegram rtt summaries
* fix(qa): normalize evidence lane summaries
* fix(qa): align evidence summaries with requirements
* refactor(qa): tighten evidence summary builders
* refactor(qa): restore standard evidence ids
* fix(qa): keep legacy summaries out of rtt evidence
* refactor(qa): make package evidence provenance explicit
* test(qa): keep script tests out of qa lab internals
* refactor(qa): rename scenario evidence definitions
* refactor(qa): clean evidence summary wording
* test(qa): fix evidence summary test inputs
* refactor(qa): simplify evidence identity fields
* refactor(qa): tighten evidence summary inputs
* refactor(qa): rename evidence artifact
2026-06-12 16:12:58 -07:00
Jesse Merhi
6223a538bc
fix(docker): bundle QA Lab runtime in the image (#92087)
...
* fix(docker): split qa lab runtime fixes
* fix(docker): remove store platform selector
* test(docker): assert qa lab ui copy is gated
2026-06-12 14:02:32 +10:00
Vincent Koc
8042ec4cb8
fix(qa): scope runtime parity mock requests
2026-06-11 05:19:10 +09:00
Vincent Koc
7d3e8dc963
test(qa): restore memory fallback config safely
2026-06-10 18:03:15 +09:00
Vincent Koc
7f1d82ab25
revert(sessions): defer session metadata sqlite
...
Reverts 538d36eaaa while preserving subsequent main changes. The beta-only SQLite downgrade rescue and reverse migration remain excluded.
2026-06-10 16:34:06 +09:00
brokemac79
de4b8d8ebf
feat(plugins): allow installed trusted policy contracts
...
Allow explicitly enabled installed plugins to register declared trusted tool policies and agent tool result middleware, with trusted policy ids scoped by plugin owner.
Verification covered targeted plugin/agent tests, typecheck, build, lint, local autoreview, and a Blacksmith Testbox runtime proof (tbx_01ktr1nq0rhq47fjkwrepm7fd3).
2026-06-10 16:18:23 +10:00
Josh Avant
9f48254f09
Fix config.patch explicit array replacement (#91551)
...
* fix config patch explicit array replacement
* fix generated config patch protocol model
* fix config patch test helper typing
* fix shared auth patch replacement tests
* update config patch prompt snapshots
* harden qa lab config patch replace paths
2026-06-08 21:48:46 -05:00
Vincent Koc
50130d32a9
test(release): align qa tool coverage gate
2026-06-09 01:02:24 +02:00
Vincent Koc
c7b01cf201
test(release): stabilize qa runtime parity gate
2026-06-09 01:02:24 +02:00
Vincent Koc
1019b591d5
test(release): stabilize qa gateway restart readiness
2026-06-09 01:02:24 +02:00
Vincent Koc
505b23a137
fix(release): clear beta validation blockers
2026-06-09 01:02:22 +02:00
Peter Steinberger
538d36eaaa
refactor: move session metadata to SQLite (#91322)
...
* refactor: move session metadata to sqlite
* test: seed session stores with sqlite fixtures
* test: seed remaining session stores with sqlite fixtures
* fix: stabilize sqlite session cache freshness
* test: seed cli transcript metadata in sqlite
2026-06-07 23:17:35 -07:00
Marcus Castro
181238fb53
feat(whatsapp): expand live QA coverage (#90480)
...
* feat(whatsapp): expand qa driver message support
* feat(qa-lab): add deterministic whatsapp mock replies
* feat(qa-lab): expand whatsapp live qa scenarios
* docs(qa): document whatsapp live qa coverage
2026-06-08 00:03:23 -03:00