Commit Graph

113 Commits

Author SHA1 Message Date
Altay
c6d0baf562 qa-lab: use OpenClaw tmp dir for multipass staging 2026-04-10 00:09:48 +01:00
Shakker
b88387e4c1 fix: harden qa multipass runner 2026-04-09 23:53:13 +01:00
Shakker
445fe55331 fix: validate multipass output paths 2026-04-09 23:53:13 +01:00
Shakker
def2eadb1d feat: add multipass runner to qa suite 2026-04-09 23:53:13 +01:00
Altay
8cf02e7c47 fix(ci): clear check-additional follow-up regressions (#63934)
* fix(ci): route messaging temp files through openclaw tmp dir

* fix(ci): clear qa-lab follow-up guardrails

* fix(ci): own-check ACP fallback resolvers

* fix(ci): preserve memory-core write error causes

* fix(ci): narrow qa-channel boundary alias

* fix(test): type memory-core dreaming api stubs
2026-04-09 23:47:59 +01:00
Josh Lehman
bd639bbde8 fix: resolve qa-lab type-aware linting (#63928)
Regeneration-Prompt: |
  Fix the unrelated qa-lab failures that started surfacing once bundled extension linting covered the QA channel types. Keep the change minimal and additive. Preserve the existing plugin-sdk import surface for qa-lab, but make sure the generated qa-channel plugin-sdk declarations can be resolved from bundled extension package-boundary tsconfig paths. Also replace the over-broad QaBusEventSeed union in qa-lab bus state with an explicit discriminated union so oxlint no longer treats the event variants as duplicate constituents. Verify with the qa-lab package typecheck, a targeted type-aware oxlint run for the affected files, full pnpm check, and the focused qa-lab bus-state test.
2026-04-09 14:33:33 -07:00
Peter Steinberger
719f06510c chore: bump version to 2026.4.10 2026-04-09 03:56:22 +01:00
Peter Steinberger
0c278bb93c refactor: break runtime import cycles 2026-04-09 03:56:22 +01:00
Peter Steinberger
be46d0ddc6 test: update character eval public panel 2026-04-09 01:25:59 +01:00
Peter Steinberger
39cc6b7dc7 fix: stabilize character eval and Qwen model routing 2026-04-09 01:04:09 +01:00
Shakker
d66e2d5b33 test: cover curated qa missing-key reply classification 2026-04-08 21:55:39 +01:00
Shakker
c63d25bd9b fix: classify curated qa missing-key replies 2026-04-08 21:55:39 +01:00
Shakker
9cfa152962 test: cover mixed-traffic qa wait cursors 2026-04-08 21:55:39 +01:00
Shakker
204d766b27 fix: align qa wait cursor semantics 2026-04-08 21:55:39 +01:00
Shakker
a6d76df4f0 test: cover qa scenario wait failure replies 2026-04-08 21:55:39 +01:00
Shakker
b3f3cfd598 fix: fail fast across qa scenario wait paths 2026-04-08 21:55:39 +01:00
Shakker
491e216c45 fix: fail fast on qa live auth errors 2026-04-08 21:55:39 +01:00
Peter Steinberger
21ef1bf8de feat: parallelize character eval runs 2026-04-08 20:05:55 +01:00
Peter Steinberger
f1e75d3259 fix: load QA live provider overrides 2026-04-08 20:05:55 +01:00
Peter Steinberger
a3d21539ef test: stabilize full-suite execution 2026-04-08 19:40:57 +01:00
Peter Steinberger
4a51a1031d feat: add character eval model options 2026-04-08 17:05:30 +01:00
Peter Steinberger
4bbf78e566 test: make character eval scenario natural 2026-04-08 17:05:30 +01:00
Peter Steinberger
3101d81053 feat: add QA character eval reports 2026-04-08 15:52:49 +01:00
Peter Steinberger
aa3b1357cb fix: support Codex CLI QA auth 2026-04-08 15:52:01 +01:00
Peter Steinberger
97dfbe0fe1 feat: add qa character vibes eval 2026-04-08 12:05:24 +01:00
Peter Steinberger
9bf3482470 refactor: finish markdown-only qa runner 2026-04-08 11:56:02 +01:00
Peter Steinberger
95e397a266 refactor: dedupe repeated test helpers 2026-04-08 09:58:22 +01:00
Peter Steinberger
492e98a88a refactor: move qa suite logic into scenario markdown 2026-04-08 09:13:57 +01:00
Vincent Koc
be530f085d refactor(plugin-sdk): share tool payload extraction 2026-04-08 09:07:28 +01:00
Vincent Koc
4260ac4cf6 perf(plugins): narrow boundary compile sdk imports 2026-04-08 08:52:51 +01:00
Peter Steinberger
21d9bac5ec fix: stabilize live qa scenario suite 2026-04-08 08:17:59 +01:00
Peter Steinberger
8cbd60d203 chore: prepare 2026.4.9 release 2026-04-08 08:02:53 +01:00
Peter Steinberger
b73d8ef7d7 refactor: split qa scenarios into per-file markdown defs 2026-04-08 05:37:17 +01:00
Peter Steinberger
4f8471617a chore: prepare 2026.4.8 2026-04-08 04:21:51 +01:00
Peter Steinberger
0e91c25c0b chore: prepare 2026.4.7 2026-04-08 02:14:59 +01:00
Peter Steinberger
c0aed59fca refactor: move qa suite definitions into markdown 2026-04-07 23:39:50 +01:00
Peter Steinberger
e0ad3e79e6 refactor: dedupe normalization lowercase helpers 2026-04-07 22:57:52 +01:00
Peter Steinberger
83bdba2bae fix: resolve rebase regressions for ci landing 2026-04-07 21:02:04 +01:00
Peter Steinberger
5b090561fb refactor: dedupe browser whatsapp qa lowercase helpers 2026-04-07 20:58:01 +01:00
Peter Steinberger
a00b01f5ed fix: harden complex qa suite scenarios 2026-04-07 20:35:39 +01:00
Peter Steinberger
b5d2bd6f41 fix(qa): tighten frontier scope evals 2026-04-07 20:32:42 +01:00
Peter Steinberger
4e69a9b329 fix(qa): restore safe no-fork gateway runtime 2026-04-07 20:32:42 +01:00
Vincent Koc
cde12e63e7 perf(qa): lazy-load runner catalog for lab ui 2026-04-07 20:32:42 +01:00
Vincent Koc
f312d6c106 fix(qa): preserve gateway cli auth in no-fork rpc path 2026-04-07 20:32:42 +01:00
Vincent Koc
e7538b4499 perf(qa): drop per-rpc gateway cli forks 2026-04-07 20:32:42 +01:00
Vincent Koc
02bd9e8c10 perf(qa): trim frontier direct-agent waits 2026-04-07 20:32:42 +01:00
Vincent Koc
35eb70f1f5 test(qa): retry flaky local fetches in lab server tests 2026-04-07 20:32:42 +01:00
Vincent Koc
986536ff6b fix(qa): keep direct self-check outputs under repo root 2026-04-07 20:32:42 +01:00
Vincent Koc
f6544a0a3b fix(qa): anchor runner artifacts to repo root 2026-04-07 20:32:42 +01:00
Vincent Koc
daeff2fa89 fix(qa): default docker artifacts from repo root 2026-04-07 20:32:42 +01:00