Eva | 55df6f11a4 | fix: harden parity gate review findings | 2026-04-11 14:22:48 +01:00
Eva | 67fdd3b4df | benchmarks: add agentic parity report gate | 2026-04-11 14:22:48 +01:00
Eva | d9c7ddb099 | test: add agentic parity scenario pack | 2026-04-11 14:22:48 +01:00
Gustavo Madeira Santana | 25445a9f2e | qa-lab: add Matrix live transport QA lane (#64489) | 2026-04-10 19:35:08 -04:00
    Merged via squash.
    Prepared head SHA: ae9bb37751
    Co-authored-by: gumadeiras <5599352+gumadeiras@users.noreply.github.com>
    Reviewed-by: @gumadeiras
Ayaan Zaidi | ecb3e0a62d | fix(qa-lab): harden telegram qa artifacts | 2026-04-10 21:53:31 +05:30
Ayaan Zaidi | 2aaf5a3baa | fix(qa-lab): address telegram qa review comments | 2026-04-10 21:53:31 +05:30
Ayaan Zaidi | e093cb6c93 | feat(qa-lab): add telegram live qa lane | 2026-04-10 21:53:31 +05:30
Peter Steinberger | 07e7222e28 | test: split Claude CLI QA auth modes | 2026-04-10 14:56:36 +01:00
Peter Steinberger | 4c14f55c62 | test: parallelize QA suite scenarios | 2026-04-10 13:45:57 +01:00
Vincent Koc | dbe2a97e80 | fix(cycles): remove qa-lab and ui runtime seams | 2026-04-10 11:45:27 +01:00
Shakker | def2eadb1d | feat: add multipass runner to qa suite | 2026-04-09 23:53:13 +01:00
Peter Steinberger | 39cc6b7dc7 | fix: stabilize character eval and Qwen model routing | 2026-04-09 01:04:09 +01:00
Peter Steinberger | 21ef1bf8de | feat: parallelize character eval runs | 2026-04-08 20:05:55 +01:00
Peter Steinberger | 4a51a1031d | feat: add character eval model options | 2026-04-08 17:05:30 +01:00
Peter Steinberger | 3101d81053 | feat: add QA character eval reports | 2026-04-08 15:52:49 +01:00
Vincent Koc | 76bde3d42b | fix(qa): support neutral-cwd docker commands | 2026-04-07 20:32:42 +01:00
Vincent Koc | 816a3eae8a | chore(qa): align qa cli provider input types | 2026-04-07 20:32:42 +01:00
Vincent Koc | 5aa4fd3216 | fix(qa): normalize qa cli lane inputs | 2026-04-07 20:32:42 +01:00
Vincent Koc | 7d18b145f8 | fix(qa): keep manual alternate model aligned | 2026-04-07 20:32:42 +01:00
Vincent Koc | cdf18c16b4 | fix(qa): default manual lanes by provider mode | 2026-04-07 20:32:42 +01:00
Vincent Koc | 9a106f7e3c | fix(qa): support neutral-cwd suite runs | 2026-04-07 20:32:42 +01:00
Vincent Koc | f93b217834 | feat(qa): add manual harness lane | 2026-04-07 20:32:42 +01:00
Vincent Koc | 18fb171179 | feat(qa): add frontier harness bakeoff loop | 2026-04-07 20:32:41 +01:00
Peter Steinberger | f2494aa33f | feat: streamline qa lab live runs | 2026-04-07 10:05:49 +01:00
Peter Steinberger | 54a884865e | feat: add fast qa lab ui refresh mode | 2026-04-07 09:45:11 +01:00
Peter Steinberger | cfebdee073 | refactor: dedupe qa cli shutdown handling | 2026-04-06 22:21:01 +01:00
Peter Steinberger | b4e1747391 | feat: add one-command qa lab docker launcher | 2026-04-06 17:47:17 +01:00
Peter Steinberger | 508024ae3b | feat(qa): add live suite runner and harness | 2026-04-06 01:03:21 +01:00
Peter Steinberger | 4780788bbb | feat(qa): add repo-backed qa suite runner | 2026-04-05 23:21:56 +01:00
Peter Steinberger | 8e1c81e707 | feat(qa): recreate qa lab docker stack | 2026-04-05 23:21:56 +01:00
Peter Steinberger | bb60b53124 | feat: add qa lab extension | 2026-04-05 23:21:56 +01:00