Commit Graph

3 Commits

Author             SHA1        Message                                   Date
Peter Steinberger  68b4b36a90  test: harden qa eval scenarios            2026-04-10 10:11:35 +01:00
Peter Steinberger  b5d2bd6f41  fix(qa): tighten frontier scope evals     2026-04-07 20:32:42 +01:00
Vincent Koc        4f421fa0f1  fix(qa): harden frontier claude bakeoffs  2026-04-07 20:32:42 +01:00