openclaw

mirror of https://github.com/openclaw/openclaw.git synced 2026-04-12 09:41:11 +00:00

Eva 108e5c89de qa-lab: scope parity metrics and harden fake-success detector

- scope computeQaAgenticParityMetrics to QA_AGENTIC_PARITY_SCENARIO_TITLES
  in buildQaAgenticParityComparison so extra non-parity lanes in a full
  qa-suite-summary.json cannot influence completion / unintended-stop /
  valid-tool / fake-success rates
- filter coverageMismatch by !parityTitleSet.has(name) so each required
  parity scenario fails the gate exactly once (from requiredScenarioCoverage)
  instead of being double-reported as a coverage mismatch too
- drop the bare /\berror\b/i rule from SUSPICIOUS_PASS_PATTERNS, which was
  false-flagging legitimate passes that narrate "Error budget: 0" or
  "no errors found", and replace it with targeted /error occurred/i and
  /an error was/i phrases that indicate a real mid-turn error
- add regressions: error-budget/no-errors-observed passes yield
  fakeSuccessCount === 0, genuine error-occurred narration still flags,
  each missing required scenario fires exactly one failure line, and
  non-parity lanes do not perturb scoped metrics
- isolate the baseline suspicious-pass test by padding it to the full
  first-wave scenario set so it asserts the isolated fake-success path
  via toEqual([...]) rather than toContain
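
A minimal sketch of the scoping and detector changes described above, for
illustration only. The ScenarioResult shape, the placeholder titles, and the
helpers scopeToParity and countFakeSuccesses are assumptions made for this
example and are not taken from the repository; only the three regexes come
from the commit message.

```typescript
// Hypothetical result shape; the real qa-suite-summary.json schema may differ.
type ScenarioResult = { name: string; status: "pass" | "fail"; details?: string };

// Placeholder titles standing in for QA_AGENTIC_PARITY_SCENARIO_TITLES.
const PARITY_TITLES: readonly string[] = ["Scenario A", "Scenario B"];

// The rule that was dropped, and the targeted phrases that replace it.
const OLD_PATTERNS: readonly RegExp[] = [/\berror\b/i];
const NEW_PATTERNS: readonly RegExp[] = [/error occurred/i, /an error was/i];

// Keep only the scenarios whose name is in the parity title set, so extra
// lanes in a full summary cannot influence the computed rates.
function scopeToParity(
  scenarios: readonly ScenarioResult[],
  titles: readonly string[],
): ScenarioResult[] {
  const titleSet = new Set(titles);
  return scenarios.filter((s) => titleSet.has(s.name));
}

// Count passing scenarios whose narration matches any suspicious pattern.
function countFakeSuccesses(
  scenarios: readonly ScenarioResult[],
  patterns: readonly RegExp[],
): number {
  return scenarios.filter(
    (s) => s.status === "pass" && patterns.some((p) => p.test(s.details ?? "")),
  ).length;
}

const summary: ScenarioResult[] = [
  { name: "Scenario A", status: "pass", details: "Error budget: 0" },
  { name: "Scenario B", status: "pass", details: "An error occurred while calling the tool" },
  { name: "Extra lane", status: "pass", details: "an error was returned mid-turn" },
];

const scoped = scopeToParity(summary, PARITY_TITLES);
const oldCount = countFakeSuccesses(scoped, OLD_PATTERNS); // flags "Error budget: 0" too
const newCount = countFakeSuccesses(scoped, NEW_PATTERNS); // flags only the real error
const unscopedCount = countFakeSuccesses(summary, NEW_PATTERNS); // also counts "Extra lane"
```

In this sketch the scoped count under the new patterns ignores both the
"Error budget: 0" narration and the non-parity lane, while the unscoped count
would still pick up the extra lane.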

2026-04-11 14:22:48 +01:00