30449 Commits

Author SHA1 Message Date
Marcus Castro
aaae1aeb8f fix(whatsapp): route react through gateway (#64638)
* fix(whatsapp): route react through gateway

* fix(gateway): accept full message action tool context
2026-04-11 11:38:10 -03:00
Peter Steinberger
545490c592 fix: handle codex app-server interrupt shutdown 2026-04-11 15:20:52 +01:00
Peter Steinberger
b489c8f55b test: resolve Parallels npm update Python 2026-04-11 14:57:59 +01:00
Vincent Koc
8a7ad8f0e0 fix(msteams): remove reaction handler type cycle 2026-04-11 14:55:25 +01:00
Vincent Koc
b9a0052dd0 fix(cycles): split embedded runner and setup leaf types 2026-04-11 14:49:48 +01:00
Peter Steinberger
24a5ba732f fix: harden docker smoke packaging 2026-04-11 14:40:01 +01:00
Peter Steinberger
ccfc97c235 test(channel-setup): mock channel metadata source 2026-04-11 14:31:00 +01:00
Peter Steinberger
e1b674cbf1 build: stabilize a2ui bundle hash 2026-04-11 14:29:02 +01:00
Peter Steinberger
0dd4958bc8 test(install): harden docker tgz smoke flow 2026-04-11 14:27:34 +01:00
Peter Steinberger
b7cc064961 fix(update): exclude private QA sidecars from package verify 2026-04-11 14:27:33 +01:00
Peter Steinberger
1f69790bed docs: note GPT-5.4 parity harness landing 2026-04-11 14:22:48 +01:00
Eva
108e5c89de qa-lab: scope parity metrics and harden fake-success detector
- scope computeQaAgenticParityMetrics to QA_AGENTIC_PARITY_SCENARIO_TITLES
  in buildQaAgenticParityComparison so extra non-parity lanes in a full
  qa-suite-summary.json cannot influence completion / unintended-stop /
  valid-tool / fake-success rates
- filter coverageMismatch by !parityTitleSet.has(name) so each required
  parity scenario fails the gate exactly once (from requiredScenarioCoverage)
  instead of being double-reported as a coverage mismatch too
- drop the bare /\berror\b/i rule from SUSPICIOUS_PASS_PATTERNS (it was
  false-flagging legitimate passes that narrate "Error budget: 0" or
  "no errors found") and replace it with targeted /error occurred/i and
  /an error was/i phrases that indicate a real mid-turn error
- add regressions: error-budget/no-errors-observed passes yield
  fakeSuccessCount === 0, genuine error-occurred narration still flags,
  each missing required scenario fires exactly one failure line, and
  non-parity lanes do not perturb scoped metrics
- isolate the baseline suspicious-pass test by padding it to the full
  first-wave scenario set so it asserts the isolated fake-success path
  via toEqual([...]) rather than toContain
2026-04-11 14:22:48 +01:00
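
The pattern change described in commit 108e5c89de can be illustrated with a minimal sketch. This is not the repository's code: the helper names (`flaggedByBareRule`, `flaggedByTargetedRules`) and the sample strings other than the quoted ones are assumptions for illustration, and the real SUSPICIOUS_PASS_PATTERNS list may contain other entries not shown in this log.

```typescript
// Hypothetical reconstruction of the two rule sets named in the commit message.
// Only the three regex literals come from the log; everything else is assumed.
const BARE_ERROR_RULE: RegExp = /\berror\b/i;
const TARGETED_ERROR_RULES: RegExp[] = [/error occurred/i, /an error was/i];

// True when the old, broad rule would mark a passing narration as suspicious.
function flaggedByBareRule(narration: string): boolean {
  return BARE_ERROR_RULE.test(narration);
}

// True when any of the narrower replacement phrases appears in the narration.
function flaggedByTargetedRules(narration: string): boolean {
  return TARGETED_ERROR_RULES.some((rule) => rule.test(narration));
}

// A legitimate pass that merely mentions the word "error".
const benign = "Error budget: 0";
// A narration that reports a real mid-turn failure (sample text is assumed).
const genuine = "An error occurred while calling the tool";

console.log(flaggedByBareRule(benign));       // broad rule matches the word
console.log(flaggedByTargetedRules(benign));  // narrow rules do not
console.log(flaggedByTargetedRules(genuine)); // narrow rules still catch this
```

The regexes carry no `g` flag, so repeated `test` calls do not depend on `lastIndex` state.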
Eva
95f8ad215f Treat skipped parity scenarios as uncovered 2026-04-11 14:22:48 +01:00
Eva
17252df122 Tighten parity proof heuristics 2026-04-11 14:22:48 +01:00
Eva
fd45ea2bf1 test(qa): add compaction retry parity scenario 2026-04-11 14:22:48 +01:00
Eva
3211aa2540 fix(qa): surface missing required scenarios in parity report 2026-04-11 14:22:48 +01:00
Eva
55df6f11a4 fix: harden parity gate review findings 2026-04-11 14:22:48 +01:00
Eva
c73d005c7a docs: clarify parity verdict interpretation 2026-04-11 14:22:48 +01:00
Eva
db09edacfc qa-lab: gate parity on shared scenario coverage 2026-04-11 14:22:48 +01:00
Eva
67fdd3b4df benchmarks: add agentic parity report gate 2026-04-11 14:22:48 +01:00
Eva
79f539d9ce docs: clarify GPT-5.4 parity harness and review flow 2026-04-11 14:22:48 +01:00
Eva
d9c7ddb099 test: add agentic parity scenario pack 2026-04-11 14:22:48 +01:00
Peter Steinberger
0d733a28e1 build(canvas): refresh a2ui input hash 2026-04-11 14:19:51 +01:00
Peter Steinberger
a8284e39de build(canvas): stabilize a2ui bundle inputs 2026-04-11 14:19:25 +01:00
Peter Steinberger
9bde608f38 build: keep a2ui bundle generated 2026-04-11 14:18:04 +01:00
Peter Steinberger
0ed512bbdf build: refresh a2ui bundle 2026-04-11 14:18:04 +01:00
Vincent Koc
935bd6de7f fix(gateway): split credential secret input runtime 2026-04-11 14:15:42 +01:00
Peter Steinberger
85fa33d9d7 style: apply formatter drift 2026-04-11 14:08:55 +01:00
Peter Steinberger
2ffc19720b fix: restore channel auto-enable metadata 2026-04-11 14:08:55 +01:00
Peter Steinberger
40beb68fb0 chore: remove legacy shim packages 2026-04-11 14:07:29 +01:00
Peter Steinberger
419ab38ea2 test(msteams): stabilize oauth expiry assertion 2026-04-11 14:07:21 +01:00
Peter Steinberger
eb7bdbf980 docs: remove extension changelogs 2026-04-11 14:05:07 +01:00
Peter Steinberger
b646655a2d fix(ci): preserve channel auto-enable metadata 2026-04-11 14:03:08 +01:00
Peter Steinberger
564f64666b docs: remove plugin version-only changelog entries 2026-04-11 14:01:40 +01:00
Vincent Koc
759b5aa764 fix(cycles): narrow config type imports 2026-04-11 14:01:09 +01:00
Peter Steinberger
88be9b525c docs: update 2026.4.11 changelog 2026-04-11 14:00:42 +01:00
Peter Steinberger
9a8647cef7 fix: remove duplicate channel runtime export 2026-04-11 13:56:37 +01:00
Peter Steinberger
a82d8f04fb fix: clear rebase lint issues 2026-04-11 13:55:08 +01:00
Peter Steinberger
bf82a7c46e fix: keep browser cdp range wide for high ports 2026-04-11 13:55:08 +01:00
Peter Steinberger
4ca458b182 fix: preserve googlechat doctor semantics 2026-04-11 13:55:08 +01:00
Peter Steinberger
627ab39b6d perf: stabilize agent lane hotspots 2026-04-11 13:55:08 +01:00
Peter Steinberger
30e646ffab test: finish import performance cleanup 2026-04-11 13:55:08 +01:00
Peter Steinberger
ff7a842509 perf: reduce command and gateway test imports 2026-04-11 13:55:08 +01:00
Peter Steinberger
8ddd9b8aac perf: narrow plugin config test surfaces 2026-04-11 13:55:08 +01:00
Peter Steinberger
bb0bfabec8 perf: trim agent test runtime imports 2026-04-11 13:55:07 +01:00
Peter Steinberger
5915d7cb6b perf: optimize messaging plugin tests 2026-04-11 13:55:07 +01:00
Peter Steinberger
c7f18d9278 test: dedupe media provider tests 2026-04-11 13:55:07 +01:00
Peter Steinberger
baeec2f4b2 fix(ci): clean up rebased registry types 2026-04-11 13:54:07 +01:00
Peter Steinberger
684ce920fd fix(ci): restore channel public type exports 2026-04-11 13:54:07 +01:00
Vincent Koc
3b4de1ac14 fix(cycles): split reply and gateway leaf seams 2026-04-11 13:53:20 +01:00