ci: add test performance agent

2026-05-16 08:50:44 +00:00 · 2026-04-23 20:40:42 +01:00
parent 6fc8913223
commit 360cb3dbf1
4 changed files with 330 additions and 0 deletions
--- a/docs/ci.md
+++ b/docs/ci.md
@@ -25,6 +25,16 @@ listed PRs when `apply=true`. Before mutating GitHub, it verifies that the
 landed PR is merged and that each duplicate has either a shared referenced issue
 or overlapping changed hunks.

+The `Test Performance Agent` workflow is an event-driven Codex maintenance lane
+for slow tests. It does not run on a fixed schedule: a successful non-bot push
+CI run on `main` can trigger it, but it skips if another workflow-run invocation
+already ran or is running that UTC day. Manual dispatch bypasses that daily
+activity gate. The lane builds a full-suite grouped Vitest performance report,
+lets Codex make only small, coverage-preserving test performance fixes, then
+reruns the full-suite report and rejects changes that reduce the passing test
+count below the baseline. If the baseline has failing tests, Codex may fix only
+obvious failures, and the after-agent full-suite report must pass before any commit.
+
 ```bash
 gh workflow run duplicate-after-merge.yml \
  -f landed_pr=70532 \
@@ -56,6 +66,7 @@ gh workflow run duplicate-after-merge.yml \
 | `macos-node`                     | macOS TypeScript test lane using the shared built artifacts                                  | macOS-relevant changes               |
 | `macos-swift`                    | Swift lint, build, and tests for the macOS app                                               | macOS-relevant changes               |
 | `android`                        | Android unit tests for both flavors plus one debug APK build                                 | Android-relevant changes             |
+| `test-performance-agent`         | At-most-daily Codex slow-test optimization after trusted activity                            | Main CI success or manual dispatch   |

 ## Fail-Fast Order

@@ -111,4 +122,6 @@ pnpm check:docs     # docs format + lint + broken links
 pnpm build          # build dist when CI artifact/build-smoke lanes matter
 node scripts/ci-run-timings.mjs <run-id>      # summarize wall time, queue time, and slowest jobs
 node scripts/ci-run-timings.mjs --recent 10   # compare recent successful main CI runs
+pnpm test:perf:groups --full-suite --allow-failures --output .artifacts/test-perf/baseline-before.json
+pnpm test:perf:groups:compare .artifacts/test-perf/baseline-before.json .artifacts/test-perf/after-agent.json
 ```
--- a/docs/reference/test.md
+++ b/docs/reference/test.md
@@ -29,6 +29,8 @@ title: "Tests"
 - `pnpm test:perf:changed:bench -- --worktree` benchmarks the current worktree change set without committing first.
 - `pnpm test:perf:profile:main`: writes a CPU profile for the Vitest main thread (`.artifacts/vitest-main-profile`).
 - `pnpm test:perf:profile:runner`: writes CPU + heap profiles for the unit runner (`.artifacts/vitest-runner-profile`).
+- `pnpm test:perf:groups --full-suite --allow-failures --output .artifacts/test-perf/baseline-before.json`: runs every full-suite Vitest leaf config serially and writes grouped duration data plus per-config JSON/log artifacts. The Test Performance Agent uses this as its baseline before attempting slow-test fixes.
+- `pnpm test:perf:groups:compare .artifacts/test-perf/baseline-before.json .artifacts/test-perf/after-agent.json`: compares the baseline grouped report with the after-agent grouped report following a performance-focused change.
 - Gateway integration: opt-in via `OPENCLAW_TEST_INCLUDE_GATEWAY=1 pnpm test` or `pnpm test:gateway`.
 - `pnpm test:e2e`: Runs gateway end-to-end smoke tests (multi-instance WS/HTTP/node pairing). Defaults to `threads` + `isolate: false` with adaptive workers in `vitest.e2e.config.ts`; tune with `OPENCLAW_E2E_WORKERS=<n>` and set `OPENCLAW_E2E_VERBOSE=1` for verbose logs.
 - `pnpm test:live`: Runs provider live tests (minimax/zai). Requires API keys and `LIVE=1` (or provider-specific `*_LIVE_TEST=1`) to unskip.
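
Reviewer note: the daily activity gate and the acceptance rule described in the `docs/ci.md` hunk can be sketched as two small predicates. This is a minimal illustration and not the workflow's actual implementation, which this commit view does not show. The names `Trigger`, `PriorRun`, `SuiteSummary`, `shouldRun`, and `acceptChanges` are hypothetical, and the report shape (plain passed/failed counts) is an assumption. The sketch also reads the "must pass" requirement as applying when the baseline had failures, which is one interpretation of the prose.

```typescript
// Hypothetical shapes: the real workflow inputs and report format are assumptions.
type Trigger = "workflow_run" | "workflow_dispatch";

interface PriorRun {
  trigger: Trigger;
  startedAt: string; // ISO 8601 timestamp
}

interface SuiteSummary {
  passed: number;
  failed: number;
}

// UTC calendar day (YYYY-MM-DD) of an ISO 8601 timestamp.
function utcDay(iso: string): string {
  return new Date(iso).toISOString().slice(0, 10);
}

// Daily activity gate: manual dispatch always runs; a workflow-run trigger is
// skipped when another workflow-run invocation already started that UTC day.
function shouldRun(trigger: Trigger, now: string, priorRuns: PriorRun[]): boolean {
  if (trigger === "workflow_dispatch") return true;
  const today = utcDay(now);
  return !priorRuns.some(
    (run) => run.trigger === "workflow_run" && utcDay(run.startedAt) === today,
  );
}

// Acceptance rule: never lose passing tests, and if the baseline had failures,
// the after-agent report must have none.
function acceptChanges(before: SuiteSummary, after: SuiteSummary): boolean {
  if (after.passed < before.passed) return false;
  if (before.failed > 0 && after.failed > 0) return false;
  return true;
}
```

Under these assumptions, a `workflow_run` trigger on a day that already has a workflow-run invocation is skipped, while `workflow_dispatch` always proceeds. Comparing UTC days rather than elapsed hours matches the "that UTC day" wording in the docs.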