ci: add test performance agent

This commit is contained in:
Peter Steinberger
2026-04-23 20:40:42 +01:00
parent 6fc8913223
commit 360cb3dbf1
4 changed files with 330 additions and 0 deletions

View File

@@ -25,6 +25,16 @@ listed PRs when `apply=true`. Before mutating GitHub, it verifies that the
landed PR is merged and that each duplicate has either a shared referenced issue
or overlapping changed hunks.
The `Test Performance Agent` workflow is an event-driven Codex maintenance lane
for slow tests. It has no pure schedule: a successful non-bot push CI run on
`main` can trigger it, but it skips if another workflow-run invocation already
ran or is running that UTC day. Manual dispatch bypasses that daily activity
gate. The lane builds a full-suite grouped Vitest performance report, lets Codex
make only small coverage-preserving test performance fixes, then reruns the
full-suite report and rejects changes that reduce the passing baseline test
count. If the baseline has failing tests, Codex may fix only obvious failures
and the after-agent full-suite report must pass before anything is committed.
```bash
gh workflow run duplicate-after-merge.yml \
-f landed_pr=70532 \
@@ -56,6 +66,7 @@ gh workflow run duplicate-after-merge.yml \
| `macos-node` | macOS TypeScript test lane using the shared built artifacts | macOS-relevant changes |
| `macos-swift` | Swift lint, build, and tests for the macOS app | macOS-relevant changes |
| `android` | Android unit tests for both flavors plus one debug APK build | Android-relevant changes |
| `test-performance-agent` | Daily Codex slow-test optimization after trusted activity | Main CI success or manual dispatch |
## Fail-Fast Order
@@ -111,4 +122,6 @@ pnpm check:docs # docs format + lint + broken links
pnpm build # build dist when CI artifact/build-smoke lanes matter
node scripts/ci-run-timings.mjs <run-id> # summarize wall time, queue time, and slowest jobs
node scripts/ci-run-timings.mjs --recent 10 # compare recent successful main CI runs
pnpm test:perf:groups --full-suite --allow-failures --output .artifacts/test-perf/baseline-before.json
pnpm test:perf:groups:compare .artifacts/test-perf/baseline-before.json .artifacts/test-perf/after-agent.json
```

View File

@@ -29,6 +29,8 @@ title: "Tests"
- `pnpm test:perf:changed:bench -- --worktree` benchmarks the current worktree change set without committing first.
- `pnpm test:perf:profile:main`: writes a CPU profile for the Vitest main thread (`.artifacts/vitest-main-profile`).
- `pnpm test:perf:profile:runner`: writes CPU + heap profiles for the unit runner (`.artifacts/vitest-runner-profile`).
- `pnpm test:perf:groups --full-suite --allow-failures --output .artifacts/test-perf/baseline-before.json`: runs every full-suite Vitest leaf config serially and writes grouped duration data plus per-config JSON/log artifacts. The Test Performance Agent uses this as its baseline before attempting slow-test fixes.
- `pnpm test:perf:groups:compare .artifacts/test-perf/baseline-before.json .artifacts/test-perf/after-agent.json`: compares grouped reports after a performance-focused change.
- Gateway integration: opt-in via `OPENCLAW_TEST_INCLUDE_GATEWAY=1 pnpm test` or `pnpm test:gateway`.
- `pnpm test:e2e`: Runs gateway end-to-end smoke tests (multi-instance WS/HTTP/node pairing). Defaults to `threads` + `isolate: false` with adaptive workers in `vitest.e2e.config.ts`; tune with `OPENCLAW_E2E_WORKERS=<n>` and set `OPENCLAW_E2E_VERBOSE=1` for verbose logs.
- `pnpm test:live`: Runs provider live tests (minimax/zai). Requires API keys and `LIVE=1` (or provider-specific `*_LIVE_TEST=1`) to unskip.