ci: add Matrix QA profiles

Peter Steinberger
2026-04-27 05:43:10 +01:00
parent 382e03a2d8
commit 6987132aed
23 changed files with 446 additions and 48 deletions


@@ -145,9 +145,13 @@ QA Lab has dedicated CI lanes outside the main smart-scoped workflow. The
builds the private QA runtime and compares the mock GPT-5.5 and Opus 4.6
agentic packs. The `QA-Lab - All Lanes` workflow runs nightly on `main` and on
manual dispatch; it fans out the mock parity gate, live Matrix lane, and live
Telegram lane as parallel jobs. The live jobs use the `qa-live-shared`
environment, and the Telegram lane uses Convex leases. `OpenClaw Release
Checks` also runs the same QA Lab lanes before release approval.
Telegram and Discord lanes as parallel jobs. The live jobs use the
`qa-live-shared` environment, and Telegram/Discord use Convex leases. Matrix
uses `--profile fast --fail-fast` for scheduled and release gates while the CLI
default and manual workflow input remain `all`; manual all-lanes dispatch can
shard full Matrix coverage into `transport`, `media`, `e2ee-smoke`,
`e2ee-deep`, and `e2ee-cli` jobs. `OpenClaw Release Checks` also runs the
release-critical QA Lab lanes before release approval.
The `Duplicate PRs After Merge` workflow is a manual maintainer workflow for
post-land duplicate cleanup. It defaults to dry-run and only closes explicitly


@@ -73,7 +73,7 @@ instrumentation.
For a transport-real Matrix smoke lane, run:
```bash
pnpm openclaw qa matrix
pnpm openclaw qa matrix --profile fast --fail-fast
```
That lane provisions a disposable Tuwunel homeserver in Docker, registers
@@ -84,9 +84,15 @@ the child config scoped to the transport under test, so Matrix runs without
a combined stdout/stderr log into the selected Matrix QA output directory. To
capture the outer `scripts/run-node.mjs` build/launcher output too, set
`OPENCLAW_RUN_NODE_OUTPUT_LOG=<path>` to a repo-local log file.
Matrix progress is printed by default. `OPENCLAW_QA_MATRIX_TIMEOUT_MS` bounds
the full run, and `OPENCLAW_QA_MATRIX_CLEANUP_TIMEOUT_MS` bounds cleanup so a
stuck Docker teardown reports the exact recovery command instead of hanging.
Matrix progress is printed by default. The CLI default profile is `all`, so
plain `pnpm openclaw qa matrix` still runs the full catalog. Use `--profile
fast` for the release-critical transport contract, or shard full coverage with
`transport`, `media`, `e2ee-smoke`, `e2ee-deep`, and `e2ee-cli`. `--fail-fast`
stops after the first failed scenario when you want a release gate instead of a
full inventory. `OPENCLAW_QA_MATRIX_TIMEOUT_MS` bounds the full run,
`OPENCLAW_QA_MATRIX_NO_REPLY_WINDOW_MS` can shorten no-reply quiet windows for
CI, and `OPENCLAW_QA_MATRIX_CLEANUP_TIMEOUT_MS` bounds cleanup so a stuck
Docker teardown reports the exact recovery command instead of hanging.
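The profile and timeout knobs above can be combined roughly as follows. This is a sketch, not a project script: the millisecond values are illustrative assumptions (the docs only state a 30-minute default for the run timeout), and `matrix_qa_command` is a hypothetical helper that prints each command rather than running it, so the example can be checked without Docker.

```bash
# Illustrative values only; these are assumptions, not project defaults.
export OPENCLAW_QA_MATRIX_TIMEOUT_MS=$((20 * 60 * 1000))         # bound the whole run to 20 minutes
export OPENCLAW_QA_MATRIX_CLEANUP_TIMEOUT_MS=$((2 * 60 * 1000))  # bound Docker teardown to 2 minutes
export OPENCLAW_QA_MATRIX_NO_REPLY_WINDOW_MS=3000                # shorter no-reply quiet window for CI

# Hypothetical helper: prints the Matrix QA command for one profile.
# Only the `fast` release gate gets --fail-fast; shards run their full list.
matrix_qa_command() {
  profile="$1"
  if [ "$profile" = "fast" ]; then
    printf 'pnpm openclaw qa matrix --profile %s --fail-fast\n' "$profile"
  else
    printf 'pnpm openclaw qa matrix --profile %s\n' "$profile"
  fi
}

# Release-style gate first, then one command per shard of the full catalog.
matrix_qa_command fast
for shard in transport media e2ee-smoke e2ee-deep e2ee-cli; do
  matrix_qa_command "$shard"
done
```

To run a lane for real, paste one of the printed lines into a shell that has the exported variables set.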
For a transport-real Telegram smoke lane, run:


@@ -92,9 +92,13 @@ These commands sit beside the main test suites when you need QA-lab realism:
CI runs QA Lab in dedicated workflows. `Parity gate` runs on matching PRs and
from manual dispatch with mock providers. `QA-Lab - All Lanes` runs nightly on
`main` and from manual dispatch with the mock parity gate, live Matrix lane, and
Convex-managed live Telegram lane as parallel jobs. `OpenClaw Release Checks`
runs the same lanes before release approval.
`main` and from manual dispatch with the mock parity gate, live Matrix lane,
Convex-managed live Telegram lane, and Convex-managed live Discord lane as
parallel jobs. Scheduled QA and release checks pass Matrix `--profile fast`
explicitly, while the Matrix CLI default and the manual workflow input default
both remain `all`; manual dispatch can shard `all` into `transport`, `media`,
`e2ee-smoke`, `e2ee-deep`, and `e2ee-cli` jobs. `OpenClaw Release Checks` runs
parity plus the fast Matrix and Telegram lanes before release approval.
- `pnpm openclaw qa suite`
- Runs repo-backed QA scenarios directly on the host.
@@ -248,10 +252,11 @@ gh workflow run package-acceptance.yml --ref main \
- Repo checkouts load the bundled runner directly; no separate plugin install
step is needed.
- Provisions three temporary Matrix users (`driver`, `sut`, `observer`) plus one private room, then starts a QA gateway child with the real Matrix plugin as the SUT transport.
- Defaults to `--profile all`. Use `--profile fast --fail-fast` for release-critical transport proof, or `--profile transport|media|e2ee-smoke|e2ee-deep|e2ee-cli` when sharding the full catalog.
- Uses the pinned stable Tuwunel image `ghcr.io/matrix-construct/tuwunel:v1.5.1` by default. Override with `OPENCLAW_QA_MATRIX_TUWUNEL_IMAGE` when you need to test a different image.
- Matrix does not expose shared credential-source flags because the lane provisions disposable users locally.
- Writes a Matrix QA report, summary, observed-events artifact, and combined stdout/stderr output log under `.artifacts/qa-e2e/...`.
- Emits progress by default and enforces a hard run timeout with `OPENCLAW_QA_MATRIX_TIMEOUT_MS` (default 30 minutes). Cleanup is bounded by `OPENCLAW_QA_MATRIX_CLEANUP_TIMEOUT_MS` and failures include the recovery `docker compose ... down --remove-orphans` command.
- Emits progress by default and enforces a hard run timeout with `OPENCLAW_QA_MATRIX_TIMEOUT_MS` (default 30 minutes). `OPENCLAW_QA_MATRIX_NO_REPLY_WINDOW_MS` tunes the quiet windows used for negative no-reply assertions. Cleanup is bounded by `OPENCLAW_QA_MATRIX_CLEANUP_TIMEOUT_MS`, and cleanup failures include the recovery `docker compose ... down --remove-orphans` command.
- `pnpm openclaw qa telegram`
- Runs the Telegram live QA lane against a real private group using the driver and SUT bot tokens from env.
- Requires `OPENCLAW_QA_TELEGRAM_GROUP_ID`, `OPENCLAW_QA_TELEGRAM_DRIVER_BOT_TOKEN`, and `OPENCLAW_QA_TELEGRAM_SUT_BOT_TOKEN`. The group id must be the numeric Telegram chat id.
@@ -267,10 +272,11 @@ Live transport lanes share one standard contract so new transports do not drift:
`qa-channel` remains the broad synthetic QA suite and is not part of the live
transport coverage matrix.
| Lane | Canary | Mention gating | Allowlist block | Top-level reply | Restart resume | Thread follow-up | Thread isolation | Reaction observation | Help command |
| -------- | ------ | -------------- | --------------- | --------------- | -------------- | ---------------- | ---------------- | -------------------- | ------------ |
| Matrix | x | x | x | x | x | x | x | x | |
| Telegram | x | | | | | | | | x |
| Lane | Canary | Mention gating | Allowlist block | Top-level reply | Restart resume | Thread follow-up | Thread isolation | Reaction observation | Help command | Native command registration |
| -------- | ------ | -------------- | --------------- | --------------- | -------------- | ---------------- | ---------------- | -------------------- | ------------ | --------------------------- |
| Matrix | x | x | x | x | x | x | x | x | | |
| Telegram | x | x | | | | | | | x | |
| Discord | x | x | | | | | | | | x |
### Shared Telegram credentials via Convex (v1)


@@ -137,9 +137,12 @@ the maintainer-only release runbook.
- Run `pnpm release:check` before every tagged release
- Release checks now run in a separate manual workflow:
`OpenClaw Release Checks`
- `OpenClaw Release Checks` also runs the QA Lab mock parity gate plus the live
Matrix and Telegram QA lanes before release approval. The live lanes use the
`qa-live-shared` environment; Telegram also uses Convex CI credential leases.
- `OpenClaw Release Checks` also runs the QA Lab mock parity gate plus the fast
live Matrix profile and Telegram QA lane before release approval. The live
lanes use the `qa-live-shared` environment; Telegram also uses Convex CI
credential leases. Run the manual `QA-Lab - All Lanes` workflow with
`matrix_profile=all` and `matrix_shards=true` when you want full Matrix
transport, media, and E2EE inventory in parallel.
- Cross-OS install and upgrade runtime validation is dispatched from the
private caller workflow
`openclaw/releases-private/.github/workflows/openclaw-cross-os-release-checks.yml`,
@@ -338,13 +341,14 @@ Release QA Lab coverage includes:
- mock parity gate comparing the OpenAI candidate lane against the Opus 4.6
baseline using the agentic parity pack
- live Matrix QA lane using the `qa-live-shared` environment
- fast live Matrix QA profile using the `qa-live-shared` environment
- live Telegram QA lane using Convex CI credential leases
- `pnpm qa:otel:smoke` when release telemetry needs explicit local proof
Use this box to answer "does the release behave correctly in QA scenarios and
live channel flows?" Keep the artifact URLs for parity, Matrix, and Telegram
lanes when approving the release.
lanes when approving the release. Full Matrix coverage remains available as a
manual sharded QA-Lab run rather than the default release-critical lane.
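A manual sharded run could be dispatched from the GitHub CLI roughly as sketched here. The `matrix_profile` and `matrix_shards` inputs come from the release notes above, but the workflow file name `qa-lab-all-lanes.yml` is a placeholder assumption, so confirm the real file under `.github/workflows/` first. The hypothetical helper `qa_lab_full_matrix_dispatch` only prints the command, so it can be reviewed before it touches CI.

```bash
# Placeholder: the real workflow file name is not stated in these docs.
QA_LAB_WORKFLOW="${QA_LAB_WORKFLOW:-qa-lab-all-lanes.yml}"

# Hypothetical helper: prints the dispatch command instead of running it.
# `gh workflow run` passes workflow_dispatch inputs with -f key=value.
qa_lab_full_matrix_dispatch() {
  printf 'gh workflow run %s --ref main -f matrix_profile=all -f matrix_shards=true\n' \
    "$QA_LAB_WORKFLOW"
}

qa_lab_full_matrix_dispatch
```

Once the workflow name is confirmed, paste the printed line into a shell where `gh` is authenticated for the repository.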
### Package