ci(release): split release soak validation

This commit is contained in:
Peter Steinberger
2026-05-04 23:24:43 +01:00
parent ac3cd1a0ca
commit 358cd87ff3
6 changed files with 131 additions and 53 deletions

View File

@@ -114,11 +114,13 @@ the maintainer-only release runbook.
- Run the manual `Full Release Validation` workflow before release approval to
kick off all pre-release test boxes from one entrypoint. It accepts a branch,
tag, or full commit SHA, dispatches manual `CI`, and dispatches
`OpenClaw Release Checks` for install smoke, package acceptance, Docker
release-path suites, live/E2E, OpenWebUI, QA Lab parity, Matrix, and Telegram
lanes. With `release_profile=full` and `rerun_group=all`, it also runs package
Telegram E2E against the `release-package-under-test` artifact from release
checks. Provide `npm_telegram_package_spec` after publishing when the same
`OpenClaw Release Checks` for install smoke, package acceptance, cross-OS
package checks, QA Lab parity, Matrix, and Telegram lanes. Stable/default runs
keep exhaustive live/E2E and Docker release-path soak behind
`run_release_soak=true`; `release_profile=full` forces soak on. With
`release_profile=full` and `rerun_group=all`, it also runs package Telegram
E2E against the `release-package-under-test` artifact from release checks.
Provide `npm_telegram_package_spec` after publishing when the same
Telegram E2E should prove the published npm package too. Provide
`package_acceptance_package_spec` after publishing when Package Acceptance
should run its package/update matrix against the shipped npm package instead
@@ -293,8 +295,8 @@ parent `release-package-under-test` artifact for package-facing checks, and
dispatches standalone package Telegram E2E when `release_profile=full` with
`rerun_group=all` or when `npm_telegram_package_spec` is set. `OpenClaw Release
Checks` then fans out install smoke, cross-OS release checks, live/E2E Docker
release-path coverage, Package Acceptance with Telegram package QA, QA Lab
parity, live Matrix, and live Telegram. A full run is only acceptable when the
release-path coverage when soak is enabled, Package Acceptance with Telegram
package QA, QA Lab parity, live Matrix, and live Telegram. A full run is only acceptable when the
`Full Release Validation`
summary shows `normal_ci` and `release_checks` as successful. In full/all mode,
the `npm_telegram` child must also be successful; outside full/all it is skipped
@@ -318,10 +320,15 @@ Use `release_profile` to select live/provider breadth:
- `stable`: minimum plus stable provider/backend coverage for release approval
- `full`: stable plus broad advisory provider/media coverage
Use `run_release_soak=true` with `stable` when the release-blocking lanes are
green and you want the exhaustive live/E2E, Docker release-path, and
all-since-2026.4.23 upgrade-survivor sweep before promotion. `full` implies
`run_release_soak=true`.
`OpenClaw Release Checks` uses the trusted workflow ref to resolve the target
ref once as `release-package-under-test` and reuses that artifact in both
release-path Docker checks and Package Acceptance. This keeps all
package-facing boxes on the same bytes and avoids repeated package builds.
ref once as `release-package-under-test` and reuses that artifact in cross-OS,
Package Acceptance, and release-path Docker checks when soak runs. This keeps
all package-facing boxes on the same bytes and avoids repeated package builds.
The cross-OS OpenAI install smoke uses `OPENCLAW_CROSS_OS_OPENAI_MODEL` when the
repo/org variable is set, otherwise `openai/gpt-5.4`, because this lane is
proving package install, onboarding, gateway startup, and one live agent turn
@@ -474,11 +481,12 @@ Supported candidate sources:
`OpenClaw Release Checks` runs Package Acceptance with `source=artifact`, the
prepared release package artifact, `suite_profile=custom`,
`docker_lanes=doctor-switch update-channel-switch upgrade-survivor published-upgrade-survivor plugins-offline plugin-update`,
`published_upgrade_survivor_baselines=all-since-2026.4.23`,
`published_upgrade_survivor_scenarios=reported-issues`, and
`telegram_mode=mock-openai`. Package Acceptance keeps migration, update, stale
plugin dependency cleanup, offline plugin fixtures, plugin update, and Telegram
package QA against the same resolved tarball. The upgrade matrix covers every stable npm-published baseline from `2026.4.23` through `latest`; use
package QA against the same resolved tarball. Blocking release checks use the
default latest published package baseline; `run_release_soak=true` or
`release_profile=full` expands to every stable npm-published baseline from
`2026.4.23` through `latest` plus reported-issue fixtures. Use
Package Acceptance with `source=npm` for an already shipped candidate, or
`source=ref`/`source=artifact` for a SHA-backed local npm tarball before
publish. It is the GitHub-native
@@ -615,6 +623,9 @@ OpenClaw package must not be published.
- `ref`: branch, tag, or full commit SHA to validate. Secret-bearing checks
require the resolved commit to be reachable from an OpenClaw branch or
release tag.
- `run_release_soak`: opt into exhaustive live/E2E, Docker release-path, and
all-since upgrade-survivor soak on stable/default release checks. It is forced
on by `release_profile=full`.
Rules:

View File

@@ -27,6 +27,11 @@ Child workflows use the trusted workflow ref for the harness and the input
`ref` for the candidate under test. That keeps new validation logic available
when validating an older release branch or tag.
By default, `release_profile=stable` runs the release-blocking lanes and skips
the exhaustive live/Docker soak. Pass `run_release_soak=true` to include the
soak lanes on a stable run. `release_profile=full` always enables soak lanes so
the broad advisory profile never drops coverage silently.
Package Acceptance normally builds the candidate tarball from the resolved
`ref`, including full-SHA runs dispatched with `pnpm ci:full-release`. After
publish, pass `package_acceptance_package_spec=openclaw@YYYY.M.D` (or
@@ -35,15 +40,15 @@ the shipped npm package instead.
## Top-level stages
| Stage | Details |
| -------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Target resolution | **Job:** `Resolve target ref`<br />**Child workflow:** none<br />**Proves:** resolves the release branch, tag, or full commit SHA and records selected inputs.<br />**Rerun:** rerun the umbrella if this fails. |
| Vitest and normal CI | **Job:** `Run normal full CI`<br />**Child workflow:** `CI`<br />**Proves:** manual full CI graph against the target ref, including Linux Node lanes, bundled plugin shards, channel contracts, Node 22 compatibility, `check`, `check-additional`, build smoke, docs checks, Python skills, Windows, macOS, Control UI i18n, and Android via the umbrella.<br />**Rerun:** `rerun_group=ci`. |
| Plugin prerelease | **Job:** `Run plugin prerelease validation`<br />**Child workflow:** `Plugin Prerelease`<br />**Proves:** release-only plugin static checks, agentic plugin coverage, full extension batch shards, and plugin prerelease Docker lanes.<br />**Rerun:** `rerun_group=plugin-prerelease`. |
| Release checks | **Job:** `Run release/live/Docker/QA validation`<br />**Child workflow:** `OpenClaw Release Checks`<br />**Proves:** install smoke, cross-OS package checks, live/E2E suites, Docker release-path chunks, Package Acceptance, QA Lab parity, live Matrix, and live Telegram.<br />**Rerun:** `rerun_group=release-checks` or a narrower release-checks handle. |
| Package artifact | **Job:** `Prepare release package artifact`<br />**Child workflow:** none<br />**Proves:** creates the parent `release-package-under-test` tarball early enough for package-facing checks that do not need to wait for `OpenClaw Release Checks`.<br />**Rerun:** rerun the umbrella or provide `npm_telegram_package_spec` for `rerun_group=npm-telegram`. |
| Package Telegram | **Job:** `Run package Telegram E2E`<br />**Child workflow:** `NPM Telegram Beta E2E`<br />**Proves:** parent-artifact-backed Telegram package proof for `rerun_group=all` with `release_profile=full`, or published-package Telegram proof when `npm_telegram_package_spec` is set.<br />**Rerun:** `rerun_group=npm-telegram` with `npm_telegram_package_spec`. |
| Umbrella verifier | **Job:** `Verify full validation`<br />**Child workflow:** none<br />**Proves:** re-checks recorded child run conclusions and appends slowest-job tables from child workflows.<br />**Rerun:** rerun only this job after rerunning a failed child to green. |
| Stage | Details |
| -------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Target resolution | **Job:** `Resolve target ref`<br />**Child workflow:** none<br />**Proves:** resolves the release branch, tag, or full commit SHA and records selected inputs.<br />**Rerun:** rerun the umbrella if this fails. |
| Vitest and normal CI | **Job:** `Run normal full CI`<br />**Child workflow:** `CI`<br />**Proves:** manual full CI graph against the target ref, including Linux Node lanes, bundled plugin shards, channel contracts, Node 22 compatibility, `check`, `check-additional`, build smoke, docs checks, Python skills, Windows, macOS, Control UI i18n, and Android via the umbrella.<br />**Rerun:** `rerun_group=ci`. |
| Plugin prerelease | **Job:** `Run plugin prerelease validation`<br />**Child workflow:** `Plugin Prerelease`<br />**Proves:** release-only plugin static checks, agentic plugin coverage, full extension batch shards, and plugin prerelease Docker lanes.<br />**Rerun:** `rerun_group=plugin-prerelease`. |
| Release checks | **Job:** `Run release/live/Docker/QA validation`<br />**Child workflow:** `OpenClaw Release Checks`<br />**Proves:** install smoke, cross-OS package checks, Package Acceptance, QA Lab parity, live Matrix, and live Telegram. With `run_release_soak=true` or `release_profile=full`, also runs exhaustive live/E2E suites and Docker release-path chunks.<br />**Rerun:** `rerun_group=release-checks` or a narrower release-checks handle. |
| Package artifact | **Job:** `Prepare release package artifact`<br />**Child workflow:** none<br />**Proves:** creates the parent `release-package-under-test` tarball early enough for package-facing checks that do not need to wait for `OpenClaw Release Checks`.<br />**Rerun:** rerun the umbrella or provide `npm_telegram_package_spec` for `rerun_group=npm-telegram`. |
| Package Telegram | **Job:** `Run package Telegram E2E`<br />**Child workflow:** `NPM Telegram Beta E2E`<br />**Proves:** parent-artifact-backed Telegram package proof for `rerun_group=all` with `release_profile=full`, or published-package Telegram proof when `npm_telegram_package_spec` is set.<br />**Rerun:** `rerun_group=npm-telegram` with `npm_telegram_package_spec`. |
| Umbrella verifier | **Job:** `Verify full validation`<br />**Child workflow:** none<br />**Proves:** re-checks recorded child run conclusions and appends slowest-job tables from child workflows.<br />**Rerun:** rerun only this job after rerunning a failed child to green. |
For `ref=main` and `rerun_group=all`, a newer umbrella supersedes an older one.
When the parent is cancelled, its monitor cancels any child workflow it already
@@ -56,19 +61,19 @@ default.
once and prepares a shared `release-package-under-test` artifact when package
or Docker-facing stages need it.
| Stage | Details |
| ------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Release target | **Job:** `Resolve target ref`<br />**Backing workflow:** none<br />**Tests:** selected ref, optional expected SHA, profile, rerun group, and focused live suite filter.<br />**Rerun:** `rerun_group=release-checks`. |
| Package artifact | **Job:** `Prepare release package artifact`<br />**Backing workflow:** none<br />**Tests:** packs or resolves one candidate tarball and uploads `release-package-under-test` for downstream package-facing checks.<br />**Rerun:** the affected package, cross-OS, or live/E2E group. |
| Install smoke | **Job:** `Run install smoke`<br />**Backing workflow:** `Install Smoke`<br />**Tests:** full install path with root Dockerfile smoke image reuse, QR package install, root and gateway Docker smokes, installer Docker tests, Bun global install image-provider smoke, and fast bundled-plugin install/uninstall E2E.<br />**Rerun:** `rerun_group=install-smoke`. |
| Cross-OS | **Job:** `cross_os_release_checks`<br />**Backing workflow:** `OpenClaw Cross-OS Release Checks (Reusable)`<br />**Tests:** fresh and upgrade lanes on Linux, Windows, and macOS for the selected provider and mode, using the candidate tarball plus a baseline package.<br />**Rerun:** `rerun_group=cross-os`. |
| Repo and live E2E | **Job:** `Run repo/live E2E validation`<br />**Backing workflow:** `OpenClaw Live And E2E Checks (Reusable)`<br />**Tests:** repository E2E, live cache, OpenAI websocket streaming, native live provider and plugin shards, and Docker-backed live model/backend/gateway harnesses selected by `release_profile`.<br />**Rerun:** `rerun_group=live-e2e`, optionally with `live_suite_filter`. |
| Docker release path | **Job:** `Run Docker release-path validation`<br />**Backing workflow:** `OpenClaw Live And E2E Checks (Reusable)`<br />**Tests:** release-path Docker chunks against the shared package artifact.<br />**Rerun:** `rerun_group=live-e2e`. |
| Package Acceptance | **Job:** `Run package acceptance`<br />**Backing workflow:** `Package Acceptance`<br />**Tests:** offline plugin package fixtures, plugin update, mock-OpenAI Telegram package acceptance, and published-upgrade survivor checks from every stable npm release at or after `2026.4.23` against the same tarball.<br />**Rerun:** `rerun_group=package`. |
| QA parity | **Job:** `Run QA Lab parity lane` and `Run QA Lab parity report`<br />**Backing workflow:** direct jobs<br />**Tests:** candidate and baseline agentic parity packs, then the parity report.<br />**Rerun:** `rerun_group=qa-parity` or `rerun_group=qa`. |
| QA live Matrix | **Job:** `Run QA Lab live Matrix lane`<br />**Backing workflow:** direct job<br />**Tests:** fast live Matrix QA profile in the `qa-live-shared` environment.<br />**Rerun:** `rerun_group=qa-live` or `rerun_group=qa`. |
| QA live Telegram | **Job:** `Run QA Lab live Telegram lane`<br />**Backing workflow:** direct job<br />**Tests:** live Telegram QA with Convex CI credential leases.<br />**Rerun:** `rerun_group=qa-live` or `rerun_group=qa`. |
| Release verifier | **Job:** `Verify release checks`<br />**Backing workflow:** none<br />**Tests:** required release-check jobs for the selected rerun group.<br />**Rerun:** rerun after focused child jobs pass. |
| Stage | Details |
| ------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Release target | **Job:** `Resolve target ref`<br />**Backing workflow:** none<br />**Tests:** selected ref, optional expected SHA, profile, rerun group, and focused live suite filter.<br />**Rerun:** `rerun_group=release-checks`. |
| Package artifact | **Job:** `Prepare release package artifact`<br />**Backing workflow:** none<br />**Tests:** packs or resolves one candidate tarball and uploads `release-package-under-test` for downstream package-facing checks.<br />**Rerun:** the affected package, cross-OS, or live/E2E group. |
| Install smoke | **Job:** `Run install smoke`<br />**Backing workflow:** `Install Smoke`<br />**Tests:** full install path with root Dockerfile smoke image reuse, QR package install, root and gateway Docker smokes, installer Docker tests, Bun global install image-provider smoke, and fast bundled-plugin install/uninstall E2E.<br />**Rerun:** `rerun_group=install-smoke`. |
| Cross-OS | **Job:** `cross_os_release_checks`<br />**Backing workflow:** `OpenClaw Cross-OS Release Checks (Reusable)`<br />**Tests:** fresh and upgrade lanes on Linux, Windows, and macOS for the selected provider and mode, using the candidate tarball plus a baseline package.<br />**Rerun:** `rerun_group=cross-os`. |
| Repo and live E2E | **Job:** `Run repo/live E2E validation`<br />**Backing workflow:** `OpenClaw Live And E2E Checks (Reusable)`<br />**Tests:** repository E2E, live cache, OpenAI websocket streaming, native live provider and plugin shards, and Docker-backed live model/backend/gateway harnesses selected by `release_profile`.<br />**Runs:** `run_release_soak=true`, `release_profile=full`, or focused `rerun_group=live-e2e`.<br />**Rerun:** `rerun_group=live-e2e`, optionally with `live_suite_filter`. |
| Docker release path | **Job:** `Run Docker release-path validation`<br />**Backing workflow:** `OpenClaw Live And E2E Checks (Reusable)`<br />**Tests:** release-path Docker chunks against the shared package artifact.<br />**Runs:** `run_release_soak=true`, `release_profile=full`, or focused `rerun_group=live-e2e`.<br />**Rerun:** `rerun_group=live-e2e`. |
| Package Acceptance | **Job:** `Run package acceptance`<br />**Backing workflow:** `Package Acceptance`<br />**Tests:** offline plugin package fixtures, plugin update, mock-OpenAI Telegram package acceptance, and published-upgrade survivor checks against the same tarball. Blocking release checks use the default latest published baseline; soak checks expand to every stable npm release at or after `2026.4.23` plus reported-issue fixtures.<br />**Rerun:** `rerun_group=package`. |
| QA parity | **Job:** `Run QA Lab parity lane` and `Run QA Lab parity report`<br />**Backing workflow:** direct jobs<br />**Tests:** candidate and baseline agentic parity packs, then the parity report.<br />**Rerun:** `rerun_group=qa-parity` or `rerun_group=qa`. |
| QA live Matrix | **Job:** `Run QA Lab live Matrix lane`<br />**Backing workflow:** direct job<br />**Tests:** fast live Matrix QA profile in the `qa-live-shared` environment.<br />**Rerun:** `rerun_group=qa-live` or `rerun_group=qa`. |
| QA live Telegram | **Job:** `Run QA Lab live Telegram lane`<br />**Backing workflow:** direct job<br />**Tests:** live Telegram QA with Convex CI credential leases.<br />**Rerun:** `rerun_group=qa-live` or `rerun_group=qa`. |
| Release verifier | **Job:** `Verify release checks`<br />**Backing workflow:** none<br />**Tests:** required release-check jobs for the selected rerun group.<br />**Rerun:** rerun after focused child jobs pass. |
## Docker release-path chunks
@@ -93,10 +98,11 @@ commands with package artifact and image reuse inputs when available.
`release_profile` mostly controls live/provider breadth inside release checks.
It does not remove normal full CI, Plugin Prerelease, install smoke, package
acceptance, QA Lab, or Docker release-path chunks. `full` also makes the
umbrella run package Telegram E2E against the parent release package artifact when
`rerun_group=all`, so a full pre-publish candidate does not silently skip that
Telegram package lane.
acceptance, or QA Lab. For `stable`, exhaustive repo/live E2E and Docker
release-path chunks are soak coverage and run when `run_release_soak=true`.
`full` forces soak coverage on and also makes the umbrella run package Telegram
E2E against the parent release package artifact when `rerun_group=all`, so a full
pre-publish candidate does not silently skip that Telegram package lane.
| Profile | Intended use | Included live/provider coverage |
| --------- | --------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |