| summary | title | read_when |
|---|---|---|
| Full Release Validation stages, child workflows, release profiles, rerun handles, and evidence | Full release validation | |
Full Release Validation is the release umbrella. It is the single manual
entrypoint for pre-release proof, but most work happens in child workflows so a
failed box can be rerun without restarting the whole release.
Run it from a trusted workflow ref, normally main, and pass the release branch,
tag, or full commit SHA as the ref input:
```shell
gh workflow run full-release-validation.yml \
  --ref main \
  -f ref=release/YYYY.M.D \
  -f provider=openai \
  -f mode=both \
  -f release_profile=stable
```
Child workflows use the trusted workflow ref for the harness and the input
ref for the candidate under test. That keeps new validation logic available
when validating an older release branch or tag.
By default, release_profile=stable runs the release-blocking lanes and skips
the exhaustive live/Docker soak. Pass run_release_soak=true to include the
soak lanes on a stable run. release_profile=full always enables soak lanes so
the broad advisory profile never drops coverage silently.
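As a sketch of the soak opt-in on a stable run, the helper below prints a dispatch command with run_release_soak=true. The function name dispatch_stable_with_soak is hypothetical, and provider=openai and mode=both are simply carried over from the dispatch example earlier in this document. The leading echo makes it a dry run; remove it to dispatch for real.

```shell
# Hypothetical dry-run helper: prints the command instead of dispatching it.
# Remove the leading `echo` to run it for real.
dispatch_stable_with_soak() {
  candidate_ref="$1"  # release branch, tag, or full commit SHA
  echo gh workflow run full-release-validation.yml \
    --ref main \
    -f "ref=${candidate_ref}" \
    -f provider=openai \
    -f mode=both \
    -f release_profile=stable \
    -f run_release_soak=true
}

dispatch_stable_with_soak release/YYYY.M.D
```

Switching release_profile to full would make the run_release_soak input redundant, since that profile always enables the soak lanes.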
Package Acceptance normally builds the candidate tarball from the resolved
ref, including full-SHA runs dispatched with pnpm ci:full-release. After a
beta publish, pass release_package_spec=openclaw@YYYY.M.D-beta.N to reuse the
shipped npm package across release checks, Package Acceptance, cross-OS,
release-path Docker, and package Telegram. Use package_acceptance_package_spec
only when Package Acceptance should intentionally prove a different package.
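A minimal sketch of the post-beta flow, assuming the umbrella accepts release_package_spec alongside the inputs from the earlier dispatch example. The helper name validate_published_beta is hypothetical and the package spec is a placeholder; the leading echo keeps this a dry run.

```shell
# Hypothetical dry-run helper: prints the command instead of dispatching it.
# Remove the leading `echo` to run it for real.
validate_published_beta() {
  candidate_ref="$1"  # release branch, tag, or full commit SHA
  package_spec="$2"   # published package, e.g. openclaw@YYYY.M.D-beta.N
  echo gh workflow run full-release-validation.yml \
    --ref main \
    -f "ref=${candidate_ref}" \
    -f provider=openai \
    -f mode=both \
    -f release_profile=stable \
    -f "release_package_spec=${package_spec}"
}

validate_published_beta release/YYYY.M.D openclaw@YYYY.M.D-beta.N
```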
Top-level stages
| Stage | Details |
|---|---|
| Target resolution | Job: Resolve target ref<br>Child workflow: none<br>Proves: resolves the release branch, tag, or full commit SHA and records selected inputs.<br>Rerun: rerun the umbrella if this fails. |
| Vitest and normal CI | Job: Run normal full CI<br>Child workflow: CI<br>Proves: manual full CI graph against the target ref, including Linux Node lanes, bundled plugin shards, channel contracts, check, check-additional, build smoke, docs checks, Python skills, Windows, macOS, Control UI i18n, and Android via the umbrella.<br>Rerun: rerun_group=ci. |
| Plugin prerelease | Job: Run plugin prerelease validation<br>Child workflow: Plugin Prerelease<br>Proves: release-only plugin static checks, agentic plugin coverage, full extension batch shards, and plugin prerelease Docker lanes.<br>Rerun: rerun_group=plugin-prerelease. |
| Release checks | Job: Run release/live/Docker/QA validation<br>Child workflow: OpenClaw Release Checks<br>Proves: install smoke, cross-OS package checks, Package Acceptance, QA Lab parity, live Matrix, and live Telegram. With run_release_soak=true or release_profile=full, also runs exhaustive live/E2E suites and Docker release-path chunks.<br>Rerun: rerun_group=release-checks or a narrower release-checks handle. |
| Package artifact | Job: Prepare release package artifact<br>Child workflow: none<br>Proves: creates the parent release-package-under-test tarball early enough for package-facing checks that do not need to wait for OpenClaw Release Checks.<br>Rerun: rerun the umbrella or provide release_package_spec for published-package reruns. |
| Package Telegram | Job: Run package Telegram E2E<br>Child workflow: NPM Telegram Beta E2E<br>Proves: parent-artifact-backed Telegram package proof for rerun_group=all with release_profile=full, or published-package Telegram proof when release_package_spec or npm_telegram_package_spec is set.<br>Rerun: rerun_group=npm-telegram with release_package_spec or npm_telegram_package_spec. |
| Umbrella verifier | Job: Verify full validation<br>Child workflow: none<br>Proves: re-checks recorded child run conclusions and appends slowest-job tables from child workflows.<br>Rerun: rerun only this job after rerunning a failed child to green. |
For ref=main and rerun_group=all, a newer umbrella supersedes an older one.
When the parent is cancelled, its monitor cancels any child workflow it already
dispatched. Release branch and tag validation runs do not cancel each other by
default.
Release checks stages
OpenClaw Release Checks is the largest child workflow. It resolves the target
once and prepares a shared release-package-under-test artifact when package
or Docker-facing stages need it.
| Stage | Details |
|---|---|
| Release target | Job: Resolve target ref<br>Backing workflow: none<br>Tests: selected ref, optional expected SHA, profile, rerun group, and focused live suite filter.<br>Rerun: rerun_group=release-checks. |
| Package artifact | Job: Prepare release package artifact<br>Backing workflow: none<br>Tests: packs or resolves one candidate tarball and uploads release-package-under-test for downstream package-facing checks.<br>Rerun: the affected package, cross-OS, or live/E2E group. |
| Install smoke | Job: Run install smoke<br>Backing workflow: Install Smoke<br>Tests: full install path with root Dockerfile smoke image reuse, QR package install, root and gateway Docker smokes, installer Docker tests, Bun global install image-provider smoke, and fast bundled-plugin install/uninstall E2E.<br>Rerun: rerun_group=install-smoke. |
| Cross-OS | Job: cross_os_release_checks<br>Backing workflow: OpenClaw Cross-OS Release Checks (Reusable)<br>Tests: fresh and upgrade lanes on Linux, Windows, and macOS for the selected provider and mode, using the candidate tarball plus a baseline package.<br>Rerun: rerun_group=cross-os. |
| Repo and live E2E | Job: Run repo/live E2E validation<br>Backing workflow: OpenClaw Live And E2E Checks (Reusable)<br>Tests: repository E2E, live cache, OpenAI websocket streaming, native live provider and plugin shards, and Docker-backed live model/backend/gateway harnesses selected by release_profile.<br>Runs: run_release_soak=true, release_profile=full, or focused rerun_group=live-e2e.<br>Rerun: rerun_group=live-e2e, optionally with live_suite_filter. |
| Docker release path | Job: Run Docker release-path validation<br>Backing workflow: OpenClaw Live And E2E Checks (Reusable)<br>Tests: release-path Docker chunks against the shared package artifact.<br>Runs: run_release_soak=true, release_profile=full, or focused rerun_group=live-e2e.<br>Rerun: rerun_group=live-e2e. |
| Package Acceptance | Job: Run package acceptance<br>Backing workflow: Package Acceptance<br>Tests: offline plugin package fixtures, plugin update, mock-OpenAI Telegram package acceptance, and published-upgrade survivor checks against the same tarball. Blocking release checks use the default latest published baseline; soak checks expand to every stable npm release at or after 2026.4.23 plus reported-issue fixtures.<br>Rerun: rerun_group=package. |
| QA parity | Job: Run QA Lab parity lane and Run QA Lab parity report<br>Backing workflow: direct jobs<br>Tests: candidate and baseline agentic parity packs, then the parity report.<br>Rerun: rerun_group=qa-parity or rerun_group=qa. |
| QA live Matrix | Job: Run QA Lab live Matrix lane<br>Backing workflow: direct job<br>Tests: fast live Matrix QA profile in the qa-live-shared environment.<br>Rerun: rerun_group=qa-live or rerun_group=qa. |
| QA live Telegram | Job: Run QA Lab live Telegram lane<br>Backing workflow: direct job<br>Tests: live Telegram QA with Convex CI credential leases.<br>Rerun: rerun_group=qa-live or rerun_group=qa. |
| Release verifier | Job: Verify release checks<br>Backing workflow: none<br>Tests: required release-check jobs for the selected rerun group.<br>Rerun: rerun after focused child jobs pass. |
Docker release-path chunks
The Docker release-path stage runs these chunks when live_suite_filter is
empty:
| Chunk | Coverage |
|---|---|
| core | Core Docker release-path smoke lanes. |
| package-update-openai | OpenAI package install/update behavior, Codex on-demand install, and Chat Completions tool calls. |
| package-update-anthropic | Anthropic package install and update behavior. |
| package-update-core | Provider-neutral package and update behavior. |
| plugins-runtime-plugins | Plugin runtime lanes that exercise plugin behavior. |
| plugins-runtime-services | Service-backed and live plugin runtime lanes; includes OpenWebUI when requested. |
| plugins-runtime-install-a through plugins-runtime-install-h | Plugin install/runtime batches split for parallel release validation. |
Use targeted docker_lanes=<lane[,lane]> on the reusable live/E2E workflow when
only one Docker lane failed. The release artifacts include per-lane rerun
commands with package artifact and image reuse inputs when available.
Release profiles
release_profile mostly controls live/provider breadth inside release checks.
It does not remove normal full CI, Plugin Prerelease, install smoke, package
acceptance, or QA Lab. For stable, exhaustive repo/live E2E and Docker
release-path chunks are soak coverage and run when run_release_soak=true.
full forces soak coverage on and also makes the umbrella run package Telegram
E2E against the parent release package artifact when rerun_group=all, so a full
pre-publish candidate does not silently skip that Telegram package lane.
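The soak rules described in this document can be summarized in a few lines of shell. This is only a sketch of the documented behavior: soak_lanes_run is a hypothetical helper, not code from the workflows, and the real workflow conditions may include details not covered here.

```shell
# Hypothetical helper mirroring the documented rules: soak lanes run when the
# profile is full, when run_release_soak=true, or on a focused live-e2e rerun.
soak_lanes_run() {
  release_profile="$1"     # minimum, stable, or full
  run_release_soak="$2"    # true or false
  rerun_group="${3:-all}"  # optional focused rerun handle
  if [ "${release_profile}" = "full" ] ||
    [ "${run_release_soak}" = "true" ] ||
    [ "${rerun_group}" = "live-e2e" ]; then
    echo yes
  else
    echo no
  fi
}

soak_lanes_run stable false           # prints: no
soak_lanes_run stable true            # prints: yes
soak_lanes_run full false             # prints: yes
soak_lanes_run stable false live-e2e  # prints: yes
```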
| Profile | Intended use | Included live/provider coverage |
|---|---|---|
| minimum | Fastest release-critical smoke. | OpenAI/core live path, Docker live models for OpenAI, native gateway core, native OpenAI gateway profile, native OpenAI plugin, and Docker live gateway OpenAI. |
| stable | Default release approval profile. | minimum plus Anthropic smoke, Google, MiniMax, backend, native live test harness, Docker live CLI backend, Docker ACP bind, Docker Codex harness, and an OpenCode Go smoke shard. |
| full | Broad advisory sweep. | stable plus advisory providers, plugin live shards, and media live shards. |
Full-only additions
These suites are skipped by stable and included by full:
| Area | Full-only coverage |
|---|---|
| Docker live models | OpenCode Go, OpenRouter, xAI, Z.ai, and Fireworks. |
| Docker live gateway | Advisory providers split into DeepSeek/Fireworks, OpenCode Go/OpenRouter, and xAI/Z.ai shards. |
| Native gateway provider profiles | Full Anthropic Opus and Sonnet/Haiku shards, Fireworks, DeepSeek, full OpenCode Go model shards, OpenRouter, xAI, and Z.ai. |
| Native plugin live shards | Plugins A-K, L-N, O-Z other, Moonshot, and xAI. |
| Native media live shards | Audio, Google music, MiniMax music, and video groups A-D. |
stable includes native-live-src-gateway-profiles-anthropic-smoke and
native-live-src-gateway-profiles-opencode-go-smoke; full uses the broader
Anthropic and OpenCode Go model shards instead. Focused reruns can still use the
aggregate native-live-src-gateway-profiles-anthropic or
native-live-src-gateway-profiles-opencode-go handles.
Focused reruns
Use rerun_group to avoid repeating unrelated release boxes:
| Handle | Scope |
|---|---|
| all | All Full Release Validation stages. |
| ci | Manual full CI child only. |
| plugin-prerelease | Plugin Prerelease child only. |
| release-checks | All OpenClaw Release Checks stages. |
| install-smoke | Install Smoke through release checks. |
| cross-os | Cross-OS release checks. |
| live-e2e | Repo/live E2E and Docker release-path validation. |
| package | Package Acceptance. |
| qa | QA parity plus QA live lanes. |
| qa-parity | QA parity lanes and report only. |
| qa-live | QA live Matrix and Telegram only. |
| npm-telegram | Published-package Telegram E2E; requires release_package_spec or npm_telegram_package_spec. |
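
A focused rerun could be wrapped as in the sketch below, which checks the handle against the table above before printing the dispatch command. The helper name rerun_release_group is hypothetical, and any inputs not shown are assumed to fall back to the workflow defaults. The leading echo keeps it a dry run.

```shell
# Hypothetical dry-run helper: validates the handle, then prints the command.
# Remove the leading `echo` on the gh line to dispatch for real.
rerun_release_group() {
  candidate_ref="$1"  # release branch, tag, or full commit SHA
  handle="$2"         # one of the rerun_group handles listed above
  case "${handle}" in
    all | ci | plugin-prerelease | release-checks | install-smoke | cross-os | \
      live-e2e | package | qa | qa-parity | qa-live | npm-telegram) ;;
    *)
      echo "unknown rerun_group: ${handle}" >&2
      return 1
      ;;
  esac
  # Note: npm-telegram additionally needs release_package_spec or
  # npm_telegram_package_spec, which this sketch does not add.
  echo gh workflow run full-release-validation.yml \
    --ref main \
    -f "ref=${candidate_ref}" \
    -f "rerun_group=${handle}"
}

rerun_release_group release/YYYY.M.D plugin-prerelease
```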
Use live_suite_filter with rerun_group=live-e2e when one live suite failed.
Valid filter ids are defined in the reusable live/E2E workflow, including
docker-live-models, live-gateway-docker,
live-gateway-anthropic-docker, live-gateway-google-docker,
live-gateway-minimax-docker, live-gateway-advisory-docker,
live-cli-backend-docker, live-acp-bind-docker, and
live-codex-harness-docker.
The live-gateway-advisory-docker handle is an aggregate rerun handle for its
three provider shards, so it still fans out to all advisory Docker gateway jobs.
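For illustration, a single-suite rerun might look like the sketch below. It assumes the umbrella forwards live_suite_filter to the reusable live/E2E workflow and that omitted inputs use their defaults; rerun_live_suite is a hypothetical helper name, and the leading echo keeps it a dry run.

```shell
# Hypothetical dry-run helper: prints the command instead of dispatching it.
# Remove the leading `echo` to run it for real.
rerun_live_suite() {
  candidate_ref="$1"  # release branch, tag, or full commit SHA
  suite_id="$2"       # a filter id such as docker-live-models
  echo gh workflow run full-release-validation.yml \
    --ref main \
    -f "ref=${candidate_ref}" \
    -f rerun_group=live-e2e \
    -f "live_suite_filter=${suite_id}"
}

rerun_live_suite release/YYYY.M.D live-gateway-anthropic-docker
```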
Use cross_os_suite_filter with rerun_group=cross-os when one cross-OS lane
failed. The filter accepts an OS id, a suite id, or an OS/suite pair, for
example windows/packaged-upgrade, windows, or packaged-fresh. Cross-OS
summaries include per-phase timings for packaged upgrade lanes, and long-running
commands print heartbeat lines so a stuck Windows update is visible before the
job timeout.
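A cross-OS rerun narrowed to one lane could be sketched as follows. The helper name rerun_cross_os_lane is hypothetical, and provider=openai and mode=both are carried over from the dispatch example earlier in this document as assumptions. The leading echo keeps it a dry run.

```shell
# Hypothetical dry-run helper: prints the command instead of dispatching it.
# Remove the leading `echo` to run it for real.
rerun_cross_os_lane() {
  candidate_ref="$1"  # release branch, tag, or full commit SHA
  lane_filter="$2"    # OS id, suite id, or OS/suite pair
  echo gh workflow run full-release-validation.yml \
    --ref main \
    -f "ref=${candidate_ref}" \
    -f provider=openai \
    -f mode=both \
    -f rerun_group=cross-os \
    -f "cross_os_suite_filter=${lane_filter}"
}

rerun_cross_os_lane release/YYYY.M.D windows/packaged-upgrade
```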
QA release-check lanes are advisory. A QA-only failure is reported as a warning
and does not block the release-check verifier; rerun with rerun_group=qa,
rerun_group=qa-parity, or rerun_group=qa-live when you need fresh QA evidence.
Evidence to keep
Keep the Full Release Validation summary as the release-level index. It links
child run ids and includes slowest-job tables. For failures, inspect the child
workflow first, then rerun the smallest matching handle above.
Useful artifacts:
- release-package-under-test from the Full Release Validation parent and OpenClaw Release Checks
- Docker release-path artifacts under .artifacts/docker-tests/
- Package Acceptance package-under-test and Docker acceptance artifacts
- Cross-OS release-check artifacts for each OS and suite
- QA parity, Matrix, and Telegram artifacts
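
One way to pull this evidence locally is with the GitHub CLI run commands, sketched below. The helper name collect_release_evidence and the release-evidence/ download directory are hypothetical choices; the echo prefixes keep this a dry run that only prints the commands.

```shell
# Hypothetical dry-run helper: prints gh commands for inspecting a run and
# downloading its package artifact. Remove the `echo`s to execute them.
collect_release_evidence() {
  run_id="$1"  # run id of the Full Release Validation parent or a child run
  echo gh run view "${run_id}"
  echo gh run download "${run_id}" \
    --name release-package-under-test \
    --dir "release-evidence/${run_id}"
}

collect_release_evidence 1234567890
```

The run id here is a placeholder; the Full Release Validation summary links the real child run ids.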
Workflow files
- .github/workflows/full-release-validation.yml
- .github/workflows/openclaw-release-checks.yml
- .github/workflows/openclaw-live-and-e2e-checks-reusable.yml
- .github/workflows/plugin-prerelease.yml
- .github/workflows/install-smoke.yml
- .github/workflows/openclaw-cross-os-release-checks-reusable.yml
- .github/workflows/package-acceptance.yml