docs(testbox): align maintainer testbox mode

This commit is contained in:
Vincent Koc
2026-04-26 21:22:52 -07:00
parent 166a6d9088
commit 9c07579a95
2 changed files with 56 additions and 18 deletions

View File

@@ -16,6 +16,19 @@ warm caches, local build state, and fast feedback.
Testbox is the expensive path. Reach for it deliberately.
OpenClaw maintainers can opt into Testbox-first validation by setting
`OPENCLAW_TESTBOX=1` in their environment or standing agent rules. This mode is
maintainers-only and requires Blacksmith access.
When `OPENCLAW_TESTBOX=1` is set in OpenClaw:
- Pre-warm a Testbox early for longer, wider, or uncertain work.
- Prefer Testbox for `pnpm` gates, e2e, package-like proof, and broad suites.
- Reuse the same Testbox ID for every run command in the same task/session.
- Use local commands only when the task explicitly sets
`OPENCLAW_LOCAL_CHECK_MODE=throttled|full`, or when the user asks for local
proof.
## Install the CLI
If `blacksmith` is not installed, install it:
@@ -81,7 +94,8 @@ Prefer Testbox when:
- you are reproducing CI-only failures
- you need the exact workflow image/job environment from GitHub Actions
For OpenClaw specifically, normal local iteration should stay local:
For OpenClaw specifically, normal local iteration stays local unless maintainer
Testbox mode is enabled with `OPENCLAW_TESTBOX=1`:
- `pnpm check:changed`
- `pnpm test:changed`
@@ -89,9 +103,9 @@ For OpenClaw specifically, normal local iteration should stay local:
- `pnpm test:serial`
- `pnpm build`
Only use Testbox in OpenClaw when the user explicitly wants CI-parity or the
check truly depends on remote secrets/services that the local repo loop cannot
provide.
If `OPENCLAW_TESTBOX=1` is enabled, run those same repo commands inside the
warm Testbox. If the user wants laptop-friendly local proof for one command, use
the explicit escape hatch `OPENCLAW_LOCAL_CHECK_MODE=throttled`.
For installable-package product proof, prefer the GitHub `Package Acceptance`
workflow over an ad hoc Testbox command. It resolves one package candidate
@@ -103,21 +117,35 @@ an older trusted branch, tag, or SHA.
## Setup: Warmup before coding
If you decided Testbox is actually warranted, warm one up early. This returns
an ID instantly and boots the CI environment in the background while you work:
If you decided Testbox is warranted, warm one up early. This returns an ID
instantly and boots the CI environment in the background while you work:
blacksmith testbox warmup ci-check-testbox.yml
# → tbx_01jkz5b3t9...
Save this ID. You need it for every `run` command.
For OpenClaw maintainer Testbox mode, pre-warm at the start of longer or wider
tasks:
blacksmith testbox warmup ci-check-testbox.yml --ref main --idle-timeout 90
Use the build-artifact warmup when e2e/package/build proof benefits from seeded
`dist/`, `dist-runtime/`, and build-all caches:
blacksmith testbox warmup ci-build-artifacts-testbox.yml --ref main --idle-timeout 90
Warmup dispatches a GitHub Actions workflow that provisions a VM with the
full CI environment: dependencies installed, services started, secrets
injected, and a clean checkout of the repo at the default branch.
In OpenClaw, raw commit SHAs are not reliable dispatch refs for `warmup --ref`;
use a branch or tag. The build-artifact workflow resolves `openclaw@beta` and
`openclaw@latest` to SHA cache keys internally.
Options:
--ref <branch> Git ref to dispatch against (default: repo's default branch)
--ref <branch|tag> Git ref to dispatch against (default: repo's default branch)
--job <name> Specific job within the workflow (if it has multiple)
--idle-timeout <min> Idle timeout in minutes (default: 30)
@@ -234,6 +262,11 @@ services, CI-only runners, or reproducibility against the workflow image.
If the repo says local tests/builds are the normal path, follow the repo.
OpenClaw maintainer exception: if `OPENCLAW_TESTBOX=1` is set by the user or
agent environment, treat Testbox as the normal validation path for this repo.
Use `OPENCLAW_LOCAL_CHECK_MODE=throttled|full` as the explicit local escape
hatch.
## When to use
Use Testbox when:
@@ -250,12 +283,13 @@ checks that need parity or remote state.
## Workflow
1. Decide whether the repo's local loop is the right default.
2. Only if Testbox is warranted, warm up early:
`blacksmith testbox warmup ci-check-testbox.yml` → save the ID
1. Decide whether the repo's local loop is the right default. For OpenClaw,
`OPENCLAW_TESTBOX=1` makes Testbox the maintainer default.
2. If Testbox is warranted, warm up early:
`blacksmith testbox warmup ci-check-testbox.yml --ref main --idle-timeout 90` → save the ID
3. Write code while the testbox boots in the background.
4. Run the remote command when needed:
`blacksmith testbox run --id <ID> "npm test"`
`blacksmith testbox run --id <ID> "pnpm check:changed"`
5. If tests fail, fix code and re-run against the same warm box.
6. If you changed dependency manifests (package.json, etc.), prepend
the install command: `blacksmith testbox run --id <ID> "npm install && npm test"`
@@ -276,9 +310,9 @@ Observed full-suite time on Blacksmith Testbox is about 3-4 minutes:
- 173-180s on a warmed box
- 219s on a fresh 32-vCPU box
When validating before commit/push, run `pnpm check:changed` first when
appropriate, then the full suite with the profile above if broad confidence is
needed.
When validating before commit/push in maintainer Testbox mode, run
`pnpm check:changed` inside the warmed box first when appropriate, then the full
suite with the profile above if broad confidence is needed.
## Examples
@@ -332,12 +366,14 @@ timeout is reached). Default timeout is 5m; use `--wait-timeout` for longer
blacksmith testbox stop --id <ID>
Testboxes automatically shut down after being idle (default: 30 minutes).
If you need a longer session, increase the timeout at warmup time:
If you need a longer session, increase the timeout at warmup time. For OpenClaw
maintainer work, use 90 minutes for long-running sessions:
blacksmith testbox warmup ci-check-testbox.yml --idle-timeout 60
blacksmith testbox warmup ci-check-testbox.yml --idle-timeout 90
blacksmith testbox warmup ci-build-artifacts-testbox.yml --idle-timeout 90
## With options
blacksmith testbox warmup ci-check-testbox.yml --ref main
blacksmith testbox warmup ci-check-testbox.yml --idle-timeout 60
blacksmith testbox warmup ci-check-testbox.yml --idle-timeout 90
blacksmith testbox run --id <ID> "go test ./..."

View File

@@ -54,7 +54,9 @@ Telegraph style. Root rules only. Read scoped `AGENTS.md` before subtree work.
- Formatting: use `oxfmt`, not Prettier. Prefer `pnpm format:check` / `pnpm format`; for targeted files use `pnpm exec oxfmt --check --threads=1 <files...>` or `pnpm exec oxfmt --write --threads=1 <files...>`.
- Linting: use repo wrappers (`pnpm lint:*`, `scripts/run-oxlint.mjs`); do not invoke generic JS formatters/lints unless a repo script uses them.
- Heavy checks: `OPENCLAW_LOCAL_CHECK=1`, mode `OPENCLAW_LOCAL_CHECK_MODE=throttled|full`; CI/shared use `OPENCLAW_LOCAL_CHECK=0`.
- Local first. Use repo `pnpm` lanes before Blacksmith/Testbox. Remote only for parity-only failures, secrets/services, or explicit ask.
- Maintainer Testbox mode: if `OPENCLAW_TESTBOX=1` is present in env or standing user rules, use Blacksmith Testbox for `pnpm` gates, e2e, broad suites, and long/heavy validation. This is maintainers-only and requires Blacksmith access.
- Testbox escape hatch: if `OPENCLAW_TESTBOX=1` is set but `OPENCLAW_LOCAL_CHECK_MODE=throttled|full` is explicitly set for the task/command, use the local repo `pnpm` lane instead.
- Testbox warmup: start from repo root, save/reuse the returned ID for every run in the same task. Use `ci-check-testbox.yml` for normal checks; use `ci-build-artifacts-testbox.yml` when build artifacts, e2e, or package-like proof benefits from seeded `dist/`/`dist-runtime/` caches.
## GitHub / CI