diff --git a/.agents/skills/blacksmith-testbox/SKILL.md b/.agents/skills/blacksmith-testbox/SKILL.md index ef53f45c78b..af3d3159565 100644 --- a/.agents/skills/blacksmith-testbox/SKILL.md +++ b/.agents/skills/blacksmith-testbox/SKILL.md @@ -16,6 +16,19 @@ warm caches, local build state, and fast feedback. Testbox is the expensive path. Reach for it deliberately. +OpenClaw maintainers can opt into Testbox-first validation by setting +`OPENCLAW_TESTBOX=1` in their environment or standing agent rules. This mode is +maintainers-only and requires Blacksmith access. + +When `OPENCLAW_TESTBOX=1` is set in OpenClaw: + +- Pre-warm a Testbox early for longer, wider, or uncertain work. +- Prefer Testbox for `pnpm` gates, e2e, package-like proof, and broad suites. +- Reuse the same Testbox ID for every run command in the same task/session. +- Use local commands only when the task explicitly sets + `OPENCLAW_LOCAL_CHECK_MODE=throttled|full`, or when the user asks for local + proof. + ## Install the CLI If `blacksmith` is not installed, install it: @@ -81,7 +94,8 @@ Prefer Testbox when: - you are reproducing CI-only failures - you need the exact workflow image/job environment from GitHub Actions -For OpenClaw specifically, normal local iteration should stay local: +For OpenClaw specifically, normal local iteration stays local unless maintainer +Testbox mode is enabled with `OPENCLAW_TESTBOX=1`: - `pnpm check:changed` - `pnpm test:changed` @@ -89,9 +103,9 @@ For OpenClaw specifically, normal local iteration should stay local: - `pnpm test:serial` - `pnpm build` -Only use Testbox in OpenClaw when the user explicitly wants CI-parity or the -check truly depends on remote secrets/services that the local repo loop cannot -provide. +If `OPENCLAW_TESTBOX=1` is enabled, run those same repo commands inside the +warm Testbox. If the user wants laptop-friendly local proof for one command, use +the explicit escape hatch `OPENCLAW_LOCAL_CHECK_MODE=throttled`. For installable-package product proof, prefer the GitHub `Package Acceptance` workflow over an ad hoc Testbox command. It resolves one package candidate @@ -103,21 +117,35 @@ an older trusted branch, tag, or SHA. ## Setup: Warmup before coding -If you decided Testbox is actually warranted, warm one up early. This returns -an ID instantly and boots the CI environment in the background while you work: +If you decided Testbox is warranted, warm one up early. This returns an ID +instantly and boots the CI environment in the background while you work: blacksmith testbox warmup ci-check-testbox.yml # → tbx_01jkz5b3t9... Save this ID. You need it for every `run` command. +For OpenClaw maintainer Testbox mode, pre-warm at the start of longer or wider +tasks: + + blacksmith testbox warmup ci-check-testbox.yml --ref main --idle-timeout 90 + +Use the build-artifact warmup when e2e/package/build proof benefits from seeded +`dist/`, `dist-runtime/`, and build-all caches: + + blacksmith testbox warmup ci-build-artifacts-testbox.yml --ref main --idle-timeout 90 + Warmup dispatches a GitHub Actions workflow that provisions a VM with the full CI environment: dependencies installed, services started, secrets injected, and a clean checkout of the repo at the default branch. +In OpenClaw, raw commit SHAs are not reliable dispatch refs for `warmup --ref`; +use a branch or tag. The build-artifact workflow resolves `openclaw@beta` and +`openclaw@latest` to SHA cache keys internally. + Options: - --ref Git ref to dispatch against (default: repo's default branch) + --ref Git ref to dispatch against (default: repo's default branch) --job Specific job within the workflow (if it has multiple) --idle-timeout Idle timeout in minutes (default: 30) @@ -234,6 +262,11 @@ services, CI-only runners, or reproducibility against the workflow image. If the repo says local tests/builds are the normal path, follow the repo. +OpenClaw maintainer exception: if `OPENCLAW_TESTBOX=1` is set by the user or +agent environment, treat Testbox as the normal validation path for this repo. +Use `OPENCLAW_LOCAL_CHECK_MODE=throttled|full` as the explicit local escape +hatch. + ## When to use Use Testbox when: @@ -250,12 +283,13 @@ checks that need parity or remote state. ## Workflow -1. Decide whether the repo's local loop is the right default. -2. Only if Testbox is warranted, warm up early: - `blacksmith testbox warmup ci-check-testbox.yml` → save the ID +1. Decide whether the repo's local loop is the right default. For OpenClaw, + `OPENCLAW_TESTBOX=1` makes Testbox the maintainer default. +2. If Testbox is warranted, warm up early: + `blacksmith testbox warmup ci-check-testbox.yml --ref main --idle-timeout 90` → save the ID 3. Write code while the testbox boots in the background. 4. Run the remote command when needed: - `blacksmith testbox run --id "npm test"` + `blacksmith testbox run --id "pnpm check:changed"` 5. If tests fail, fix code and re-run against the same warm box. 6. If you changed dependency manifests (package.json, etc.), prepend the install command: `blacksmith testbox run --id "npm install && npm test"` @@ -276,9 +310,9 @@ Observed full-suite time on Blacksmith Testbox is about 3-4 minutes: - 173-180s on a warmed box - 219s on a fresh 32-vCPU box -When validating before commit/push, run `pnpm check:changed` first when -appropriate, then the full suite with the profile above if broad confidence is -needed. +When validating before commit/push in maintainer Testbox mode, run +`pnpm check:changed` inside the warmed box first when appropriate, then the full +suite with the profile above if broad confidence is needed. ## Examples @@ -332,12 +366,14 @@ timeout is reached). Default timeout is 5m; use `--wait-timeout` for longer blacksmith testbox stop --id Testboxes automatically shut down after being idle (default: 30 minutes). -If you need a longer session, increase the timeout at warmup time: +If you need a longer session, increase the timeout at warmup time. For OpenClaw +maintainer work, use 90 minutes for long-running sessions: - blacksmith testbox warmup ci-check-testbox.yml --idle-timeout 60 + blacksmith testbox warmup ci-check-testbox.yml --idle-timeout 90 + blacksmith testbox warmup ci-build-artifacts-testbox.yml --idle-timeout 90 ## With options blacksmith testbox warmup ci-check-testbox.yml --ref main - blacksmith testbox warmup ci-check-testbox.yml --idle-timeout 60 + blacksmith testbox warmup ci-check-testbox.yml --idle-timeout 90 blacksmith testbox run --id "go test ./..." diff --git a/AGENTS.md b/AGENTS.md index faca52035ae..5eb1b28406a 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -54,7 +54,9 @@ Telegraph style. Root rules only. Read scoped `AGENTS.md` before subtree work. - Formatting: use `oxfmt`, not Prettier. Prefer `pnpm format:check` / `pnpm format`; for targeted files use `pnpm exec oxfmt --check --threads=1 ` or `pnpm exec oxfmt --write --threads=1 `. - Linting: use repo wrappers (`pnpm lint:*`, `scripts/run-oxlint.mjs`); do not invoke generic JS formatters/lints unless a repo script uses them. - Heavy checks: `OPENCLAW_LOCAL_CHECK=1`, mode `OPENCLAW_LOCAL_CHECK_MODE=throttled|full`; CI/shared use `OPENCLAW_LOCAL_CHECK=0`. -- Local first. Use repo `pnpm` lanes before Blacksmith/Testbox. Remote only for parity-only failures, secrets/services, or explicit ask. +- Maintainer Testbox mode: if `OPENCLAW_TESTBOX=1` is present in env or standing user rules, use Blacksmith Testbox for `pnpm` gates, e2e, broad suites, and long/heavy validation. This is maintainers-only and requires Blacksmith access. +- Testbox escape hatch: if `OPENCLAW_TESTBOX=1` is set but `OPENCLAW_LOCAL_CHECK_MODE=throttled|full` is explicitly set for the task/command, use the local repo `pnpm` lane instead. +- Testbox warmup: start from repo root, save/reuse the returned ID for every run in the same task. Use `ci-check-testbox.yml` for normal checks; use `ci-build-artifacts-testbox.yml` when build artifacts, e2e, or package-like proof benefits from seeded `dist/`/`dist-runtime/` caches. ## GitHub / CI