From 7c13a48e4939bd1d5cf662a3a89323e766f8d2a8 Mon Sep 17 00:00:00 2001 From: Vincent Koc Date: Tue, 21 Apr 2026 22:36:25 -0700 Subject: [PATCH] chore(agents): prefer local validation over testbox --- .agents/skills/blacksmith-testbox/SKILL.md | 101 +++++++++++++++------ AGENTS.md | 2 + 2 files changed, 77 insertions(+), 26 deletions(-) diff --git a/.agents/skills/blacksmith-testbox/SKILL.md b/.agents/skills/blacksmith-testbox/SKILL.md index 225b8fe50fc..5ef14346307 100644 --- a/.agents/skills/blacksmith-testbox/SKILL.md +++ b/.agents/skills/blacksmith-testbox/SKILL.md @@ -1,14 +1,26 @@ --- name: blacksmith-testbox description: > - Validate code changes against real CI. Use for all tests, builds, - migrations, and any command that depends on secrets or services. - Use when testing, validating, checking, verifying, or before any - commit or push. + Validate code changes against real CI when local execution is not + enough. Use for CI-parity checks, secrets/services, migrations, or + builds/tests that cannot run reliably on the local machine. Do not + replace repo-documented local test/build loops just because this + skill exists. --- # Blacksmith Testbox +## Scope + +Use Testbox when you need remote CI parity, injected secrets, hosted services, +or an OS/runtime image that your local machine cannot provide cheaply. + +Do not default to Testbox for every local test/build loop. If the repo has +documented local commands for normal iteration, use those first so you keep +warm caches, local build state, and fast feedback. + +Testbox is the expensive path. Reach for it deliberately. + ## Install the CLI If `blacksmith` is not installed, install it: @@ -55,10 +67,41 @@ Optional flags: - `--dashboard-url ` — Override dashboard URL (e.g. for staging) +## Decide first: local or Testbox + +Before warming anything up, check the repo's own instructions. + +Prefer local commands when: + +- the repo documents a supported local test/build workflow +- you are iterating on unit tests, lint, typecheck, formatting, or other + local-only validation +- the value comes from warm local caches and fast repeat runs +- the command does not need remote secrets, hosted services, or CI-only images + +Prefer Testbox when: + +- the repo explicitly requires CI-parity or remote validation +- the command needs secrets, service containers, or provisioned infra +- you are reproducing CI-only failures +- you need the exact workflow image/job environment from GitHub Actions + +For OpenClaw specifically, normal local iteration should stay local: + +- `pnpm check:changed` +- `pnpm test:changed` +- `pnpm test ` +- `pnpm test:serial` +- `pnpm build` + +Only use Testbox in OpenClaw when the user explicitly wants CI-parity or the +check truly depends on remote secrets/services that the local repo loop cannot +provide. + ## Setup: Warmup before coding -Before writing any code, warm up a testbox. This returns an ID instantly -and boots the CI environment in the background while you work: +If you decided Testbox is actually warranted, warm one up early. This returns +an ID instantly and boots the CI environment in the background while you work: blacksmith testbox warmup ci-check-testbox.yml # → tbx_01jkz5b3t9... @@ -177,39 +220,45 @@ to the remote testbox before executing your command: testbox. The sync model keeps the remote in lockstep with your local managed file set. -## CRITICAL: Never run tests locally +## CRITICAL: Do not ban local tests -Your local machine does not have the secrets, services, or environment -parity needed to validate changes reliably. Every test run, migration, -build, or integration check MUST go through Testbox. +Do not assume local validation is forbidden. Many repos intentionally invest in +fast, warm local loops, and forcing every run through Testbox destroys that +advantage. + +Use Testbox for the checks that actually need it: remote parity, secrets, +services, CI-only runners, or reproducibility against the workflow image. + +If the repo says local tests/builds are the normal path, follow the repo. ## When to use -ALWAYS use Testbox when: +Use Testbox when: -- Running any test (unit, integration, e2e) -- Running database migrations -- Building the project to check for compile errors -- Running any command that depends on secrets or environment variables -- Validating changes before committing +- running database migrations or destructive environment checks +- running commands that depend on secrets or environment variables not present locally +- reproducing CI-only failures or validating against the workflow image +- validating behavior that needs provisioned services or remote runners +- doing a final parity check before commit/push when the repo or user wants that -The ONLY exception is trivial checks with zero external dependencies -(e.g., running a linter or formatter locally). +Trim that list based on repo guidance. If the repo documents supported local +tests/builds, prefer local for routine iteration and keep Testbox for the +checks that need parity or remote state. ## Workflow -1. Warm up immediately when you receive a coding task: +1. Decide whether the repo's local loop is the right default. +2. Only if Testbox is warranted, warm up early: `blacksmith testbox warmup ci-check-testbox.yml` → save the ID -2. Write code while the testbox boots in the background. -3. Run tests (the CLI auto-waits if the testbox isn't ready yet): +3. Write code while the testbox boots in the background. +4. Run the remote command when needed: `blacksmith testbox run --id "npm test"` -4. If tests fail, fix code and re-run (fast — same warm testbox, only - changed files are synced). -5. If you changed dependency manifests (package.json, etc.), prepend +5. If tests fail, fix code and re-run against the same warm box. +6. If you changed dependency manifests (package.json, etc.), prepend the install command: `blacksmith testbox run --id "npm install && npm test"` -6. If you need artifacts (coverage reports, build outputs, etc.), download them: +7. If you need artifacts (coverage reports, build outputs, etc.), download them: `blacksmith testbox download --id coverage/ ./coverage/` -7. Once green, commit and push. +8. Once green, commit and push. ## OpenClaw full test suite diff --git a/AGENTS.md b/AGENTS.md index 275e81d6af2..d5c2b633bf7 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -65,6 +65,7 @@ Scoped guides: - Normal full prod sweep: `pnpm check` (prod typecheck/lint/guards, no tests) - Full tests: `pnpm test` - Changed tests only: `pnpm test:changed` +- Local serial loop: `pnpm test:serial` - Extension tests: `pnpm test:extensions` or `pnpm test extensions` = all extension shards; `pnpm test extensions/` = one extension lane. Heavy channels/OpenAI have dedicated shards. - Shard timing artifact: `.artifacts/vitest-shard-timings.json`; auto-used for balanced shard ordering. Disable with `OPENCLAW_TEST_PROJECTS_TIMINGS=0`. - Targeted tests: `pnpm test [vitest args...]`; do not call raw `vitest`. @@ -84,6 +85,7 @@ Scoped guides: - `pnpm lint:apps`: Swift/app surface, separate from TS lint - `pnpm lint:all`: legacy comparison lane - Local heavy-check behavior: `OPENCLAW_LOCAL_CHECK=1` default; `OPENCLAW_LOCAL_CHECK_MODE=throttled|full`; `OPENCLAW_LOCAL_CHECK=0` for CI/shared runs. +- Local validation is local-first. Do not default to Blacksmith/Testbox for routine OpenClaw iteration; it burns warm caches and startup time. Use repo `pnpm` lanes first, then reach for remote CI/Testbox only for parity-only failures, secrets/services, or when explicitly requested. ## Gates