fix: route crabbox proof through brokered aws

This commit is contained in:
Vincent Koc
2026-05-16 02:34:48 +08:00
parent b21d3e5362
commit a6aa48350b
4 changed files with 128 additions and 75 deletions

View File

@@ -1,6 +1,6 @@
---
name: crabbox
description: Use Crabbox for OpenClaw remote validation across Linux, macOS, Windows, and WSL2. Default to Blacksmith Testbox for broad Linux proof; includes direct Blacksmith and owned AWS/Hetzner fallback notes when Crabbox fails.
description: Use Crabbox for OpenClaw remote validation across Linux, macOS, Windows, and WSL2. Default to the repo Crabbox config, use brokered AWS for normal broad proof, and keep Blacksmith Testbox as an explicit opt-in or outage diagnostic path.
---
# Crabbox
@@ -9,9 +9,15 @@ Use Crabbox when OpenClaw needs remote Linux proof for broad tests, CI-parity
checks, secrets, hosted services, Docker/E2E/package lanes, warmed reusable
boxes, sync timing, logs/results, cache inspection, or lease cleanup.
Default backend: `blacksmith-testbox`. The separate `blacksmith-testbox` skill
has been removed; this skill owns both the normal Crabbox path and the direct
Blacksmith fallback playbook.
Default backend: the repo `.crabbox.yaml`, currently brokered AWS. Do not
override it to Blacksmith unless the user explicitly asks for Blacksmith proof,
the task is specifically about Testbox behavior, or AWS/brokered Crabbox is the
broken layer.
Blacksmith Testbox is a delegated fallback, not the default router. If a
Blacksmith run queues, fails capacity, fails auth, or cannot allocate, stop
after one real attempt and switch to the repo default or report the blocker.
Do not retry Blacksmith in a loop.
## First Checks
@@ -28,9 +34,10 @@ pnpm crabbox:run -- --help | sed -n '1,120p'
- OpenClaw scripts prefer `../crabbox/bin/crabbox` when present. The user PATH
shim can be stale.
- Check `.crabbox.yaml` for repo defaults, but override provider explicitly.
Even if config still says AWS, maintainer validation should normally pass
`--provider blacksmith-testbox`.
- Check `.crabbox.yaml` for repo defaults and honor them. For normal Linux
validation, omit `--provider` so the wrapper uses brokered AWS.
- Pass `--provider blacksmith-testbox` only for explicit Blacksmith/Testbox
work or a deliberate comparison.
- If a warm direct-provider lease smells stale, retry with `--full-resync`
(alias `--fresh-sync`) before replacing the lease. This resets the remote
workdir, skips the fingerprint fast path, reseeds Git when possible, and
@@ -54,7 +61,8 @@ pnpm crabbox:run -- --help | sed -n '1,120p'
## macOS And Windows Targets
Use these only when the task needs an existing non-Linux host. OpenClaw broad
validation still defaults to `blacksmith-testbox`.
Linux validation uses the repo Crabbox config unless a provider is explicitly
requested.
Crabbox supports static SSH targets:
@@ -75,7 +83,7 @@ Crabbox supports static SSH targets:
with `../crabbox/bin/crabbox run --help`, config/flag tests, and the Crabbox
Go test suite.
## Default Blacksmith Backend
## Default Brokered AWS Backend
Use this for `pnpm check`, `pnpm check:changed`, `pnpm test`,
`pnpm test:changed`, Docker/E2E/live/package gates, or anything likely to fan
@@ -84,11 +92,7 @@ out across many Vitest projects.
Changed gate:
```sh
pnpm crabbox:run -- --provider blacksmith-testbox \
--blacksmith-org openclaw \
--blacksmith-workflow .github/workflows/ci-check-testbox.yml \
--blacksmith-job check \
--blacksmith-ref main \
pnpm crabbox:run -- \
--idle-timeout 90m \
--ttl 240m \
--timing-json \
@@ -99,11 +103,7 @@ pnpm crabbox:run -- --provider blacksmith-testbox \
Full suite:
```sh
pnpm crabbox:run -- --provider blacksmith-testbox \
--blacksmith-org openclaw \
--blacksmith-workflow .github/workflows/ci-check-testbox.yml \
--blacksmith-job check \
--blacksmith-ref main \
pnpm crabbox:run -- \
--idle-timeout 90m \
--ttl 240m \
--timing-json \
@@ -114,11 +114,7 @@ pnpm crabbox:run -- --provider blacksmith-testbox \
Focused rerun:
```sh
pnpm crabbox:run -- --provider blacksmith-testbox \
--blacksmith-org openclaw \
--blacksmith-workflow .github/workflows/ci-check-testbox.yml \
--blacksmith-job check \
--blacksmith-ref main \
pnpm crabbox:run -- \
--idle-timeout 90m \
--ttl 240m \
--timing-json \
@@ -128,19 +124,18 @@ pnpm crabbox:run -- --provider blacksmith-testbox \
Read the JSON summary. Useful fields:
- `provider`: should be `blacksmith-testbox`
- `leaseId`: `tbx_...`
- `syncDelegated`: should be `true`
- `provider`: should normally be `aws`
- `leaseId`: `cbx_...`
- `syncDelegated`: should normally be `false`
- `commandPhases`: populated when the command prints `CRABBOX_PHASE:<name>`
- `commandMs` / `totalMs`
- `exitCode`
Crabbox should stop one-shot Blacksmith Testboxes automatically after the run.
Verify cleanup when a run fails, is interrupted, or the command output is
unclear:
Crabbox should stop one-shot AWS leases automatically after the run. Verify
cleanup when a run fails, is interrupted, or the command output is unclear:
```sh
blacksmith testbox list
../crabbox/bin/crabbox list --provider aws
```
## Observability Flags
@@ -331,13 +326,13 @@ Interactive CLI/onboarding:
## Reuse And Keepalive
For most Blacksmith-backed Crabbox calls, one-shot is enough. Use reuse only
when you need multiple manual commands on the same hydrated box.
For most Crabbox calls, one-shot is enough. Use reuse only when you need
multiple manual commands on the same hydrated box.
If Crabbox returns a reusable id or you intentionally keep a lease:
```sh
pnpm crabbox:run -- --provider blacksmith-testbox --id <tbx_id> --no-sync --timing-json --shell -- "pnpm test <path>"
pnpm crabbox:run -- --id <cbx_id-or-slug> --no-sync --timing-json --shell -- "pnpm test <path>"
```
Stop boxes you created before handoff:
@@ -386,14 +381,16 @@ WebVNC portal, and opens the portal. Keep browsers windowed for human QA; use
## If Crabbox Fails
Keep the fallback narrow. First decide whether the failure is Crabbox itself,
Blacksmith/Testbox, repo hydration, sync, or the test command.
the brokered AWS lease, Blacksmith/Testbox, repo hydration, sync, or the test
command.
Fast checks:
```sh
command -v crabbox
../crabbox/bin/crabbox --version
crabbox run --provider blacksmith-testbox --help | sed -n '1,140p'
pnpm crabbox:run -- --help | sed -n '1,140p'
../crabbox/bin/crabbox doctor
command -v blacksmith
blacksmith --version
blacksmith testbox list
@@ -403,36 +400,36 @@ Common Crabbox-only failures:
- Provider missing or old CLI: use `../crabbox/bin/crabbox` from the sibling
repo, or update/install Crabbox before retrying.
- Bad local config: pass `--provider blacksmith-testbox` plus explicit
`--blacksmith-*` flags instead of relying on `.crabbox.yaml`.
- Slug/claim confusion: use the raw `tbx_...` id, or run one-shot without
`--id`.
- Bad local config: inspect `.crabbox.yaml`, `crabbox config show`, and
`crabbox whoami`; normal OpenClaw proof should use brokered AWS without
asking for cloud keys.
- Slug/claim confusion: use the raw `cbx_...` / `tbx_...` id, or run one-shot
without `--id`.
- Sync/timing bug: add `--debug --timing-json`; capture the final JSON and the
printed Actions URL. Large sync warnings now include top source directories
by file count and a hint to update `.crabboxignore` / `sync.exclude`; inspect
those before reaching for `--force-sync-large`. Quiet rsync watchdogs and SSH
timeouts now print `next_action=` hints; follow them, usually `--full-resync`
first and a fresh lease second.
- Cleanup uncertainty: run `blacksmith testbox list` and stop only boxes you
- Cleanup uncertainty: run `crabbox list --provider aws`; for explicit
Blacksmith runs, use `blacksmith testbox list` and stop only boxes you
created.
- Testbox queued/capacity pressure: do not convert a broad changed gate or full
suite into local `OPENCLAW_LOCAL_CHECK_MODE=throttled pnpm ...`. Leave the
remote lane queued, switch to a narrower targeted local check, or stop and
report the capacity blocker.
- Testbox queued/capacity pressure: do not retry Blacksmith repeatedly. Rerun
once without `--provider` so `.crabbox.yaml` routes to brokered AWS, or report
the Blacksmith blocker if Testbox itself is the requested proof.
If Crabbox cannot dispatch, sync, attach, or stop but Blacksmith itself works,
first try the same command through the repo wrapper with `--debug` and
`--timing-json`:
If brokered AWS cannot dispatch, sync, attach, or stop, retry once with
`--debug` and `--timing-json`:
```sh
pnpm crabbox:run -- --provider blacksmith-testbox --debug --timing-json -- \
pnpm crabbox:run -- --debug --timing-json -- \
CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 pnpm test:changed
```
Full suite:
```sh
pnpm crabbox:run -- --provider blacksmith-testbox --debug --timing-json -- \
pnpm crabbox:run -- --debug --timing-json -- \
CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 pnpm test
```
@@ -451,9 +448,10 @@ Raw Blacksmith footguns:
- Treat `blacksmith testbox list` as cleanup diagnostics, not a shared reusable
queue.
Escalate to owned AWS/Hetzner only when Blacksmith is down, quota-limited,
missing the needed environment, or owned capacity is the explicit goal. Use the
Owned Cloud Fallback section below.
Use Blacksmith only when the task is specifically about Testbox, brokered AWS
is unavailable, or an explicit comparison is needed. If Blacksmith is down or
quota-limited, do not keep probing it; stay on brokered AWS and note the
delegated-provider outage.
## Blacksmith Backend Notes
@@ -489,13 +487,14 @@ Important Blacksmith footguns:
blacksmith auth login --non-interactive --organization openclaw
```
## Owned Cloud Fallback
## Brokered AWS
Use AWS/Hetzner only when Blacksmith is down, quota-limited, missing the needed
environment, or owned capacity is explicitly the goal.
Use AWS for normal OpenClaw remote proof. The repo `.crabbox.yaml` already
selects brokered AWS, so omit `--provider` unless you are testing a different
provider deliberately.
```sh
pnpm crabbox:warmup -- --provider aws --class beast --market on-demand --idle-timeout 90m
pnpm crabbox:warmup -- --class beast --market on-demand --idle-timeout 90m
pnpm crabbox:hydrate -- --id <cbx_id-or-slug>
pnpm crabbox:run -- --id <cbx_id-or-slug> --timing-json --shell -- "env NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 pnpm test:changed"
pnpm crabbox:stop -- <cbx_id-or-slug>
@@ -519,8 +518,8 @@ crabbox whoami
- If broker auth is missing, run `crabbox login --url https://crabbox.openclaw.ai --provider aws`.
- If the CLI asks for `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, or AWS
profile setup during normal OpenClaw validation, assume the agent selected
the wrong path. Use brokered `crabbox login`, `--provider blacksmith-testbox`,
or an existing brokered lease before asking the user for cloud credentials.
the wrong path. Use brokered `crabbox login` or an existing brokered lease
before asking the user for cloud credentials.
- Ask for AWS keys only for explicit direct-provider/account administration,
not for normal brokered OpenClaw proof.
- Trusted automation may still use
@@ -533,8 +532,7 @@ macOS config lives at:
```
It should include `broker.url`, `broker.token`, and usually `provider: aws`
for owned-cloud lanes. Do not let that config override the OpenClaw default
when Blacksmith proof is requested; pass `--provider blacksmith-testbox`.
for OpenClaw lanes. Let that config drive normal validation.
### Interactive Desktop / WebVNC
@@ -572,14 +570,15 @@ Use `--market spot|on-demand` only on AWS warmup/one-shot runs.
## Failure Triage
- Crabbox cannot find provider: verify `../crabbox/bin/crabbox --help` lists
`blacksmith-testbox`; update Crabbox before falling back.
the provider selected by `.crabbox.yaml`; update Crabbox before falling back.
- Hydration stuck or failed: open the printed GitHub Actions run URL and inspect
the hydration step.
- Sync failed: rerun with `--debug`; check changed-file count and whether the
checkout is dirty.
- Command failed: rerun only the failing shard/file first. Do not rerun a full
suite until the focused failure is understood.
- Cleanup uncertain: `blacksmith testbox list`; stop owned `tbx_...` leases you
- Cleanup uncertain: `crabbox list --provider aws`; for explicit Blacksmith
runs, use `blacksmith testbox list` and stop owned `tbx_...` leases you
created.
- Crabbox broken but Blacksmith works: use the direct Blacksmith fallback above,
then file/fix the Crabbox issue.

View File

@@ -23,7 +23,9 @@ Prove the touched surface first. Do not reflexively run the whole suite.
- normal source checkout, tests only: `pnpm test:changed`
- normal source checkout, one failing file: `pnpm test <path-or-filter> -- --reporter=verbose`
- Codex worktree or linked/sparse checkout, one/few explicit files: `node scripts/run-vitest.mjs <path-or-filter>`
- Codex worktree or linked/sparse checkout, changed gates or anything broad: `node scripts/crabbox-wrapper.mjs run --provider blacksmith-testbox ... --shell -- "pnpm check:changed"`
- Codex worktree or linked/sparse checkout, changed gates or anything broad:
`node scripts/crabbox-wrapper.mjs run ... --shell -- "pnpm check:changed"`
and let `.crabbox.yaml` choose the provider
- workflow-only: `git diff --check`, workflow syntax/lint (`actionlint` when available)
- docs-only: `pnpm docs:list`, docs formatter/lint only if docs tooling changed or requested
2. Reproduce narrowly before fixing.
@@ -44,11 +46,13 @@ Prove the touched surface first. Do not reflexively run the whole suite.
`node scripts/run-vitest.mjs` for tiny local proof, `node
scripts/crabbox-wrapper.mjs` for Testbox, and `git commit --no-verify` only
after the relevant remote or node-wrapper proof is already clean.
- For Blacksmith Testbox proof, use Crabbox first. `pnpm crabbox:run -- --provider
blacksmith-testbox --timing-json -- <command...>` warms, claims, syncs, runs,
reports, and cleans up one-shot boxes. Reuse only an id/slug created in this
operator session; `blacksmith testbox list` is diagnostics only, not a shared
work queue.
- For remote proof, use Crabbox first and omit `--provider` unless a specific
provider is being tested. The repo Crabbox config routes normal broad proof to
brokered AWS. Blacksmith Testbox is explicit opt-in; if it queues, fails
capacity, or cannot allocate, retry once through the default Crabbox route or
report the Testbox blocker. Reuse only an id/slug created in this operator
session; `blacksmith testbox list` is diagnostics only, not a shared work
queue.
## Local Test Shortcuts

View File

@@ -6,6 +6,7 @@ Docs: https://docs.openclaw.ai
### Changes
- Maintainer tooling: route Crabbox skill defaults through the repo brokered AWS config, leaving Blacksmith Testbox as an explicit opt-in instead of the broad-proof default.
- CLI/onboarding: localize the setup wizard and bundled channel setup flows for English, Simplified Chinese, and Traditional Chinese. (#80645) Thanks @GaosCode.
- Agents/skills: cache hydrated `resolvedSkills` across warm gateway turns while keying reuse by the redacted effective config, reducing redundant skill snapshot rebuilds without crossing config-gated skill boundaries. (#81451) Thanks @solodmd.

View File

@@ -1,6 +1,6 @@
#!/usr/bin/env node
import { spawn, spawnSync } from "node:child_process";
import { existsSync } from "node:fs";
import { existsSync, readFileSync } from "node:fs";
import { dirname, relative, resolve } from "node:path";
import { fileURLToPath } from "node:url";
@@ -29,15 +29,58 @@ function checkedOutput(command, commandArgs) {
};
}
function configuredProvider() {
try {
const config = readFileSync(resolve(repoRoot, ".crabbox.yaml"), "utf8");
const match = config.match(/^provider:\s*([^\s#]+)/m);
return match?.[1] ?? "aws";
} catch {
return "aws";
}
}
function selectedProvider(commandArgs) {
for (let index = 0; index < commandArgs.length; index += 1) {
const arg = commandArgs[index];
if (arg === "--provider") {
return commandArgs[index + 1] ?? "";
}
if (arg.startsWith("--provider=")) {
return arg.slice("--provider=".length);
}
}
return configuredProvider();
}
const version = checkedOutput(binary, ["--version"]);
const help = checkedOutput(binary, ["run", "--help"]);
const providers = ["hetzner", "aws", "blacksmith-testbox"].filter((provider) =>
help.text.includes(provider),
const knownProviders = [
"hetzner",
"aws",
"azure",
"gcp",
"proxmox",
"ssh",
"blacksmith-testbox",
"namespace-devbox",
"semaphore",
"daytona",
"islo",
"e2b",
"modal",
"sprites",
"cloudflare",
];
const providers = knownProviders.filter((provider) =>
new RegExp(`\\b${provider.replace(/[.*+?^${}()|[\]\\]/g, "\\$&")}\\b`).test(
help.text,
),
);
const displayBinary = binary === "crabbox" ? "crabbox" : relative(repoRoot, binary);
const provider = selectedProvider(args);
console.error(
`[crabbox] bin=${displayBinary} version=${version.text || "unknown"} providers=${providers.join(",") || "unknown"}`,
`[crabbox] bin=${displayBinary} version=${version.text || "unknown"} provider=${provider || "unknown"} providers=${providers.join(",") || "unknown"}`,
);
if (version.status !== 0 || help.status !== 0) {
@@ -45,13 +88,19 @@ if (version.status !== 0 || help.status !== 0) {
process.exit(2);
}
if (!providers.includes("blacksmith-testbox")) {
if (provider && knownProviders.includes(provider) && !providers.includes(provider)) {
console.error(
"[crabbox] selected binary does not advertise provider blacksmith-testbox; refusing stale Crabbox binary",
`[crabbox] selected binary does not advertise provider ${provider}; update Crabbox or choose a supported provider`,
);
process.exit(2);
}
if (provider === "blacksmith-testbox") {
console.error(
"[crabbox] provider=blacksmith-testbox explicit; if Testbox is queued or down, rerun without --provider to use .crabbox.yaml",
);
}
const child = spawn(binary, args, {
cwd: repoRoot,
stdio: "inherit",