diff --git a/.agents/skills/blacksmith-testbox/SKILL.md b/.agents/skills/blacksmith-testbox/SKILL.md index cb9bf0b2602..af3d3159565 100644 --- a/.agents/skills/blacksmith-testbox/SKILL.md +++ b/.agents/skills/blacksmith-testbox/SKILL.md @@ -16,6 +16,19 @@ warm caches, local build state, and fast feedback. Testbox is the expensive path. Reach for it deliberately. +OpenClaw maintainers can opt into Testbox-first validation by setting +`OPENCLAW_TESTBOX=1` in their environment or standing agent rules. This mode is +maintainers-only and requires Blacksmith access. + +When `OPENCLAW_TESTBOX=1` is set in OpenClaw: + +- Pre-warm a Testbox early for longer, wider, or uncertain work. +- Prefer Testbox for `pnpm` gates, e2e, package-like proof, and broad suites. +- Reuse the same Testbox ID for every run command in the same task/session. +- Use local commands only when the task explicitly sets + `OPENCLAW_LOCAL_CHECK_MODE=throttled|full`, or when the user asks for local + proof. + ## Install the CLI If `blacksmith` is not installed, install it: @@ -81,7 +94,8 @@ Prefer Testbox when: - you are reproducing CI-only failures - you need the exact workflow image/job environment from GitHub Actions -For OpenClaw specifically, normal local iteration should stay local: +For OpenClaw specifically, normal local iteration stays local unless maintainer +Testbox mode is enabled with `OPENCLAW_TESTBOX=1`: - `pnpm check:changed` - `pnpm test:changed` @@ -89,27 +103,49 @@ For OpenClaw specifically, normal local iteration should stay local: - `pnpm test:serial` - `pnpm build` -Only use Testbox in OpenClaw when the user explicitly wants CI-parity or the -check truly depends on remote secrets/services that the local repo loop cannot -provide. +If `OPENCLAW_TESTBOX=1` is enabled, run those same repo commands inside the +warm Testbox. If the user wants laptop-friendly local proof for one command, use +the explicit escape hatch `OPENCLAW_LOCAL_CHECK_MODE=throttled`. 
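The mode selection described in this hunk can be sketched as a small shell helper. This is only an illustration under stated assumptions: the two environment variable names come from the skill text, while the function name `choose_validation_path` and the printed labels are hypothetical and not part of the repo.

```bash
# Hypothetical helper (not part of the repo): decide where a repo command
# such as `pnpm check:changed` should run.
choose_validation_path() {
  case "${OPENCLAW_LOCAL_CHECK_MODE:-}" in
    throttled|full)
      # The explicit local escape hatch wins, even in maintainer Testbox mode.
      echo "local:${OPENCLAW_LOCAL_CHECK_MODE}"
      return 0
      ;;
  esac
  if [ "${OPENCLAW_TESTBOX:-}" = "1" ]; then
    # Maintainer Testbox mode: run the command inside the warm Testbox.
    echo "testbox"
  else
    # Default for everyone else: normal local iteration.
    echo "local"
  fi
}

choose_validation_path
```

The escape hatch is checked first because the text treats `OPENCLAW_LOCAL_CHECK_MODE` as an explicit per-task override of the maintainer default.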
+ +For installable-package product proof, prefer the GitHub `Package Acceptance` +workflow over an ad hoc Testbox command. It resolves one package candidate +(`source=npm`, `source=ref`, `source=url`, or `source=artifact`), uploads it as +`package-under-test`, and runs the reusable Docker E2E lanes against that exact +tarball on GitHub/Blacksmith runners. Use `workflow_ref` for the trusted +workflow/harness code and `package_ref` for the source ref to pack when testing +an older trusted branch, tag, or SHA. ## Setup: Warmup before coding -If you decided Testbox is actually warranted, warm one up early. This returns -an ID instantly and boots the CI environment in the background while you work: +If you decided Testbox is warranted, warm one up early. This returns an ID +instantly and boots the CI environment in the background while you work: blacksmith testbox warmup ci-check-testbox.yml # → tbx_01jkz5b3t9... Save this ID. You need it for every `run` command. +For OpenClaw maintainer Testbox mode, pre-warm at the start of longer or wider +tasks: + + blacksmith testbox warmup ci-check-testbox.yml --ref main --idle-timeout 90 + +Use the build-artifact warmup when e2e/package/build proof benefits from seeded +`dist/`, `dist-runtime/`, and build-all caches: + + blacksmith testbox warmup ci-build-artifacts-testbox.yml --ref main --idle-timeout 90 + Warmup dispatches a GitHub Actions workflow that provisions a VM with the full CI environment: dependencies installed, services started, secrets injected, and a clean checkout of the repo at the default branch. +In OpenClaw, raw commit SHAs are not reliable dispatch refs for `warmup --ref`; +use a branch or tag. The build-artifact workflow resolves `openclaw@beta` and +`openclaw@latest` to SHA cache keys internally. 
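Because raw commit SHAs are described as unreliable for `warmup --ref`, a caller might guard the ref before dispatching. The sketch below is a heuristic under assumptions: the helper names `looks_like_sha` and `warmup_ref_ok` are hypothetical, and treating any 7 to 40 character hex string as a SHA would also reject a branch that happens to be named that way. The warmup command is echoed rather than executed so the sketch runs without Blacksmith access.

```bash
# Hypothetical guard: succeeds when the argument looks like a raw commit SHA
# (7 to 40 hexadecimal characters). This is a heuristic, not a git lookup.
looks_like_sha() {
  case "$1" in
    *[!0-9a-fA-F]*|"") return 1 ;;  # empty or contains a non-hex character
  esac
  [ "${#1}" -ge 7 ] && [ "${#1}" -le 40 ]
}

# A ref is acceptable for `warmup --ref` when it does not look like a SHA.
warmup_ref_ok() {
  ! looks_like_sha "$1"
}

ref="main"
if warmup_ref_ok "$ref"; then
  # Print the command from the skill text instead of running it.
  echo "blacksmith testbox warmup ci-check-testbox.yml --ref $ref --idle-timeout 90"
else
  echo "refusing raw SHA '$ref'; use a branch or tag" >&2
fi
```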
+ Options: - --ref Git ref to dispatch against (default: repo's default branch) + --ref Git ref to dispatch against (default: repo's default branch) --job Specific job within the workflow (if it has multiple) --idle-timeout Idle timeout in minutes (default: 30) @@ -226,6 +262,11 @@ services, CI-only runners, or reproducibility against the workflow image. If the repo says local tests/builds are the normal path, follow the repo. +OpenClaw maintainer exception: if `OPENCLAW_TESTBOX=1` is set by the user or +agent environment, treat Testbox as the normal validation path for this repo. +Use `OPENCLAW_LOCAL_CHECK_MODE=throttled|full` as the explicit local escape +hatch. + ## When to use Use Testbox when: @@ -242,12 +283,13 @@ checks that need parity or remote state. ## Workflow -1. Decide whether the repo's local loop is the right default. -2. Only if Testbox is warranted, warm up early: - `blacksmith testbox warmup ci-check-testbox.yml` → save the ID +1. Decide whether the repo's local loop is the right default. For OpenClaw, + `OPENCLAW_TESTBOX=1` makes Testbox the maintainer default. +2. If Testbox is warranted, warm up early: + `blacksmith testbox warmup ci-check-testbox.yml --ref main --idle-timeout 90` → save the ID 3. Write code while the testbox boots in the background. 4. Run the remote command when needed: - `blacksmith testbox run --id "npm test"` + `blacksmith testbox run --id "pnpm check:changed"` 5. If tests fail, fix code and re-run against the same warm box. 6. If you changed dependency manifests (package.json, etc.), prepend the install command: `blacksmith testbox run --id "npm install && npm test"` @@ -268,9 +310,9 @@ Observed full-suite time on Blacksmith Testbox is about 3-4 minutes: - 173-180s on a warmed box - 219s on a fresh 32-vCPU box -When validating before commit/push, run `pnpm check:changed` first when -appropriate, then the full suite with the profile above if broad confidence is -needed. 
+When validating before commit/push in maintainer Testbox mode, run +`pnpm check:changed` inside the warmed box first when appropriate, then the full +suite with the profile above if broad confidence is needed. ## Examples @@ -324,12 +366,14 @@ timeout is reached). Default timeout is 5m; use `--wait-timeout` for longer blacksmith testbox stop --id Testboxes automatically shut down after being idle (default: 30 minutes). -If you need a longer session, increase the timeout at warmup time: +If you need a longer session, increase the timeout at warmup time. For OpenClaw +maintainer work, use 90 minutes for long-running sessions: - blacksmith testbox warmup ci-check-testbox.yml --idle-timeout 60 + blacksmith testbox warmup ci-check-testbox.yml --idle-timeout 90 + blacksmith testbox warmup ci-build-artifacts-testbox.yml --idle-timeout 90 ## With options blacksmith testbox warmup ci-check-testbox.yml --ref main - blacksmith testbox warmup ci-check-testbox.yml --idle-timeout 60 + blacksmith testbox warmup ci-check-testbox.yml --idle-timeout 90 blacksmith testbox run --id "go test ./..." diff --git a/.agents/skills/discord-clawd/SKILL.md b/.agents/skills/discord-clawd/SKILL.md new file mode 100644 index 00000000000..0cb26ff0018 --- /dev/null +++ b/.agents/skills/discord-clawd/SKILL.md @@ -0,0 +1,37 @@ +--- +name: discord-clawd +description: Use to talk to the Discord-backed OpenClaw agent/session; not for archive search. +--- + +# Discord Clawd + +Use this when the task is to talk with the Discord-backed agent/session, ask it a question, or post through that route. + +For Discord archive/history/search, use `$discrawl` instead. 
+ +## Transport + +Use the OpenClaw relay helper: + +```bash +cd ~/Projects/agent-scripts +python3 skills/openclaw-relay/scripts/openclaw_relay.py targets +python3 skills/openclaw-relay/scripts/openclaw_relay.py resolve --target maintainers +``` + +If the target alias exists, prefer a private ask first: + +```bash +python3 skills/openclaw-relay/scripts/openclaw_relay.py ask \ + --target maintainers \ + --message "Reply with exactly OK." +``` + +Use `publish` when the session should decide whether to post. Use `force-send` only when the user explicitly wants a message posted. + +## Guardrails + +- Resolve the target before sending real content. +- Report the target and delivery mode used. +- Do not use this for local Discord archive queries. +- Do not expose gateway tokens or session secrets. diff --git a/.agents/skills/discord-clawd/agents/openai.yaml b/.agents/skills/discord-clawd/agents/openai.yaml new file mode 100644 index 00000000000..b5203eab2b0 --- /dev/null +++ b/.agents/skills/discord-clawd/agents/openai.yaml @@ -0,0 +1,4 @@ +interface: + display_name: "Discord Clawd" + short_description: "Talk to the Discord-backed OpenClaw agent" + default_prompt: "Use $discord-clawd to route a private ask or explicit post through the Discord-backed OpenClaw agent/session." diff --git a/.agents/skills/openclaw-pr-maintainer/SKILL.md b/.agents/skills/openclaw-pr-maintainer/SKILL.md index 5bea778d261..4f775656201 100644 --- a/.agents/skills/openclaw-pr-maintainer/SKILL.md +++ b/.agents/skills/openclaw-pr-maintainer/SKILL.md @@ -7,20 +7,20 @@ description: Review, triage, close, label, comment on, or land OpenClaw PRs/issu Use this skill for maintainer-facing GitHub workflow, not for ordinary code changes. -## Start issue and PR triage with ghcrawl +## Start issue and PR triage with gitcrawl -- Anytime you inspect OpenClaw issues or PRs, check local `ghcrawl` data first for related threads, duplicate attempts, and already-landed fixes. 
-- Use `ghcrawl` for candidate discovery and clustering; use `gh`, `gh api`, and the current checkout to verify live state before commenting, labeling, closing, or landing. -- If `ghcrawl` is missing, stale, lacks the target thread, or has no embeddings for neighbor/search commands, fall back to the GitHub search workflow below. -- Do not run expensive/update commands such as `ghcrawl refresh`, `ghcrawl embed`, or `ghcrawl cluster` unless the user asked to update the local store or the stale data is blocking the decision. +- Anytime you inspect OpenClaw issues or PRs, check local `gitcrawl` data first for related threads, duplicate attempts, and already-landed fixes. +- Use `gitcrawl` for candidate discovery and clustering; use `gh`, `gh api`, and the current checkout to verify live state before commenting, labeling, closing, or landing. +- If `gitcrawl` is missing, stale, lacks the target thread, or has no embeddings for neighbor/search commands, fall back to the GitHub search workflow below. +- Do not run expensive/update commands such as `gitcrawl sync --include-comments`, future enrichment commands, or broad reclustering unless the user asked to update the local store or stale data is blocking the decision. 
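The "missing tool" part of the fallback rule above could be sketched like this. It covers only the case where the `gitcrawl` binary is absent; stale data, a missing thread, or missing embeddings still need a human or agent judgment call. The function name `pick_triage_source`, the printed labels, and the `QUERY` placeholder are hypothetical, and the `gh search` invocation is echoed rather than executed.

```bash
# Hypothetical helper: choose the discovery tool for issue/PR triage.
pick_triage_source() {
  if command -v gitcrawl >/dev/null 2>&1; then
    echo "gitcrawl"
  else
    echo "gh-search"
  fi
}

source_tool="$(pick_triage_source)"
if [ "$source_tool" = "gh-search" ]; then
  # Fallback: targeted GitHub keyword search, printed here for illustration.
  echo 'gh search issues --repo openclaw/openclaw --match title,body "QUERY"'
else
  echo "check local gitcrawl data first, then verify live state with gh"
fi
```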
Common read-only path: ```bash -ghcrawl threads openclaw/openclaw --numbers --include-closed --json -ghcrawl neighbors openclaw/openclaw --number --limit 12 --json -ghcrawl search openclaw/openclaw --query "" --mode hybrid --json -ghcrawl cluster-detail openclaw/openclaw --id --member-limit 20 --body-chars 280 --json +gitcrawl threads openclaw/openclaw --numbers --include-closed --json +gitcrawl neighbors openclaw/openclaw --number --limit 12 --json +gitcrawl search openclaw/openclaw --query "" --mode hybrid --json +gitcrawl cluster-detail openclaw/openclaw --id --member-limit 20 --body-chars 280 --json ``` ## Apply close and triage labels correctly @@ -75,7 +75,7 @@ ghcrawl cluster-detail openclaw/openclaw --id --member-limit 20 --b ## Search broadly before deciding -- Prefer `ghcrawl` first. Then use targeted GitHub keyword search to verify gaps, live status, comments, and candidates not present in the local store. +- Prefer `gitcrawl` first. Then use targeted GitHub keyword search to verify gaps, live status, comments, and candidates not present in the local store. - Use `--repo openclaw/openclaw` with `--match title,body` first when using `gh search`. - Add `--match comments` when triaging follow-up discussion or closed-as-duplicate chains. - Do not stop at the first 500 results when the task requires a full search. diff --git a/.agents/skills/openclaw-qa-testing/SKILL.md b/.agents/skills/openclaw-qa-testing/SKILL.md index ade3b448382..151634527ff 100644 --- a/.agents/skills/openclaw-qa-testing/SKILL.md +++ b/.agents/skills/openclaw-qa-testing/SKILL.md @@ -62,6 +62,24 @@ scenario through qa-channel, decodes the emitted protobuf spans, and verifies the exported trace names and privacy contract. It does not require Opik, Langfuse, or external collector credentials. +## Matrix live profiles + +`pnpm openclaw qa matrix` defaults to the full `all` profile. 
Use explicit +profiles for faster CI/release proof: + +```bash +OPENCLAW_QA_MATRIX_NO_REPLY_WINDOW_MS=3000 \ +pnpm openclaw qa matrix --profile fast --fail-fast +``` + +- `fast`: release-critical transport contract, excluding generated image and + deep E2EE recovery inventory. +- `transport`, `media`, `e2ee-smoke`, `e2ee-deep`, `e2ee-cli`: sharded full + Matrix coverage. +- `QA-Lab - All Lanes` uses explicit `fast` Matrix on scheduled runs. Manual + dispatch keeps `matrix_profile=all` as the default and always shards that full + Matrix selection. + ## QA credentials and 1Password - Use `op` only inside `tmux` for QA secret lookup in this repo. diff --git a/.agents/skills/openclaw-release-maintainer/SKILL.md b/.agents/skills/openclaw-release-maintainer/SKILL.md index 3aa8fbb179b..19c1c58f820 100644 --- a/.agents/skills/openclaw-release-maintainer/SKILL.md +++ b/.agents/skills/openclaw-release-maintainer/SKILL.md @@ -325,9 +325,11 @@ node --import tsx scripts/openclaw-npm-postpublish-verify.ts - Docker install/update coverage that exercises the published beta package - published npm Telegram proof: dispatch Actions > `NPM Telegram Beta E2E` from `main` with `package_spec=openclaw@` and - `provider_mode=mock-openai`, approve `npm-release`, and require success. - This is the default button path for installed-package onboarding, - Telegram setup, and real Telegram E2E against the published npm package. + `provider_mode=mock-openai`, and require success. This workflow is + maintainer-dispatched and intentionally has no `npm-release` approval gate; + `qa-live-shared` only supplies the shared QA secrets. This is the default + button path for installed-package onboarding, Telegram setup, and real + Telegram E2E against the published npm package. Use the local `pnpm test:docker:npm-telegram-live` lane with the matching `OPENCLAW_NPM_TELEGRAM_PACKAGE_SPEC` and Convex CI env only as a fallback or debugging path. 
diff --git a/.agents/skills/openclaw-testing/SKILL.md b/.agents/skills/openclaw-testing/SKILL.md new file mode 100644 index 00000000000..841c3320960 --- /dev/null +++ b/.agents/skills/openclaw-testing/SKILL.md @@ -0,0 +1,492 @@ +--- +name: openclaw-testing +description: Choose, run, rerun, or debug OpenClaw tests, CI checks, Docker E2E lanes, release validation, and the cheapest safe verification path. +--- + +# OpenClaw Testing + +Use this skill when deciding what to test, debugging failures, rerunning CI, +or validating a change without wasting hours. + +## Read First + +- `docs/reference/test.md` for local test commands. +- `docs/ci.md` for CI scope, release checks, Docker chunks, and runner behavior. +- Scoped `AGENTS.md` files before editing code under a subtree. + +## Default Rule + +Prove the touched surface first. Do not reflexively run the whole suite. + +1. Inspect the diff and classify the touched surface: + - source: `pnpm changed:lanes --json`, then `pnpm check:changed` + - tests only: `pnpm test:changed` + - one failing file: `pnpm test -- --reporter=verbose` + - workflow-only: `git diff --check`, workflow syntax/lint (`actionlint` when available) + - docs-only: `pnpm docs:list`, docs formatter/lint only if docs tooling changed or requested +2. Reproduce narrowly before fixing. +3. Fix root cause. +4. Rerun the same narrow proof. +5. Broaden only when the touched contract demands it. + +## Guardrails + +- Do not kill unrelated processes or tests. If something is running elsewhere, treat it as owned by the user or another agent. +- Do not run expensive local Docker, full release checks, full `pnpm test`, or full `pnpm check` unless the user asks or the change genuinely requires it. +- Prefer GitHub Actions for release/Docker proof when the workflow already has the prepared image and secrets. +- Use `scripts/committer "" ` when committing; stage only your files. 
+- If deps are missing, run `pnpm install`, retry once, then report the first actionable error. + +## Local Test Shortcuts + +```bash +pnpm changed:lanes --json +pnpm check:changed # changed typecheck/lint/guards; no Vitest +pnpm test:changed # cheap smart changed Vitest targets +OPENCLAW_TEST_CHANGED_BROAD=1 pnpm test:changed +pnpm test -- --reporter=verbose +OPENCLAW_VITEST_MAX_WORKERS=1 pnpm test +``` + +Use targeted file paths whenever possible. Avoid raw `vitest`; use the repo +`pnpm test` wrapper so project routing, workers, and setup stay correct. + +## Command Semantics + +- `pnpm check` and `pnpm check:changed` do not run Vitest tests. They are for + typecheck, lint, and guard proof. +- `pnpm test` and `pnpm test:changed` run Vitest tests. +- `pnpm test:changed` is intentionally cheap by default: direct test edits, + sibling tests, explicit source mappings, and import-graph dependents. +- `OPENCLAW_TEST_CHANGED_BROAD=1 pnpm test:changed` is the explicit broad + fallback for harness/config/package edits that genuinely need it. +- Do not run extension sweeps just because core changed. If a core edit is for a + specific plugin bug, run that plugin's tests explicitly. If a public SDK or + contract change needs consumer proof, choose the smallest representative + plugin/contract tests first, then broaden only when the risk justifies it. +- The test wrapper prints a short `[test] passed|failed|skipped ... in ...` + line. Vitest's own duration is still the per-shard detail. + +## Routing Model + +- `pnpm changed:lanes --json` answers "which check lanes does this diff touch?" + It is used by `pnpm check:changed` for typecheck/lint/guard selection. +- `pnpm test:changed` answers "which Vitest targets are worth running now?" It + uses the same changed path list, but applies a cheaper test-target resolver. +- Direct test edits run themselves. Source edits prefer explicit mappings, + sibling `*.test.ts`, then import-graph dependents. 
Shared harness/config/root + edits are skipped by default unless they have precise mapped tests. +- Public SDK or contract edits do not automatically run every plugin test. + `check:changed` proves extension type contracts; the agent chooses the + smallest plugin/contract Vitest proof that matches the actual risk. +- Use `OPENCLAW_TEST_CHANGED_BROAD=1 pnpm test:changed` only when a harness, + config, package, or unknown-root edit really needs the broad Vitest fallback. + +## CI Debugging + +Start with current run state, not logs for everything: + +```bash +gh run list --branch main --limit 10 +gh run view --json status,conclusion,headSha,url,jobs +gh run view --job --log +``` + +- Check exact SHA. Ignore newer unrelated `main` unless asked. +- For cancelled same-branch runs, confirm whether a newer run superseded it. +- Fetch full logs only for failed or relevant jobs. + +## GitHub Release Workflows + +Use the smallest workflow that proves the current risk. The full umbrella is +available, but it is usually the last step after narrower proof, not the first +rerun after a focused patch. + +### Full Release Validation + +`Full Release Validation` (`.github/workflows/full-release-validation.yml`) is +the manual "everything before release" umbrella. 
It resolves a target ref, then +dispatches: + +- manual `CI` for the full normal CI graph +- `OpenClaw Release Checks` for install smoke, cross-OS release checks, live and + E2E checks, Docker release-path suites, OpenWebUI, QA Lab, fast Matrix, and + Telegram release lanes +- optional post-publish Telegram E2E when a package spec is supplied + +Run it only when validating an actual release candidate, after broad shared CI +or release orchestration changes, or when explicitly asked: + +```bash +gh workflow run full-release-validation.yml \ + --repo openclaw/openclaw \ + --ref main \ + -f ref= \ + -f workflow_ref=main \ + -f provider=openai \ + -f mode=both +``` + +If a full run is already active on a newer `origin/main`, prefer watching that +run over dispatching a duplicate. If you accidentally dispatch a stale duplicate, +cancel it and monitor the current run. + +### Release Evidence + +After release-candidate validation or before a release decision, record the +important run ids in the private `openclaw/releases-private` evidence ledger. +Use the manual `OpenClaw Release Evidence` +(`openclaw-release-evidence.yml`) workflow there. It writes durable summaries +under `evidence//` and commits: + +- `release-evidence.md` +- `release-evidence.json` +- `index.json` +- `runs/