docs: add Mantis Slack desktop runbook

2026-05-06 05:50:43 +00:00 · 2026-05-05 23:48:48 +01:00
parent 92b04557a6
commit 430814ebc1
2 changed files with 226 additions and 20 deletions
--- a/docs/concepts/mantis-slack-desktop-runbook.md
+++ b/docs/concepts/mantis-slack-desktop-runbook.md
@@ -0,0 +1,202 @@
+---
+summary: "Operator runbook for Mantis Slack desktop QA: GitHub dispatch, local CLI, warm VNC leases, hydrate modes, timing interpretation, artifacts, and failure handling."
+read_when:
+  - Running Mantis Slack desktop QA from GitHub or locally
+  - Debugging slow Mantis Slack desktop runs
+  - Choosing source, prehydrated, or warm-lease mode
+  - Posting screenshot and video evidence to a PR
+title: "Mantis Slack Desktop Runbook"
+---
+
+Mantis Slack desktop QA is the real-UI lane for Slack-class bugs that need a
+Linux desktop, VNC rescue, Slack Web, a real OpenClaw gateway, screenshots,
+videos, and a PR evidence comment.
+
+Use it when unit tests or the headless Slack live lane cannot prove the bug.
+
+## Storage Model
+
+Mantis uses three different storage layers:
+
+- Provider image: owned by Crabbox and stored in the cloud provider account.
+  It contains machine capabilities such as Chrome/Chromium, ffmpeg, scrot,
+  Node/corepack/pnpm, native build tools, and empty cache directories.
+- Warm lease state: owned by the current operator session. It can contain a
+  logged-in browser profile, `/var/cache/crabbox/pnpm`, and a prepared source
+  checkout while the lease is alive.
+- Mantis artifacts: owned by the OpenClaw run. They live under
+  `.artifacts/qa-e2e/mantis/...`, then GitHub Actions uploads them and the
+  Mantis GitHub App comments inline evidence on the PR.
+
+Never put secrets, browser cookies, Slack login state, repository checkouts,
+`node_modules`, or `dist/` into a prebaked provider image.
+
+## GitHub Dispatch
+
+Run the workflow from `main`:
+
+```bash
+gh workflow run mantis-slack-desktop-smoke.yml \
+  --ref main \
+  -f candidate_ref=<trusted-ref-or-sha> \
+  -f pr_number=<pr-number> \
+  -f scenario_id=slack-canary \
+  -f crabbox_provider=aws \
+  -f keep_vm=false \
+  -f hydrate_mode=source
+```
+
+Because the workflow uses live credentials, allowed `candidate_ref` values are
+intentionally narrow: a commit in current `main` ancestry, a release tag, or an
+open PR head from `openclaw/openclaw`.
+
+The workflow writes:
+
+- uploaded artifact: `mantis-slack-desktop-smoke-<run-id>-<attempt>`;
+- inline PR comment from the Mantis GitHub App;
+- `slack-desktop-smoke.png`;
+- `slack-desktop-smoke.mp4`;
+- `slack-desktop-smoke-preview.gif`;
+- `slack-desktop-smoke-change.mp4`;
+- `mantis-slack-desktop-smoke-summary.json`;
+- `mantis-slack-desktop-smoke-report.md`;
+- remote logs such as `slack-desktop-command.log`, `openclaw-gateway.log`,
+  `chrome.log`, and `ffmpeg.log`.
+
+The workflow updates the PR comment in place, locating it by the hidden
+`<!-- mantis-slack-desktop-smoke -->` marker.
+
+## Local CLI
+
+Cold source proof:
+
+```bash
+pnpm openclaw qa mantis slack-desktop-smoke \
+  --provider aws \
+  --class standard \
+  --gateway-setup \
+  --credential-source convex \
+  --credential-role maintainer \
+  --provider-mode live-frontier \
+  --model openai/gpt-5.4 \
+  --alt-model openai/gpt-5.4 \
+  --scenario slack-canary \
+  --hydrate-mode source
+```
+
+Keep the VM for VNC rescue:
+
+```bash
+pnpm openclaw qa mantis slack-desktop-smoke \
+  --provider aws \
+  --class standard \
+  --gateway-setup \
+  --scenario slack-canary \
+  --keep-lease
+```
+
+Open VNC:
+
+```bash
+crabbox vnc --provider aws --id <cbx_id> --open
+```
+
+Reuse a warm lease:
+
+```bash
+pnpm openclaw qa mantis slack-desktop-smoke \
+  --provider aws \
+  --lease-id <cbx_id-or-slug> \
+  --gateway-setup \
+  --scenario slack-canary \
+  --hydrate-mode source
+```
+
+Use `--hydrate-mode prehydrated` only when the reused remote workspace already
+has `node_modules` and a built `dist/`. Mantis fails closed if those are
+missing.
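+
+As a minimal sketch of that precondition, an operator could check the reused
+workspace before picking a mode. The helper names and the workspace path
+argument are hypothetical, and this is not the check Mantis itself runs:
+
+```bash
+# Hypothetical helper: succeeds only when both prehydrated inputs exist.
+workspace_is_prehydrated() {
+  local workspace="$1"
+  [ -d "$workspace/node_modules" ] && [ -d "$workspace/dist" ]
+}
+
+# Print the hydrate mode that is safe for the given workspace.
+choose_hydrate_mode() {
+  if workspace_is_prehydrated "$1"; then
+    echo prehydrated
+  else
+    echo source
+  fi
+}
+```
+
+Falling back to `source` costs an install and build, but it avoids a
+fail-closed run on a lease that was never prepared.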
+
+## Hydrate Modes
+
+| Mode          | Use when                                  | Remote behavior                                                                       | Tradeoff                                                 |
+| ------------- | ----------------------------------------- | ------------------------------------------------------------------------------------- | -------------------------------------------------------- |
+| `source`      | Normal PR proof, cold machines, CI        | Runs `pnpm install --frozen-lockfile --prefer-offline` and `pnpm build` inside the VM | Slowest, strongest source-checkout proof                 |
+| `prehydrated` | You intentionally prepared a reused lease | Requires existing `node_modules` and `dist/`; skips install/build                     | Fast, but only valid for operator-controlled warm leases |
+
+GitHub Actions always prepares the candidate checkout before the VM run. Its
+pnpm store is cached by OS, Node version, and lockfile. The VM source run also
+uses `/var/cache/crabbox/pnpm` when present.
+
+## Timing Interpretation
+
+`mantis-slack-desktop-smoke-report.md` includes phase timings:
+
+- `crabbox.warmup`: cloud provider boot, desktop/browser readiness, and SSH.
+- `crabbox.inspect`: lease metadata lookup.
+- `credentials.prepare`: Convex credential lease acquisition.
+- `crabbox.remote_run`: sync, browser launch, OpenClaw install/build or
+  hydrate validation, gateway startup, screenshot, and video capture.
+- `artifacts.copy`: rsync back from the VM.
+
+`crabbox.remote_run` can be marked `accepted` when Crabbox returns a non-zero
+remote status after Mantis has copied metadata proving that the OpenClaw gateway
+is alive and the setup completed. Treat `accepted` as pass-with-explanation,
+not a failed scenario.
+
+If the run is slow:
+
+- `crabbox.warmup` dominates: prebake or promote a better Crabbox provider image;
+- `crabbox.remote_run` dominates in `source` mode: use a warm lease, improve pnpm
+  store reuse, or move machine prerequisites into the provider image;
+- `crabbox.remote_run` dominates in `prehydrated` mode: the remote workspace was
+  not actually ready, or the gateway/browser/Slack setup is slow;
+- `artifacts.copy` dominates: inspect video size and artifact directory contents.
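+
+To decide which case applies, it can help to rank the phases by duration. The
+sketch below assumes the timings have already been copied by hand into
+`<phase> <seconds>` lines. That input format and the `dominant_phase` helper
+are hypothetical, not something the report emits, and the numbers in the usage
+line are illustrative only:
+
+```bash
+# Hypothetical helper: read "<phase> <seconds>" lines on stdin and print
+# the name of the phase with the largest duration.
+dominant_phase() {
+  awk '$2 + 0 > max { max = $2 + 0; name = $1 }
+       END { if (name != "") print name }'
+}
+
+# Usage with made-up timings:
+printf '%s\n' \
+  'crabbox.warmup 95' \
+  'crabbox.remote_run 410' \
+  'artifacts.copy 12' | dominant_phase
+```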
+
+## Evidence Checklist
+
+A good PR comment should show:
+
+- scenario id and candidate SHA;
+- GitHub Actions run URL;
+- artifact URL;
+- inline screenshot;
+- inline animated preview when available;
+- full MP4 and trimmed MP4 links;
+- pass/fail status;
+- timing summary in the attached report.
+
+Do not commit screenshots or videos into the repository. Keep them in GitHub
+Actions artifacts or the PR comment.
+
+## Failure Handling
+
+If the workflow fails before the VM run, inspect the Actions job first. Typical
+causes are untrusted `candidate_ref`, missing environment secrets, or candidate
+install/build failure.
+
+If the VM run fails but artifacts were copied back, inspect:
+
+```bash
+cat mantis-slack-desktop-smoke-report.md
+cat mantis-slack-desktop-smoke-summary.json
+cat slack-desktop-command.log
+cat openclaw-gateway.log
+cat chrome.log
+cat ffmpeg.log
+```
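+
+Before reading the logs, it can be worth confirming which of those files
+actually made it back from the VM. This is a minimal sketch: the
+`missing_artifacts` helper and its directory argument are hypothetical, and the
+file names are the ones listed above:
+
+```bash
+# Hypothetical helper: print each expected evidence file that is absent
+# from the given artifact directory.
+missing_artifacts() {
+  local dir="$1" name
+  for name in \
+    mantis-slack-desktop-smoke-report.md \
+    mantis-slack-desktop-smoke-summary.json \
+    slack-desktop-command.log \
+    openclaw-gateway.log \
+    chrome.log \
+    ffmpeg.log; do
+    [ -f "$dir/$name" ] || echo "$name"
+  done
+}
+```
+
+A missing `openclaw-gateway.log` or `chrome.log` suggests the run stopped before
+that component started, which narrows where to look next.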
+
+If the run kept the lease, open VNC with the report's `crabbox vnc ...` command.
+Stop the lease when done:
+
+```bash
+crabbox stop --provider aws <cbx_id-or-slug>
+```
+
+If Slack login expired, repair it in VNC on a kept lease and rerun with
+`--lease-id`. Do not bake that browser profile into a provider image.
+
+Related docs:
+
+- [QA overview](qa-e2e-automation.md)
+- [Slack channel](../channels/slack.md)
+- [Testing](../help/testing.md)
--- a/docs/concepts/qa-e2e-automation.md
+++ b/docs/concepts/qa-e2e-automation.md
@@ -29,26 +29,26 @@ Current pieces:
 Every QA flow runs under `pnpm openclaw qa <subcommand>`. Many have `pnpm qa:*`
 script aliases; both forms are supported.

-| Command                                             | Purpose                                                                                                                                                                                      |
-| --------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `qa run`                                            | Bundled QA self-check; writes a Markdown report.                                                                                                                                             |
-| `qa suite`                                          | Run repo-backed scenarios against the QA gateway lane. Aliases: `pnpm openclaw qa suite --runner multipass` for a disposable Linux VM.                                                       |
-| `qa coverage`                                       | Print the markdown scenario-coverage inventory (`--json` for machine output).                                                                                                                |
-| `qa parity-report`                                  | Compare two `qa-suite-summary.json` files and write the agentic parity report.                                                                                                               |
-| `qa character-eval`                                 | Run the character QA scenario across multiple live models with a judged report. See [Reporting](#reporting).                                                                                 |
-| `qa manual`                                         | Run a one-off prompt against the selected provider/model lane.                                                                                                                               |
-| `qa ui`                                             | Start the QA debugger UI and local QA bus (alias: `pnpm qa:lab:ui`).                                                                                                                         |
-| `qa docker-build-image`                             | Build the prebaked QA Docker image.                                                                                                                                                          |
-| `qa docker-scaffold`                                | Write a docker-compose scaffold for the QA dashboard + gateway lane.                                                                                                                         |
-| `qa up`                                             | Build the QA site, start the Docker-backed stack, print the URL (alias: `pnpm qa:lab:up`; `:fast` variant adds `--use-prebuilt-image --bind-ui-dist --skip-ui-build`).                       |
-| `qa aimock`                                         | Start only the AIMock provider server.                                                                                                                                                       |
-| `qa mock-openai`                                    | Start only the scenario-aware `mock-openai` provider server.                                                                                                                                 |
-| `qa credentials doctor` / `add` / `list` / `remove` | Manage the shared Convex credential pool.                                                                                                                                                    |
-| `qa matrix`                                         | Live transport lane against a disposable Tuwunel homeserver. See [Matrix QA](/concepts/qa-matrix).                                                                                           |
-| `qa telegram`                                       | Live transport lane against a real private Telegram group.                                                                                                                                   |
-| `qa discord`                                        | Live transport lane against a real private Discord guild channel.                                                                                                                            |
-| `qa slack`                                          | Live transport lane against a real private Slack channel.                                                                                                                                    |
-| `qa mantis`                                         | Before and after verification runner for live transport bugs, with Discord status-reactions evidence, Crabbox desktop/browser smoke, and Slack-in-VNC smoke. See [Mantis](/concepts/mantis). |
+| Command                                             | Purpose                                                                                                                                                                                                                                                                 |
+| --------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `qa run`                                            | Bundled QA self-check; writes a Markdown report.                                                                                                                                                                                                                        |
+| `qa suite`                                          | Run repo-backed scenarios against the QA gateway lane. Aliases: `pnpm openclaw qa suite --runner multipass` for a disposable Linux VM.                                                                                                                                  |
+| `qa coverage`                                       | Print the markdown scenario-coverage inventory (`--json` for machine output).                                                                                                                                                                                           |
+| `qa parity-report`                                  | Compare two `qa-suite-summary.json` files and write the agentic parity report.                                                                                                                                                                                          |
+| `qa character-eval`                                 | Run the character QA scenario across multiple live models with a judged report. See [Reporting](#reporting).                                                                                                                                                            |
+| `qa manual`                                         | Run a one-off prompt against the selected provider/model lane.                                                                                                                                                                                                          |
+| `qa ui`                                             | Start the QA debugger UI and local QA bus (alias: `pnpm qa:lab:ui`).                                                                                                                                                                                                    |
+| `qa docker-build-image`                             | Build the prebaked QA Docker image.                                                                                                                                                                                                                                     |
+| `qa docker-scaffold`                                | Write a docker-compose scaffold for the QA dashboard + gateway lane.                                                                                                                                                                                                    |
+| `qa up`                                             | Build the QA site, start the Docker-backed stack, print the URL (alias: `pnpm qa:lab:up`; `:fast` variant adds `--use-prebuilt-image --bind-ui-dist --skip-ui-build`).                                                                                                  |
+| `qa aimock`                                         | Start only the AIMock provider server.                                                                                                                                                                                                                                  |
+| `qa mock-openai`                                    | Start only the scenario-aware `mock-openai` provider server.                                                                                                                                                                                                            |
+| `qa credentials doctor` / `add` / `list` / `remove` | Manage the shared Convex credential pool.                                                                                                                                                                                                                               |
+| `qa matrix`                                         | Live transport lane against a disposable Tuwunel homeserver. See [Matrix QA](/concepts/qa-matrix).                                                                                                                                                                      |
+| `qa telegram`                                       | Live transport lane against a real private Telegram group.                                                                                                                                                                                                              |
+| `qa discord`                                        | Live transport lane against a real private Discord guild channel.                                                                                                                                                                                                       |
+| `qa slack`                                          | Live transport lane against a real private Slack channel.                                                                                                                                                                                                               |
+| `qa mantis`                                         | Before and after verification runner for live transport bugs, with Discord status-reactions evidence, Crabbox desktop/browser smoke, and Slack-in-VNC smoke. See [Mantis](/concepts/mantis) and [Mantis Slack Desktop Runbook](/concepts/mantis-slack-desktop-runbook). |

 ## Operator flow

@@ -149,6 +149,10 @@ With `--gateway-setup`, Mantis leaves a persistent OpenClaw Slack gateway
 running inside the VM on port `38973`; without it, the command runs the normal
 bot-to-bot Slack QA lane and exits after artifact capture.

+The operator checklist, GitHub workflow dispatch command, evidence-comment
+contract, hydrate-mode decision table, timing interpretation, and failure
+handling steps live in [Mantis Slack Desktop Runbook](/concepts/mantis-slack-desktop-runbook).
+
 For an agent/CV style desktop task, run:

 ```bash