mirror of
https://github.com/openclaw/openclaw.git
synced 2026-05-06 05:50:43 +00:00
docs: add Mantis Slack desktop runbook
202
docs/concepts/mantis-slack-desktop-runbook.md
Normal file
@@ -0,0 +1,202 @@
---
summary: "Operator runbook for Mantis Slack desktop QA: GitHub dispatch, local CLI, warm VNC leases, hydrate modes, timing interpretation, artifacts, and failure handling."
read_when:
  - Running Mantis Slack desktop QA from GitHub or locally
  - Debugging slow Mantis Slack desktop runs
  - Choosing source, prehydrated, or warm-lease mode
  - Posting screenshot and video evidence to a PR
title: "Mantis Slack Desktop Runbook"
---

Mantis Slack desktop QA is the real-UI lane for Slack-class bugs that need a
Linux desktop, VNC rescue, Slack Web, a real OpenClaw gateway, screenshots,
videos, and a PR evidence comment.

Use it when unit tests or the headless Slack live lane cannot prove the bug.

## Storage Model

Mantis uses three different storage layers:

- Provider image: owned by Crabbox and stored in the cloud provider account.
  It contains machine capabilities such as Chrome/Chromium, ffmpeg, scrot,
  Node/corepack/pnpm, native build tools, and empty cache directories.
- Warm lease state: owned by the current operator session. It can contain a
  logged-in browser profile, `/var/cache/crabbox/pnpm`, and a prepared source
  checkout while the lease is alive.
- Mantis artifacts: owned by the OpenClaw run. They live under
  `.artifacts/qa-e2e/mantis/...`, then GitHub Actions uploads them and the
  Mantis GitHub App comments inline evidence on the PR.

Never put secrets, browser cookies, Slack login state, repository checkouts,
`node_modules`, or `dist/` into a prebaked provider image.

## GitHub Dispatch

Run the workflow from `main`:

```bash
gh workflow run mantis-slack-desktop-smoke.yml \
  --ref main \
  -f candidate_ref=<trusted-ref-or-sha> \
  -f pr_number=<pr-number> \
  -f scenario_id=slack-canary \
  -f crabbox_provider=aws \
  -f keep_vm=false \
  -f hydrate_mode=source
```

Allowed `candidate_ref` values are intentionally narrow because the workflow
uses live credentials: current `main` ancestry, release tags, or an open PR head
from `openclaw/openclaw`.

The workflow writes:

- uploaded artifact: `mantis-slack-desktop-smoke-<run-id>-<attempt>`;
- inline PR comment from the Mantis GitHub App;
- `slack-desktop-smoke.png`;
- `slack-desktop-smoke.mp4`;
- `slack-desktop-smoke-preview.gif`;
- `slack-desktop-smoke-change.mp4`;
- `mantis-slack-desktop-smoke-summary.json`;
- `mantis-slack-desktop-smoke-report.md`;
- remote logs such as `slack-desktop-command.log`, `openclaw-gateway.log`,
  `chrome.log`, and `ffmpeg.log`.

The PR comment is updated in place by the hidden
`<!-- mantis-slack-desktop-smoke -->` marker.
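
As a minimal sketch of how an in-place update can find its target, the check below tests whether a comment body contains the hidden marker. The function name `has_mantis_marker` is hypothetical, and the Mantis GitHub App's actual lookup logic is not shown in this runbook.

```bash
# Marker string taken from the runbook text above.
MANTIS_MARKER='<!-- mantis-slack-desktop-smoke -->'

# Hypothetical helper: reads one comment body on stdin and exits 0
# when the hidden marker is present, non-zero otherwise.
has_mantis_marker() {
  # -F matches the marker as a fixed string, -q suppresses output.
  grep -Fq -e "$MANTIS_MARKER"
}
```

Comment bodies could be listed with `gh api repos/<owner>/<repo>/issues/<pr-number>/comments` and passed through this check one at a time. That wiring is an assumption about operator tooling, not a description of the app.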

## Local CLI

Cold source proof:

```bash
pnpm openclaw qa mantis slack-desktop-smoke \
  --provider aws \
  --class standard \
  --gateway-setup \
  --credential-source convex \
  --credential-role maintainer \
  --provider-mode live-frontier \
  --model openai/gpt-5.4 \
  --alt-model openai/gpt-5.4 \
  --scenario slack-canary \
  --hydrate-mode source
```

Keep the VM for VNC rescue:

```bash
pnpm openclaw qa mantis slack-desktop-smoke \
  --provider aws \
  --class standard \
  --gateway-setup \
  --scenario slack-canary \
  --keep-lease
```

Open VNC:

```bash
crabbox vnc --provider aws --id <cbx_id> --open
```

Reuse a warm lease:

```bash
pnpm openclaw qa mantis slack-desktop-smoke \
  --provider aws \
  --lease-id <cbx_id-or-slug> \
  --gateway-setup \
  --scenario slack-canary \
  --hydrate-mode source
```

Use `--hydrate-mode prehydrated` only when the reused remote workspace already
has `node_modules` and a built `dist/`. Mantis fails closed if those are
missing.
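
A preflight along these lines, run inside the reused workspace on the lease, can help decide which mode to pass before Mantis performs its own validation. It is a sketch only: the function name and the workspace path argument are hypothetical, and it checks that the directories exist, not that the build is correct or current.

```bash
# Hypothetical operator-side preflight. Prints the hydrate mode that
# looks safe for the given workspace directory.
pick_hydrate_mode() {
  local workspace="$1"
  # prehydrated needs installed dependencies and a non-empty build output.
  if [ -d "$workspace/node_modules" ] &&
    [ -n "$(ls -A "$workspace/dist" 2>/dev/null)" ]; then
    echo prehydrated
  else
    echo source
  fi
}
```

The result can be passed straight to `--hydrate-mode`. Falling back to `source` whenever anything is missing mirrors the fail-closed behavior described above.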

## Hydrate Modes

| Mode          | Use when                                  | Remote behavior                                                                       | Tradeoff                                                 |
| ------------- | ----------------------------------------- | ------------------------------------------------------------------------------------- | -------------------------------------------------------- |
| `source`      | Normal PR proof, cold machines, CI        | Runs `pnpm install --frozen-lockfile --prefer-offline` and `pnpm build` inside the VM | Slowest, strongest source-checkout proof                 |
| `prehydrated` | You intentionally prepared a reused lease | Requires existing `node_modules` and `dist/`; skips install/build                     | Fast, but only valid for operator-controlled warm leases |

GitHub Actions always prepares the candidate checkout before the VM run. Its
pnpm store is cached by OS, Node version, and lockfile. The VM source run also
uses `/var/cache/crabbox/pnpm` when present.

## Timing Interpretation

`mantis-slack-desktop-smoke-report.md` includes phase timings:

- `crabbox.warmup`: cloud provider boot, desktop/browser readiness, and SSH.
- `crabbox.inspect`: lease metadata lookup.
- `credentials.prepare`: Convex credential lease acquisition.
- `crabbox.remote_run`: sync, browser launch, OpenClaw install/build or
  hydrate validation, gateway startup, screenshot, and video capture.
- `artifacts.copy`: rsync back from the VM.

`crabbox.remote_run` can be marked `accepted` when Crabbox returns a non-zero
remote status after Mantis has copied metadata proving that the OpenClaw gateway
is alive and the setup completed. Treat `accepted` as pass-with-explanation,
not a failed scenario.
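
To see which phase dominates, the timings can be reduced to a single phase name. The sketch below assumes the timings have already been extracted as `<phase> <seconds>` lines; the report's actual layout is not specified here, and both function names are hypothetical. The second helper treats `accepted` like a pass, as described above; the `pass` status string is an assumption.

```bash
# Hypothetical helper: reads "<phase> <seconds>" lines on stdin and
# prints the phase with the largest duration.
dominant_phase() {
  awk '$2 + 0 > max { max = $2 + 0; phase = $1 }
       END { if (phase != "") print phase }'
}

# Hypothetical helper: exit 0 for statuses that should not fail a run.
# Only `accepted` comes from the runbook; `pass` is an assumed name.
phase_status_ok() {
  case "$1" in
    pass | accepted) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, piping `crabbox.warmup 95` and `crabbox.remote_run 310` on separate lines into `dominant_phase` prints `crabbox.remote_run`.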

If the run is slow:

- warmup dominates: prebake or promote a better Crabbox provider image;
- remote_run dominates in `source`: use a warm lease, improve pnpm store reuse,
  or move machine prerequisites into the provider image;
- remote_run dominates in `prehydrated`: the remote workspace was not actually
  ready, or the gateway/browser/Slack setup is slow;
- artifact copy dominates: inspect video size and artifact directory contents.

## Evidence Checklist

A good PR comment should show:

- scenario id and candidate SHA;
- GitHub Actions run URL;
- artifact URL;
- inline screenshot;
- inline animated preview when available;
- full MP4 and trimmed MP4 links;
- pass/fail status;
- timing summary in the attached report.

Do not commit screenshots or videos into the repository. Keep them in GitHub
Actions artifacts or the PR comment.
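
Before posting or reviewing evidence, a downloaded artifact directory can be checked against the file names listed under GitHub Dispatch. This is a sketch: the function name is hypothetical and a flat directory layout is assumed. The preview GIF is only expected when available, so a miss there may be acceptable.

```bash
# Hypothetical helper: prints each expected evidence file that is
# missing or empty in the given directory, one name per line.
missing_mantis_artifacts() {
  local dir="$1" name
  for name in \
    slack-desktop-smoke.png \
    slack-desktop-smoke.mp4 \
    slack-desktop-smoke-preview.gif \
    slack-desktop-smoke-change.mp4 \
    mantis-slack-desktop-smoke-summary.json \
    mantis-slack-desktop-smoke-report.md; do
    # -s is true only for files that exist and are non-empty.
    [ -s "$dir/$name" ] || printf '%s\n' "$name"
  done
}
```

Empty output means every listed file is present and non-empty.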

## Failure Handling

If the workflow fails before the VM run, inspect the Actions job first. Typical
causes are untrusted `candidate_ref`, missing environment secrets, or candidate
install/build failure.

If the VM run fails but screenshots were copied back, inspect:

```bash
cat mantis-slack-desktop-smoke-report.md
cat mantis-slack-desktop-smoke-summary.json
cat slack-desktop-command.log
cat openclaw-gateway.log
cat chrome.log
cat ffmpeg.log
```
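
When the logs are long, a quick scan for suspicious lines can narrow the search before reading them in full. The sketch below is an assumption-laden convenience: the function name is hypothetical, and the search patterns are guesses, not strings these tools are known to emit.

```bash
# Hypothetical helper: greps the remote logs named in this runbook for
# likely failure lines and prints "file:line:text" for each match.
scan_mantis_logs() {
  local dir="$1" log
  for log in slack-desktop-command.log openclaw-gateway.log chrome.log ffmpeg.log; do
    # Skip logs that were not copied back.
    [ -f "$dir/$log" ] || continue
    # -H prints the file name, -n the line number, -i ignores case.
    # The pattern list is an assumption; adjust it to the failure at hand.
    grep -Hni -E 'error|fatal|timed out' "$dir/$log" || true
  done
}
```

Run it against the downloaded artifact directory, then open the matching log at the reported line.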

If the run kept the lease, open VNC with the report's `crabbox vnc ...` command.
Stop the lease when done:

```bash
crabbox stop --provider aws <cbx_id-or-slug>
```

If Slack login expired, repair it in VNC on a kept lease and rerun with
`--lease-id`. Do not bake that browser profile into a provider image.

Related docs:

- [QA overview](qa-e2e-automation.md)
- [Slack channel](../channels/slack.md)
- [Testing](../help/testing.md)
@@ -29,26 +29,26 @@ Current pieces:
Every QA flow runs under `pnpm openclaw qa <subcommand>`. Many have `pnpm qa:*`
script aliases; both forms are supported.

| Command | Purpose |
| --------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `qa run` | Bundled QA self-check; writes a Markdown report. |
| `qa suite` | Run repo-backed scenarios against the QA gateway lane. Aliases: `pnpm openclaw qa suite --runner multipass` for a disposable Linux VM. |
| `qa coverage` | Print the markdown scenario-coverage inventory (`--json` for machine output). |
| `qa parity-report` | Compare two `qa-suite-summary.json` files and write the agentic parity report. |
| `qa character-eval` | Run the character QA scenario across multiple live models with a judged report. See [Reporting](#reporting). |
| `qa manual` | Run a one-off prompt against the selected provider/model lane. |
| `qa ui` | Start the QA debugger UI and local QA bus (alias: `pnpm qa:lab:ui`). |
| `qa docker-build-image` | Build the prebaked QA Docker image. |
| `qa docker-scaffold` | Write a docker-compose scaffold for the QA dashboard + gateway lane. |
| `qa up` | Build the QA site, start the Docker-backed stack, print the URL (alias: `pnpm qa:lab:up`; `:fast` variant adds `--use-prebuilt-image --bind-ui-dist --skip-ui-build`). |
| `qa aimock` | Start only the AIMock provider server. |
| `qa mock-openai` | Start only the scenario-aware `mock-openai` provider server. |
| `qa credentials doctor` / `add` / `list` / `remove` | Manage the shared Convex credential pool. |
| `qa matrix` | Live transport lane against a disposable Tuwunel homeserver. See [Matrix QA](/concepts/qa-matrix). |
| `qa telegram` | Live transport lane against a real private Telegram group. |
| `qa discord` | Live transport lane against a real private Discord guild channel. |
| `qa slack` | Live transport lane against a real private Slack channel. |
| `qa mantis` | Before and after verification runner for live transport bugs, with Discord status-reactions evidence, Crabbox desktop/browser smoke, and Slack-in-VNC smoke. See [Mantis](/concepts/mantis). |
| Command | Purpose |
| --------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `qa run` | Bundled QA self-check; writes a Markdown report. |
| `qa suite` | Run repo-backed scenarios against the QA gateway lane. Aliases: `pnpm openclaw qa suite --runner multipass` for a disposable Linux VM. |
| `qa coverage` | Print the markdown scenario-coverage inventory (`--json` for machine output). |
| `qa parity-report` | Compare two `qa-suite-summary.json` files and write the agentic parity report. |
| `qa character-eval` | Run the character QA scenario across multiple live models with a judged report. See [Reporting](#reporting). |
| `qa manual` | Run a one-off prompt against the selected provider/model lane. |
| `qa ui` | Start the QA debugger UI and local QA bus (alias: `pnpm qa:lab:ui`). |
| `qa docker-build-image` | Build the prebaked QA Docker image. |
| `qa docker-scaffold` | Write a docker-compose scaffold for the QA dashboard + gateway lane. |
| `qa up` | Build the QA site, start the Docker-backed stack, print the URL (alias: `pnpm qa:lab:up`; `:fast` variant adds `--use-prebuilt-image --bind-ui-dist --skip-ui-build`). |
| `qa aimock` | Start only the AIMock provider server. |
| `qa mock-openai` | Start only the scenario-aware `mock-openai` provider server. |
| `qa credentials doctor` / `add` / `list` / `remove` | Manage the shared Convex credential pool. |
| `qa matrix` | Live transport lane against a disposable Tuwunel homeserver. See [Matrix QA](/concepts/qa-matrix). |
| `qa telegram` | Live transport lane against a real private Telegram group. |
| `qa discord` | Live transport lane against a real private Discord guild channel. |
| `qa slack` | Live transport lane against a real private Slack channel. |
| `qa mantis` | Before and after verification runner for live transport bugs, with Discord status-reactions evidence, Crabbox desktop/browser smoke, and Slack-in-VNC smoke. See [Mantis](/concepts/mantis) and [Mantis Slack Desktop Runbook](/concepts/mantis-slack-desktop-runbook). |

## Operator flow

@@ -149,6 +149,10 @@ With `--gateway-setup`, Mantis leaves a persistent OpenClaw Slack gateway
running inside the VM on port `38973`; without it, the command runs the normal
bot-to-bot Slack QA lane and exits after artifact capture.

The operator checklist, GitHub workflow dispatch command, evidence-comment
contract, hydrate-mode decision table, timing interpretation, and failure
handling steps live in [Mantis Slack Desktop Runbook](/concepts/mantis-slack-desktop-runbook).

For an agent/CV style desktop task, run:

```bash