feat(qa): add mantis Slack desktop smoke

This commit is contained in:
Peter Steinberger
2026-05-04 03:47:19 +01:00
parent 471489159b
commit f632f5e60b
8 changed files with 1215 additions and 20 deletions

View File

@@ -111,6 +111,51 @@ Useful desktop smoke flags:
- `--keep-lease` or `OPENCLAW_MANTIS_KEEP_VM=1` keeps a newly created passing lease open for VNC inspection. Failed runs keep the lease by default when one was created so an operator can reconnect.
- `--class`, `--idle-timeout`, and `--ttl` tune machine size and lease lifetime.
The first full desktop transport primitive is the Slack desktop smoke:
```bash
pnpm openclaw qa mantis slack-desktop-smoke \
--output-dir .artifacts/qa-e2e/mantis/slack-desktop \
--gateway-setup \
--scenario slack-canary \
--keep-lease
```
It leases or reuses a Crabbox desktop machine, syncs the current checkout into
the VM, runs `pnpm openclaw qa slack` inside that VM, opens Slack Web in the VNC
browser, captures the visible desktop, and copies both the Slack QA artifacts and
the VNC screenshot back to the local output directory. This is the first Mantis
shape where the SUT OpenClaw gateway and the browser both live inside the same
Linux desktop VM.
With `--gateway-setup`, the command prepares a persistent disposable OpenClaw
home at `$HOME/.openclaw-mantis/slack-openclaw`, patches Slack Socket Mode
configuration for the selected channel, starts `openclaw gateway run` on port
`38973`, and keeps Chrome running in the VNC session. This is the "leave me a
Linux desktop with Slack and a claw running" mode; the bot-to-bot Slack QA lane
remains the default when `--gateway-setup` is omitted.
Required inputs for `--credential-source env`:
- `OPENCLAW_QA_SLACK_CHANNEL_ID`
- `OPENCLAW_QA_SLACK_DRIVER_BOT_TOKEN`
- `OPENCLAW_QA_SLACK_SUT_BOT_TOKEN`
- `OPENCLAW_QA_SLACK_SUT_APP_TOKEN`
- `OPENCLAW_LIVE_OPENAI_KEY` for the remote model lane. If only
`OPENAI_API_KEY` is set locally, Mantis maps it to `OPENCLAW_LIVE_OPENAI_KEY`
before invoking Crabbox so Crabbox's `OPENCLAW_*` env forwarding can carry it
into the VM.
Useful Slack desktop flags:
- `--lease-id <cbx_...>` reruns against a machine where an operator already logged in to Slack Web through VNC.
- `--gateway-setup` starts a persistent OpenClaw Slack gateway in the VM instead of only running the bot-to-bot QA lane.
- `--slack-url <url>` opens a specific Slack Web URL. Without it, Mantis derives `https://app.slack.com/client/<team>/<channel>` from Slack `auth.test` when the SUT bot token is available.
- `--slack-channel-id <id>` controls the Slack channel allowlist used by gateway setup.
- `OPENCLAW_MANTIS_SLACK_BROWSER_PROFILE_DIR` controls the persistent Chrome profile inside the VM. The default is `$HOME/.config/openclaw-mantis/slack-chrome-profile`, so a manual Slack Web login survives reruns on the same lease.
- `--credential-source convex --credential-role ci` uses the shared credential pool instead of direct Slack env tokens.
- `--provider-mode`, `--model`, `--alt-model`, and `--fast` pass through to the Slack live lane.
The GitHub smoke workflow is `Mantis Discord Smoke`. The before and after GitHub
workflow for the first real scenario is `Mantis Discord Status Reactions`. It
accepts:

View File

@@ -29,26 +29,26 @@ Current pieces:
Every QA flow runs under `pnpm openclaw qa <subcommand>`. Many have `pnpm qa:*`
script aliases; both forms are supported.
| Command | Purpose |
| --------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `qa run` | Bundled QA self-check; writes a Markdown report. |
| `qa suite` | Run repo-backed scenarios against the QA gateway lane. Aliases: `pnpm openclaw qa suite --runner multipass` for a disposable Linux VM. |
| `qa coverage` | Print the markdown scenario-coverage inventory (`--json` for machine output). |
| `qa parity-report` | Compare two `qa-suite-summary.json` files and write the agentic parity report. |
| `qa character-eval` | Run the character QA scenario across multiple live models with a judged report. See [Reporting](#reporting). |
| `qa manual` | Run a one-off prompt against the selected provider/model lane. |
| `qa ui` | Start the QA debugger UI and local QA bus (alias: `pnpm qa:lab:ui`). |
| `qa docker-build-image` | Build the prebaked QA Docker image. |
| `qa docker-scaffold` | Write a docker-compose scaffold for the QA dashboard + gateway lane. |
| `qa up` | Build the QA site, start the Docker-backed stack, print the URL (alias: `pnpm qa:lab:up`; `:fast` variant adds `--use-prebuilt-image --bind-ui-dist --skip-ui-build`). |
| `qa aimock` | Start only the AIMock provider server. |
| `qa mock-openai` | Start only the scenario-aware `mock-openai` provider server. |
| `qa credentials doctor` / `add` / `list` / `remove` | Manage the shared Convex credential pool. |
| `qa matrix` | Live transport lane against a disposable Tuwunel homeserver. See [Matrix QA](/concepts/qa-matrix). |
| `qa telegram` | Live transport lane against a real private Telegram group. |
| `qa discord` | Live transport lane against a real private Discord guild channel. |
| `qa slack` | Live transport lane against a real private Slack channel. |
| `qa mantis` | Before and after verification runner for live transport bugs, with Discord status-reactions evidence and a Crabbox desktop/browser smoke. See [Mantis](/concepts/mantis). |
| Command | Purpose |
| --------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `qa run` | Bundled QA self-check; writes a Markdown report. |
| `qa suite` | Run repo-backed scenarios against the QA gateway lane. Aliases: `pnpm openclaw qa suite --runner multipass` for a disposable Linux VM. |
| `qa coverage` | Print the markdown scenario-coverage inventory (`--json` for machine output). |
| `qa parity-report` | Compare two `qa-suite-summary.json` files and write the agentic parity report. |
| `qa character-eval` | Run the character QA scenario across multiple live models with a judged report. See [Reporting](#reporting). |
| `qa manual` | Run a one-off prompt against the selected provider/model lane. |
| `qa ui` | Start the QA debugger UI and local QA bus (alias: `pnpm qa:lab:ui`). |
| `qa docker-build-image` | Build the prebaked QA Docker image. |
| `qa docker-scaffold` | Write a docker-compose scaffold for the QA dashboard + gateway lane. |
| `qa up` | Build the QA site, start the Docker-backed stack, print the URL (alias: `pnpm qa:lab:up`; `:fast` variant adds `--use-prebuilt-image --bind-ui-dist --skip-ui-build`). |
| `qa aimock` | Start only the AIMock provider server. |
| `qa mock-openai` | Start only the scenario-aware `mock-openai` provider server. |
| `qa credentials doctor` / `add` / `list` / `remove` | Manage the shared Convex credential pool. |
| `qa matrix` | Live transport lane against a disposable Tuwunel homeserver. See [Matrix QA](/concepts/qa-matrix). |
| `qa telegram` | Live transport lane against a real private Telegram group. |
| `qa discord` | Live transport lane against a real private Discord guild channel. |
| `qa slack` | Live transport lane against a real private Slack channel. |
| `qa mantis` | Before and after verification runner for live transport bugs, with Discord status-reactions evidence, Crabbox desktop/browser smoke, and Slack-in-VNC smoke. See [Mantis](/concepts/mantis). |
## Operator flow
@@ -121,6 +121,23 @@ pnpm openclaw qa slack
They target a pre-existing real channel with two bots (driver + SUT). Required env vars, scenario lists, output artifacts, and the Convex credential pool are documented in [Telegram, Discord, and Slack QA reference](#telegram-discord-and-slack-qa-reference) below.
For a full Slack desktop VM run with VNC rescue, run:
```bash
pnpm openclaw qa mantis slack-desktop-smoke \
--gateway-setup \
--scenario slack-canary \
--keep-lease
```
That command leases a Crabbox desktop/browser machine, runs the Slack live lane
inside the VM, opens Slack Web in the VNC browser, captures the desktop, and
copies `slack-qa/` plus `slack-desktop-smoke.png` back to the Mantis artifact
directory. Reuse `--lease-id <cbx_...>` after logging in to Slack Web manually
through VNC. With `--gateway-setup`, Mantis leaves a persistent OpenClaw Slack
gateway running inside the VM on port `38973`; without it, the command runs the
normal bot-to-bot Slack QA lane and exits after artifact capture.
Before using pooled live credentials, run:
```bash