mirror of https://github.com/openclaw/openclaw.git (synced 2026-05-06 11:30:43 +00:00)
feat(qa): add Mantis desktop browser smoke
@@ -89,6 +89,27 @@ directory, installs dependencies, builds each ref, runs the scenario with
 and `mantis-report.md`. For the first Discord scenario, a successful verification
 means baseline status is `fail` and candidate status is `pass`.

+The first VM/browser primitive is the desktop smoke:
+
+```bash
+pnpm openclaw qa mantis desktop-browser-smoke \
+  --output-dir .artifacts/qa-e2e/mantis/desktop-browser
+```
+
+It leases or reuses a Crabbox desktop machine, starts a visible browser inside the
+VNC session, captures the desktop, pulls artifacts back to the local output
+directory, and writes the reconnect command into the report. The command defaults
+to the Hetzner provider because it is the first provider with working desktop/VNC
+coverage in the Mantis lane. Override it with `--provider`, `--crabbox-bin`, or
+`OPENCLAW_MANTIS_CRABBOX_PROVIDER` when running against another Crabbox fleet.
+
+Useful desktop smoke flags:
+
+- `--lease-id <cbx_...>` or `OPENCLAW_MANTIS_CRABBOX_LEASE_ID` reuses a warmed desktop.
+- `--browser-url <url>` changes the page opened in the visible browser.
+- `--keep-lease` or `OPENCLAW_MANTIS_KEEP_VM=1` keeps a newly created passing lease open for VNC inspection. Failed runs keep the lease by default when one was created so an operator can reconnect.
+- `--class`, `--idle-timeout`, and `--ttl` tune machine size and lease lifetime.
+
 The GitHub smoke workflow is `Mantis Discord Smoke`. The before and after GitHub
 workflow for the first real scenario is `Mantis Discord Status Reactions`. It
 accepts:
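
A note on the verification rule quoted in this hunk's context lines (baseline `fail`, candidate `pass`): it can be sketched as a small shell check. This is only an illustration. The function name `mantis_verified` is hypothetical and not part of the OpenClaw CLI, and the sketch assumes the two statuses have already been read from the report as plain strings.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of OpenClaw: decides whether a Mantis
# before/after run counts as a successful verification.
# Rule from the doc text: baseline must be `fail`, candidate must be `pass`.
mantis_verified() {
  local baseline="$1" candidate="$2"
  [ "$baseline" = "fail" ] && [ "$candidate" = "pass" ]
}

# Example call: statuses are passed in as plain strings (assumed to have
# been extracted from the report by an earlier step).
if mantis_verified "fail" "pass"; then
  echo "verified"
else
  echo "not verified"
fi
```

Any other combination, such as both refs passing or both failing, is treated as not verified, because the run then shows neither the bug on the baseline nor the fix on the candidate.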
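
The desktop smoke flags listed in this hunk can be combined in one invocation. The sketch below only assembles and prints the command line instead of running it, so it needs no Crabbox lease. The helper name `smoke_cmd`, its parameter order, and the sample values (`cbx_example`, `https://example.com`) are placeholders; only the subcommand and the flags themselves come from the doc text.

```bash
#!/usr/bin/env bash
# Hypothetical dry-run helper: builds the desktop smoke command from optional
# settings and prints it. Only flags named in the doc text are used.
smoke_cmd() {
  local lease_id="$1" browser_url="$2" keep_lease="$3"
  local cmd=(pnpm openclaw qa mantis desktop-browser-smoke
    --output-dir .artifacts/qa-e2e/mantis/desktop-browser)
  # Reuse a warmed desktop when a lease id is given.
  if [ -n "$lease_id" ]; then cmd+=(--lease-id "$lease_id"); fi
  # Open a different page in the visible browser.
  if [ -n "$browser_url" ]; then cmd+=(--browser-url "$browser_url"); fi
  # Keep a newly created passing lease open for VNC inspection.
  if [ "$keep_lease" = "1" ]; then cmd+=(--keep-lease); fi
  printf '%s\n' "${cmd[*]}"
}

# Placeholder values, for illustration only.
smoke_cmd "cbx_example" "https://example.com" "1"
```

According to the flag list, the lease id and keep-lease settings can also come from `OPENCLAW_MANTIS_CRABBOX_LEASE_ID` and `OPENCLAW_MANTIS_KEEP_VM=1` instead of the flags.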
@@ -132,18 +153,19 @@ ClawSweeper review findings.

 1. Acquire credentials.
 2. Allocate or reuse a VM.
-3. Prepare a clean checkout for the baseline ref.
-4. Install dependencies and build only what the scenario needs.
-5. Start a child OpenClaw Gateway with an isolated state directory.
-6. Configure the live transport, provider, model, and browser profile.
-7. Run the scenario and capture baseline evidence.
-8. Stop the gateway and preserve logs.
-9. Prepare the candidate ref in the same VM.
-10. Run the same scenario and capture candidate evidence.
-11. Compare the oracle results and visual evidence.
-12. Write Markdown, JSON, logs, screenshots, and optional trace artifacts.
-13. Upload GitHub Actions artifacts.
-14. Post a concise PR or Discord status message.
+3. Prepare the desktop/browser profile when the scenario needs UI evidence.
+4. Prepare a clean checkout for the baseline ref.
+5. Install dependencies and build only what the scenario needs.
+6. Start a child OpenClaw Gateway with an isolated state directory.
+7. Configure the live transport, provider, model, and browser profile.
+8. Run the scenario and capture baseline evidence.
+9. Stop the gateway and preserve logs.
+10. Prepare the candidate ref in the same VM.
+11. Run the same scenario and capture candidate evidence.
+12. Compare the oracle results and visual evidence.
+13. Write Markdown, JSON, logs, screenshots, and optional trace artifacts.
+14. Upload GitHub Actions artifacts.
+15. Post a concise PR or Discord status message.

 The scenario should be able to fail in two different ways:
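
The revised step order in the numbered list of this hunk can be read as a pipeline skeleton. The sketch below is purely illustrative: `step` and `mantis_flow` are hypothetical stubs that only print each step, the `NEEDS_UI` switch is an assumed stand-in for "when the scenario needs UI evidence", and nothing here calls the real Mantis runner.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Assumed switch for "the scenario needs UI evidence"; not a real Mantis setting.
NEEDS_UI="${NEEDS_UI:-1}"

# Stub: print a step name instead of doing real work.
step() { printf '%s\n' "$1"; }

mantis_flow() {
  step "acquire credentials"
  step "allocate or reuse VM"
  # New step 3 in this hunk: desktop/browser prep comes before any checkout.
  if [ "$NEEDS_UI" = "1" ]; then step "prepare desktop/browser profile"; fi
  step "prepare baseline checkout"
  step "install dependencies and build"
  step "start child gateway with isolated state"
  step "configure transport, provider, model, browser profile"
  step "run scenario, capture baseline evidence"
  step "stop gateway, preserve logs"
  step "prepare candidate ref in same VM"
  step "run scenario, capture candidate evidence"
  step "compare oracle results and visual evidence"
  step "write report artifacts"
  step "upload GitHub Actions artifacts"
  step "post PR or Discord status message"
}

mantis_flow
```

The only structural change from the old list is the conditional third step, which is why the stub guards it with a flag while every other step runs unconditionally.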
@@ -29,26 +29,26 @@ Current pieces:
 Every QA flow runs under `pnpm openclaw qa <subcommand>`. Many have `pnpm qa:*`
 script aliases; both forms are supported.

-| Command | Purpose |
-| --------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `qa run` | Bundled QA self-check; writes a Markdown report. |
-| `qa suite` | Run repo-backed scenarios against the QA gateway lane. Aliases: `pnpm openclaw qa suite --runner multipass` for a disposable Linux VM. |
-| `qa coverage` | Print the markdown scenario-coverage inventory (`--json` for machine output). |
-| `qa parity-report` | Compare two `qa-suite-summary.json` files and write the agentic parity report. |
-| `qa character-eval` | Run the character QA scenario across multiple live models with a judged report. See [Reporting](#reporting). |
-| `qa manual` | Run a one-off prompt against the selected provider/model lane. |
-| `qa ui` | Start the QA debugger UI and local QA bus (alias: `pnpm qa:lab:ui`). |
-| `qa docker-build-image` | Build the prebaked QA Docker image. |
-| `qa docker-scaffold` | Write a docker-compose scaffold for the QA dashboard + gateway lane. |
-| `qa up` | Build the QA site, start the Docker-backed stack, print the URL (alias: `pnpm qa:lab:up`; `:fast` variant adds `--use-prebuilt-image --bind-ui-dist --skip-ui-build`). |
-| `qa aimock` | Start only the AIMock provider server. |
-| `qa mock-openai` | Start only the scenario-aware `mock-openai` provider server. |
-| `qa credentials doctor` / `add` / `list` / `remove` | Manage the shared Convex credential pool. |
-| `qa matrix` | Live transport lane against a disposable Tuwunel homeserver. See [Matrix QA](/concepts/qa-matrix). |
-| `qa telegram` | Live transport lane against a real private Telegram group. |
-| `qa discord` | Live transport lane against a real private Discord guild channel. |
-| `qa slack` | Live transport lane against a real private Slack channel. |
-| `qa mantis` | Before and after verification runner for live transport bugs, with the first Discord status-reactions scenario. See [Mantis](/concepts/mantis). |
+| Command | Purpose |
+| --------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `qa run` | Bundled QA self-check; writes a Markdown report. |
+| `qa suite` | Run repo-backed scenarios against the QA gateway lane. Aliases: `pnpm openclaw qa suite --runner multipass` for a disposable Linux VM. |
+| `qa coverage` | Print the markdown scenario-coverage inventory (`--json` for machine output). |
+| `qa parity-report` | Compare two `qa-suite-summary.json` files and write the agentic parity report. |
+| `qa character-eval` | Run the character QA scenario across multiple live models with a judged report. See [Reporting](#reporting). |
+| `qa manual` | Run a one-off prompt against the selected provider/model lane. |
+| `qa ui` | Start the QA debugger UI and local QA bus (alias: `pnpm qa:lab:ui`). |
+| `qa docker-build-image` | Build the prebaked QA Docker image. |
+| `qa docker-scaffold` | Write a docker-compose scaffold for the QA dashboard + gateway lane. |
+| `qa up` | Build the QA site, start the Docker-backed stack, print the URL (alias: `pnpm qa:lab:up`; `:fast` variant adds `--use-prebuilt-image --bind-ui-dist --skip-ui-build`). |
+| `qa aimock` | Start only the AIMock provider server. |
+| `qa mock-openai` | Start only the scenario-aware `mock-openai` provider server. |
+| `qa credentials doctor` / `add` / `list` / `remove` | Manage the shared Convex credential pool. |
+| `qa matrix` | Live transport lane against a disposable Tuwunel homeserver. See [Matrix QA](/concepts/qa-matrix). |
+| `qa telegram` | Live transport lane against a real private Telegram group. |
+| `qa discord` | Live transport lane against a real private Discord guild channel. |
+| `qa slack` | Live transport lane against a real private Slack channel. |
+| `qa mantis` | Before and after verification runner for live transport bugs, with Discord status-reactions evidence and a Crabbox desktop/browser smoke. See [Mantis](/concepts/mantis). |

 ## Operator flow