feat: add reusable Mantis evidence publishing

This commit is contained in:
Peter Steinberger
2026-05-05 09:01:43 +01:00
parent 5a0d6c7ad8
commit f3d531439b
8 changed files with 1233 additions and 146 deletions

View File

@@ -176,6 +176,72 @@ Crabbox CLI from
`openclaw/crabbox` main so it can use the current desktop/browser lease flags
before the next Crabbox binary release is cut.
`Mantis Scenario` is the generic manual entrypoint. It takes a `scenario_id`,
`candidate_ref`, optional `baseline_ref`, and optional `pr_number`, then
dispatches the scenario-owned workflow. The wrapper is intentionally thin:
scenario workflows still own their transport setup, credentials, VM class,
expected oracle, and artifact manifest.
`Mantis Slack Desktop Smoke` is the first Slack VM workflow. It checks out the
trusted candidate ref in a separate worktree, leases a Crabbox Linux desktop,
runs `pnpm openclaw qa mantis slack-desktop-smoke --gateway-setup` against that
candidate, opens Slack Web in the VNC browser, records the desktop, generates a
motion-trimmed preview with `crabbox media preview`, uploads the full artifact
directory, and optionally posts the inline evidence comment on the target PR.
Use this lane when you want "a Linux desktop with Slack and a claw running"
instead of only a bot-to-bot Slack transcript.
Every PR-publishing scenario writes `mantis-evidence.json` next to its report.
This schema is the handoff between scenario code and GitHub comments:
```json
{
"schemaVersion": 1,
"id": "discord-status-reactions",
"title": "Mantis Discord Status Reactions QA",
"summary": "Human-readable top summary for the PR comment.",
"scenario": "discord-status-reactions-tool-only",
"comparison": {
"baseline": { "sha": "...", "status": "fail", "expected": "queued-only" },
"candidate": { "sha": "...", "status": "pass", "expected": "queued -> thinking -> done" },
"pass": true
},
"artifacts": [
{
"kind": "timeline",
"lane": "baseline",
"label": "Baseline queued-only",
"path": "baseline/timeline.png",
"targetPath": "baseline.png",
"alt": "Baseline Discord timeline",
"width": 420
}
]
}
```
Artifact `path` values are relative to the manifest directory. `targetPath`
values are relative paths under the `qa-artifacts` branch publish directory.
The publisher rejects path traversal and skips entries marked
`"required": false` when optional previews or videos are unavailable.
Supported artifact kinds:
- `timeline`: deterministic scenario screenshot, usually before/after.
- `desktopScreenshot`: VNC/browser desktop screenshot.
- `motionPreview`: inline animated GIF generated from the desktop recording.
- `motionClip`: motion-trimmed MP4 that removes static lead-in and tail.
- `fullVideo`: full MP4 recording for deep inspection.
- `metadata`: JSON/log sidecar.
- `report`: Markdown report.
The reusable publisher is `scripts/mantis/publish-pr-evidence.mjs`. Workflows
call it with the manifest, target PR, `qa-artifacts` target root, comment marker,
Actions artifact URL, run URL, and request source. It copies declared artifacts
to the `qa-artifacts` branch, builds a summary-first PR comment with inline
images/previews and linked videos, then updates the existing marker comment or
creates one.
You can also trigger the status-reactions run directly from a PR comment:
```text