mirror of
https://github.com/openclaw/openclaw.git
synced 2026-05-06 15:10:52 +00:00
feat(security): add GHSA detector-review pipeline and OpenGrep CI workflows (#69483)
* feat(security): add GHSA detector-review pipeline and OpenGrep CI workflows [AI-assisted]

Stand up an end-to-end pipeline that turns every published openclaw GitHub
Security Advisory into a reusable OpenGrep rule, and wire the compiled rules
into manual-dispatch GitHub Actions workflows that publish SARIF to GitHub
Code Scanning.

The pipeline is harness-agnostic: any coding-agent CLI (Rovo Dev, Claude
Code, Codex, OpenCode, or anything you can shell out to) can drive it via
the runner script's --harness flag. Built-in adapters cover the four common
harnesses; --harness-cmd '<template>' supports anything else with shell-style
{prompt}/{model}/{output_file} substitution.

Pipeline pieces:
- scripts/run-ghsa-detector-review-batch.mjs runs your chosen coding harness
  in parallel against every advisory using the agent-agnostic detector-review
  spec at security/detector-review/detector-review-spec.md. Each case
  produces an opengrep general-rule.yml (precise) and broad-rule.yml
  (review-aid), plus a coverage-validated report against the vulnerable
  commit's changed files.
- scripts/compile-opengrep-rules.mjs walks a run directory, rewrites each
  rule's id to ghsa-detector.<ghsa>.<orig-id>, injects ghsa/advisory-url/
  detector-bucket/source-rule-id metadata, and uses opengrep itself to drop
  rules with InvalidRuleSchemaError so the published super-configs load
  cleanly.

Compiled outputs:
- security/opengrep/precise.yml (336 rules)
- security/opengrep/broad.yml (459 rules)
- security/opengrep/compile-manifest.json (per-rule provenance map)

CI workflows (manual workflow_dispatch only):
- .github/workflows/opengrep-precise.yml
- .github/workflows/opengrep-broad.yml

Both install a pinned opengrep, run opengrep scan against src/, upload SARIF
to Code Scanning under categories opengrep-precise / opengrep-broad, and use
continue-on-error: true so findings never block the workflow.

Detector-review spec and assets:
- security/detector-review/detector-review-spec.md: the agent-agnostic spec
  the runner injects into each per-case prompt
- security/detector-review/references/{detector-rubric,report-template}.md
- security/detector-review/scripts/init_case.py
- security/prompt-suffix-coverage-first.md: mandatory prompt addendum that
  enforces coverage-first validation (rule must catch the OG vuln, not just
  pass synthetic fixtures)

Docs:
- security/README.md: end-to-end flow, supported harnesses, regen recipe
- security/opengrep/README.md: compiled-config details + recompile recipe

* security: tighten GHSA OpenGrep detector workflow
* chore: refine precise opengrep workflow
* chore: remove stale opengrep metadata
* fix: harden GHSA OpenGrep workflow
* ci: split OpenGrep diff and full scans
* chore: remove performance-only opengrep rule
* ci: use OpenGrep installer path
* chore: enforce opengrep rule metadata provenance
* chore: generalize opengrep rule compilation
* docs: align opengrep rulepack guidance
* chore: support generic opengrep rule sources
* fix: validate opengrep rulepack-only changes
---------
Co-authored-by: Jesse Merhi <security-engineering@atlassian.com>
103
security/opengrep/README.md
Normal file
@@ -0,0 +1,103 @@
# Compiled OpenGrep super-configs

`precise.yml` is OpenClaw's shipped precise OpenGrep rulepack. Each rule is tied
to a source advisory, vulnerability report, or review identifier through metadata
and is intended to have concrete coverage of the original vulnerable behavior or
a verified variant.

Rule provenance lives in each compiled rule's metadata; no separate manifest is
committed or generated by default.

Noisy exploratory rules are intentionally kept out of the tracked repo. Anything
appended to `precise.yml` must be low-noise enough to run as a blocking PR-diff
check and as a manual full-repository audit.

## Editing rules

`precise.yml` is the checked-in compiled rulepack. Prefer changing source rule
YAML and rerunning `security/opengrep/compile-rules.mjs` instead of hand-editing
compiled rules. The compiler appends new rule IDs by default; use
`--replace-precise` only when intentionally rebuilding the rulepack from a
complete source folder. Direct edits are discouraged because they can bypass ID,
metadata, duplicate, and OpenGrep validation.

## Rule naming and metadata

Every rule's id is rewritten to `<source-id>.<original-id>`. Every rule's
`metadata` block is augmented with source fields enforced by
`pnpm check:opengrep-rule-metadata`:

| Key               | Value                                                                 |
| ----------------- | --------------------------------------------------------------------- |
| `ghsa`            | `GHSA-xxxx-xxxx-xxxx` for GHSA-backed rules                           |
| `advisory-id`     | non-GHSA source identifier, or the GHSA ID normalized by the compiler |
| `advisory-url`    | durable URL to the advisory, report, review record, or source context |
| `detector-bucket` | `precise`                                                             |
| `source-rule-id`  | the original source rule id                                           |
| `source-file`     | optional source YAML file used during compilation                     |

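To make the rewrite concrete, here is a minimal sketch of how one already-parsed source rule could be turned into a compiled rule. The helper name `compileRule`, its signature, the example rule, and the fallback advisory URL for GHSA-backed rules are assumptions for illustration; the real `compile-rules.mjs` may normalize ids and derive URLs differently.

```javascript
// Sketch only: metadata keys follow the table above, but the helper name,
// signature, and URL fallback are assumptions, not the real compiler's code.
function compileRule(sourceRule, sourceFile) {
  const meta = sourceRule.metadata ?? {};
  const ghsa = meta.ghsa;
  const sourceId = ghsa ?? meta["advisory-id"];
  if (!sourceId) {
    throw new Error(`rule "${sourceRule.id}" needs metadata.ghsa or metadata.advisory-id`);
  }

  // Assumption: GHSA-backed rules may fall back to the repository advisory URL.
  const advisoryUrl =
    meta["advisory-url"] ??
    (ghsa ? `https://github.com/openclaw/openclaw/security/advisories/${ghsa}` : undefined);
  if (!advisoryUrl) {
    throw new Error(`rule "${sourceRule.id}" needs metadata.advisory-url`);
  }

  const metadata = {
    ...meta,
    "advisory-id": meta["advisory-id"] ?? sourceId,
    "advisory-url": advisoryUrl,
    "detector-bucket": "precise",
    "source-rule-id": sourceRule.id, // keep the original id for provenance
  };
  if (sourceFile) metadata["source-file"] = sourceFile;

  // Compiled id is <source-id>.<original-id>.
  return { ...sourceRule, id: `${sourceId}.${sourceRule.id}`, metadata };
}

// Hypothetical source rule, used only to show the shape of the output.
const compiled = compileRule(
  { id: "unsafe-path-join", metadata: { ghsa: "GHSA-xxxx-xxxx-xxxx" } },
  "rules/example.yml",
);
console.log(compiled.id); // GHSA-xxxx-xxxx-xxxx.unsafe-path-join
```

Keeping the original id in `source-rule-id` means the compiled id can always be traced back to the source rule even after the prefix is added.
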
## Recompiling

```bash
# from the openclaw repo root
node security/opengrep/compile-rules.mjs \
  --rules-dir <folder-with-source-rule-yaml>
```

The script:

1. Recursively walks every `.yml` / `.yaml` file under `--rules-dir`
2. Reads top-level `rules` arrays from those source files
3. Requires each source rule to provide `metadata.ghsa` or `metadata.advisory-id`
4. Requires `metadata.advisory-url` for non-GHSA source identifiers
5. Rewrites ids and injects metadata as above
6. Appends only new precise rule ids to the existing `precise.yml` by default; pass `--replace-precise` to rebuild it from just the supplied source folder
7. Runs `opengrep scan --no-strict` against an empty target to identify schema-invalid or parser-invalid rules and drops mapped bad rules so the published super-config loads cleanly
8. Writes `precise.yml`

Skipped, duplicate, or invalid rules are summarized on stdout/stderr for follow-up.

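Steps 3, 4, and 6 can be sketched roughly as follows, operating on rule objects that have already been parsed from YAML. The function names `checkProvenance` and `mergePrecise`, the `replacePrecise` option, and the example ids are hypothetical; only the behavior (provenance requirements, append-only by default, full rebuild on request) comes from the list above.

```javascript
// Sketch of steps 3, 4, and 6. Names are illustrative, not taken from
// compile-rules.mjs.

// Returns a reason string when a source rule lacks provenance, else null.
function checkProvenance(rule) {
  const meta = rule.metadata ?? {};
  if (!meta.ghsa && !meta["advisory-id"]) {
    return "missing metadata.ghsa or metadata.advisory-id";
  }
  if (!meta.ghsa && !meta["advisory-url"]) {
    return "non-GHSA source needs metadata.advisory-url";
  }
  return null;
}

// Appends only rules whose ids are not already present. With
// replacePrecise, the existing rulepack is ignored and rebuilt.
function mergePrecise(existingRules, compiledRules, { replacePrecise = false } = {}) {
  const base = replacePrecise ? [] : existingRules;
  const seen = new Set(base.map((rule) => rule.id));
  const rules = [...base];
  const duplicates = [];
  for (const rule of compiledRules) {
    if (seen.has(rule.id)) {
      duplicates.push(rule.id); // reported for follow-up, not overwritten
      continue;
    }
    seen.add(rule.id);
    rules.push(rule);
  }
  return { rules, duplicates };
}

// Hypothetical ids, used only to show the two modes.
const existing = [
  { id: "GHSA-aaaa-aaaa-aaaa.rule-zero" },
  { id: "GHSA-aaaa-aaaa-aaaa.rule-one" },
];
const incoming = [
  { id: "GHSA-aaaa-aaaa-aaaa.rule-one" },
  { id: "GHSA-bbbb-bbbb-bbbb.rule-two" },
];
const appended = mergePrecise(existing, incoming);
const rebuilt = mergePrecise(existing, incoming, { replacePrecise: true });
console.log(appended.rules.length, appended.duplicates.length, rebuilt.rules.length); // 3 1 2
```

In this sketch an existing rule always wins over an incoming rule with the same id, which is one way to read "appends only new precise rule ids"; updating an existing rule would then require a rebuild.
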
## Validating locally

```bash
pnpm check:opengrep-rule-metadata
opengrep validate security/opengrep/precise.yml
```

The metadata check must pass before rules are committed. OpenGrep validation must
exit zero. Warnings about unknown fields are acceptable only when OpenGrep still
reports `Configuration is valid` and a non-zero rule count. The compile script
drops mapped schema/parser-invalid rules and fails closed when OpenGrep
validation itself cannot be completed.

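As a rough illustration of what a metadata check over the compiled rulepack might verify, the sketch below inspects already-parsed rule objects against the keys documented for compiled rules. The function `findMetadataProblems` and its exact checks are assumptions; the script behind `pnpm check:opengrep-rule-metadata` is not shown here and may be stricter or differ in detail (for example, in how GHSA ids are normalized).

```javascript
// Hypothetical stand-in for the metadata check, not the real script.
// Input: the `rules` array from a parsed precise.yml (YAML parsing omitted).
function findMetadataProblems(rules) {
  const problems = [];
  for (const rule of rules) {
    const meta = rule.metadata ?? {};
    const sourceId = meta.ghsa ?? meta["advisory-id"];
    const originalId = meta["source-rule-id"];

    if (!sourceId) problems.push(`${rule.id}: missing ghsa or advisory-id`);
    if (!meta["advisory-url"]) problems.push(`${rule.id}: missing advisory-url`);
    if (meta["detector-bucket"] !== "precise") {
      problems.push(`${rule.id}: detector-bucket must be "precise"`);
    }
    if (!originalId) {
      problems.push(`${rule.id}: missing source-rule-id`);
    } else if (sourceId && rule.id !== `${sourceId}.${originalId}`) {
      // Assumes the id prefix is the source id verbatim.
      problems.push(`${rule.id}: id does not match <source-id>.<original-id>`);
    }
  }
  return problems;
}

// Hypothetical rules, one well-formed and one missing provenance.
const sampleRules = [
  {
    id: "GHSA-xxxx-xxxx-xxxx.unsafe-path-join",
    metadata: {
      ghsa: "GHSA-xxxx-xxxx-xxxx",
      "advisory-url": "https://example.com/advisory",
      "detector-bucket": "precise",
      "source-rule-id": "unsafe-path-join",
    },
  },
  { id: "orphan-rule", metadata: {} },
];
const problems = findMetadataProblems(sampleRules);
for (const problem of problems) console.error(problem);
```

A check of this kind would typically exit non-zero when `problems` is non-empty, so that rules without provenance cannot be committed.
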
## Running locally

```bash
scripts/run-opengrep.sh
```

For SARIF output matching the PR workflow's diff-scoped scan:

```bash
scripts/run-opengrep.sh --changed --sarif
```

For SARIF output matching the manual full-repository workflow:

```bash
scripts/run-opengrep.sh --sarif
```

## Why `--no-strict`?

Some generated rules trigger non-fatal opengrep warnings (for example,
unknown-field warnings on compatibility-only keys). `--no-strict` keeps
opengrep's exit code clean for those warnings. Parser-invalid rules are still
dropped during compilation so the checked-in super-config validates before CI
uses it.

## Why `--no-git-ignore`?

Some OpenClaw paths are excluded by `.gitignore` for build reasons even though
they contain meaningful source code we want scanned. `--no-git-ignore` keeps
opengrep from skipping them.

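To show where these two flags sit in a scan invocation, here is a small sketch that only builds an argument list and prints it. The contents of `scripts/run-opengrep.sh` are not shown in this README, so the helper `buildScanArgs`, the default target, and the use of `--config` and `--sarif` (assumed to behave as in Semgrep-compatible CLIs) are assumptions rather than the script's actual behavior.

```javascript
// Sketch only: builds an argv for a scan; it does not run opengrep.
// `--no-strict` and `--no-git-ignore` come from this README; `--config`
// and `--sarif` are assumed from the Semgrep-compatible CLI surface.
function buildScanArgs({
  config = "security/opengrep/precise.yml",
  targets = ["."],
  sarif = false,
} = {}) {
  const args = [
    "scan",
    "--config", config,
    "--no-strict",     // tolerate non-fatal rule warnings
    "--no-git-ignore", // scan paths that .gitignore would hide
  ];
  if (sarif) args.push("--sarif");
  return [...args, ...targets];
}

console.log(["opengrep", ...buildScanArgs({ sarif: true })].join(" "));
```

Passing the resulting array to a process spawner, rather than concatenating a shell string, avoids quoting problems when target paths contain spaces.
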