Mirror of https://github.com/openclaw/openclaw.git, synced 2026-05-06 17:00:50 +00:00
feat(security): add GHSA detector-review pipeline and OpenGrep CI workflows (#69483)
* feat(security): add GHSA detector-review pipeline and OpenGrep CI workflows [AI-assisted]
Stand up an end-to-end pipeline that turns every published openclaw GitHub
Security Advisory into a reusable OpenGrep rule, and wire the compiled rules
into manual-dispatch GitHub Actions workflows that publish SARIF to GitHub
Code Scanning.
The pipeline is harness-agnostic: any coding-agent CLI (Rovo Dev, Claude
Code, Codex, OpenCode, or anything you can shell out to) can drive it via
the runner script's --harness flag. Built-in adapters cover the four common
harnesses; --harness-cmd '<template>' supports anything else with shell-style
{prompt}/{model}/{output_file} substitution.
Pipeline pieces:
- scripts/run-ghsa-detector-review-batch.mjs runs your chosen coding harness
in parallel against every advisory using the agent-agnostic detector-review
spec at security/detector-review/detector-review-spec.md. Each case
produces an opengrep general-rule.yml (precise) and broad-rule.yml
(review-aid), plus a coverage-validated report against the vulnerable
commit's changed files.
- scripts/compile-opengrep-rules.mjs walks a run directory, rewrites each
rule's id to ghsa-detector.<ghsa>.<orig-id>, injects ghsa/advisory-url/
detector-bucket/source-rule-id metadata, and uses opengrep itself to drop
rules with InvalidRuleSchemaError so the published super-configs load
cleanly.
Compiled outputs:
- security/opengrep/precise.yml (336 rules)
- security/opengrep/broad.yml (459 rules)
- security/opengrep/compile-manifest.json (per-rule provenance map)
CI workflows (manual workflow_dispatch only):
- .github/workflows/opengrep-precise.yml
- .github/workflows/opengrep-broad.yml
Both install a pinned opengrep, run opengrep scan against src/, upload SARIF
to Code Scanning under categories opengrep-precise / opengrep-broad, and use
continue-on-error: true so findings never block the workflow.
Detector-review spec and assets:
- security/detector-review/detector-review-spec.md: the agent-agnostic spec
the runner injects into each per-case prompt
- security/detector-review/references/{detector-rubric,report-template}.md
- security/detector-review/scripts/init_case.py
- security/prompt-suffix-coverage-first.md: mandatory prompt addendum that
enforces coverage-first validation (rule must catch the original vulnerability,
not just pass synthetic fixtures)
Docs:
- security/README.md: end-to-end flow, supported harnesses, regen recipe
- security/opengrep/README.md: compiled-config details + recompile recipe
* security: tighten GHSA OpenGrep detector workflow
* chore: refine precise opengrep workflow
* chore: remove stale opengrep metadata
* fix: harden GHSA OpenGrep workflow
* ci: split OpenGrep diff and full scans
* chore: remove performance-only opengrep rule
* ci: use OpenGrep installer path
* chore: enforce opengrep rule metadata provenance
* chore: generalize opengrep rule compilation
* docs: align opengrep rulepack guidance
* chore: support generic opengrep rule sources
* fix: validate opengrep rulepack-only changes
---------
Co-authored-by: Jesse Merhi <security-engineering@atlassian.com>
136
security/README.md
Normal file
@@ -0,0 +1,136 @@
# Security tooling

This directory holds OpenClaw's shipped OpenGrep security rulepack and the
supporting tooling that validates and runs it. Maintainer-only advisory triage
and detector-generation prompts live outside the public repo; this repo keeps the
durable artifacts needed to block regressions in PRs and support local rule
validation.
## Layout

```text
security/
├── README.md          <- this file
└── opengrep/
    ├── README.md      <- precise rulepack details + compile recipe
    └── precise.yml    <- compiled super-config: precise rules
```

The related scripts are:

- `security/opengrep/compile-rules.mjs`: gathers source OpenGrep rule YAMLs from
  a folder and appends new compiled rule IDs to `security/opengrep/precise.yml`.
- `security/opengrep/check-rule-metadata.mjs`: enforces that every committed
  rule carries durable source/provenance metadata.
- `scripts/run-opengrep.sh`: runs the compiled precise rulepack locally or in
  CI with consistent paths and exclusions.
## Rule lifecycle

Maintainers investigate advisories and generate candidate rules outside the public repo.
Once a candidate rule has been validated and reviewed, put the shippable source
rule YAML in any local folder and compile it into this repo:

```bash
node security/opengrep/compile-rules.mjs \
  --rules-dir <folder-with-source-rule-yaml>
```

Commit the resulting `security/opengrep/precise.yml` diff. Durable rule
provenance lives in each compiled rule's metadata and is checked by
`pnpm check:opengrep-rule-metadata`.

Rule quality contract: precise rules must catch the vulnerable behavior they were
written for, should be silent on corresponding fixed behavior when a fix exists,
and should keep current findings limited to verified regressions or variants.
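The three clauses of the quality contract can be read as a checklist. The sketch below is only an illustration of that reading: `checkQualityContract`, its input shape, and the `verdict` labels are hypothetical and do not exist in the repo.

```javascript
// Hypothetical helper: the function name, input shape, and verdict labels are
// assumptions made for illustration; none of them come from the repo.
function checkQualityContract({ vulnerableHits, fixedHits, currentFindings }) {
  const problems = [];

  // Clause 1: the rule must catch the vulnerable behavior it was written for.
  if (vulnerableHits === 0) problems.push("misses the vulnerable code");

  // Clause 2: the rule should be silent on the fixed code.
  // fixedHits === null models "no fix exists yet", so the clause is skipped.
  if (fixedHits !== null && fixedHits > 0) problems.push("still fires on the fixed code");

  // Clause 3: every finding on current code is a verified regression or variant.
  const unverified = currentFindings.filter(
    (f) => f.verdict !== "regression" && f.verdict !== "variant",
  );
  if (unverified.length > 0) problems.push(`${unverified.length} unverified current finding(s)`);

  return { ok: problems.length === 0, problems };
}

const result = checkQualityContract({
  vulnerableHits: 1,
  fixedHits: 0,
  currentFindings: [
    { path: "src/a.ts", verdict: "variant" },
    { path: "src/b.ts", verdict: "unknown" },
  ],
});
console.log(result);
```

In this example the rule catches the vulnerable code and is silent on the fix, but one current finding is still unclassified, so the rule would need tightening before it ships.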
## Writing precise OpenGrep rules

A rule is appropriate for `security/opengrep/precise.yml` only when the dangerous
shape is stable enough to block PRs. Prefer, in order:

1. **Variant detector**: source-to-sink or missing-guard detection across the
   same bug family.
2. **Scoped behavioral regression**: a narrow subsystem-specific rule anchored
   on the affected API or trust boundary.
3. **Exact regression canary**: a labelled canary for the original vulnerable
   shape when broader variants would be noisy.
4. **No OpenGrep rule**: if runtime state, product policy, or external data is
   required to distinguish vulnerable and safe behavior.

Before compiling a rule, validate it against vulnerable/fixed/current code when
those surfaces exist. Every current finding must be classified as a true original
issue or true variant, or the rule must be tightened/dropped before it ships.
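One way to read the numbered preference order above is as "pick the highest-ranked shape that is viable, otherwise ship no rule". The following sketch encodes that reading; the shape labels and the `pickDetectorShape` function are hypothetical names chosen for illustration.

```javascript
// Preference order from the list above, highest priority first.
// The string labels are illustrative; the repo does not define them.
const PREFERENCE = [
  "variant-detector",
  "scoped-behavioral-regression",
  "exact-regression-canary",
];

// Given the shapes a reviewer judged viable for an advisory, return the
// preferred one, or "no-opengrep-rule" when static analysis cannot separate
// vulnerable from safe behavior (option 4 in the list).
function pickDetectorShape(viableShapes) {
  const viable = new Set(viableShapes);
  return PREFERENCE.find((shape) => viable.has(shape)) ?? "no-opengrep-rule";
}

console.log(pickDetectorShape(["exact-regression-canary", "scoped-behavioral-regression"]));
console.log(pickDetectorShape([]));
```

Deciding which shapes are viable remains a human review judgment; the helper only captures the ordering.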
## Running the rules locally

The wrapper script handles paths, exclusions, and output formatting so local
scans match CI exactly.

```bash
scripts/run-opengrep.sh                 # precise rules, human output
scripts/run-opengrep.sh --json          # write .opengrep-out/precise.json
scripts/run-opengrep.sh --sarif         # write .opengrep-out/precise.sarif
scripts/run-opengrep.sh --changed       # scan changed first-party paths
scripts/run-opengrep.sh -- src/agents/  # scan a single dir
```

If you'd rather invoke `opengrep` directly, the equivalent is:

```bash
opengrep scan --no-strict --no-git-ignore \
  --config security/opengrep/precise.yml \
  src/ extensions/ apps/ packages/ scripts/
```

Both forms read `.semgrepignore` at the repo root automatically; that file is the
single source of truth for which paths are skipped (test files, fixtures, mocks,
QA-tooling extensions, test-orchestration scripts, …). Add a glob there if a new
test naming convention shows up.
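To make the relationship between the wrapper options and the direct `opengrep` invocation concrete, here is a minimal sketch of how such a wrapper could assemble its argument list. It is not the contents of `scripts/run-opengrep.sh`: `buildOpengrepArgs` is a hypothetical function, and the mapping of the JSON and SARIF modes onto `--json`/`--sarif` with `--output` is an assumption based on the scan options OpenGrep shares with Semgrep.

```javascript
// Default scan targets, taken from the direct invocation shown above.
const DEFAULT_TARGETS = ["src/", "extensions/", "apps/", "packages/", "scripts/"];

// Hypothetical argument builder. The output-file mapping is an assumption,
// not a description of what scripts/run-opengrep.sh actually does.
function buildOpengrepArgs({ format = "text", targets = DEFAULT_TARGETS, failOnFindings = false } = {}) {
  const args = [
    "scan",
    "--no-strict",
    "--no-git-ignore",
    "--config",
    "security/opengrep/precise.yml",
  ];
  if (format === "json") args.push("--json", "--output", ".opengrep-out/precise.json");
  if (format === "sarif") args.push("--sarif", "--output", ".opengrep-out/precise.sarif");
  // Exit non-zero when there are findings, so CI can block on them.
  if (failOnFindings) args.push("--error");
  return [...args, ...targets];
}

console.log(["opengrep", ...buildOpengrepArgs()].join(" "));
console.log(
  ["opengrep", ...buildOpengrepArgs({ format: "sarif", failOnFindings: true, targets: ["src/agents/"] })].join(" "),
);
```

Keeping the argument list in one place is what lets local and CI scans stay identical: both call the same builder and differ only in the options they pass.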
## Running the rules in CI

There are two OpenGrep workflows:

- **OpenGrep — PR Diff** (`.github/workflows/opengrep-precise.yml`) runs on pull
  requests and executes `scripts/run-opengrep.sh --changed --sarif --error` so
  findings stay scoped to changed first-party paths.
- **OpenGrep — Full** (`.github/workflows/opengrep-precise-full.yml`) is manual
  dispatch only and executes `scripts/run-opengrep.sh --sarif --error` across
  the full first-party source set for maintainers who want a repository-wide
  audit.

Both workflows:

- Inherit the same `.semgrepignore` exclusions used by the local wrapper
- Upload SARIF to GitHub Code Scanning under stable OpenGrep categories
- Fail on precise findings so the rulepack acts as a regression firewall
- Enforce committed rule provenance with `pnpm check:opengrep-rule-metadata`
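The "fail on precise findings" behavior comes from the `--error` flag, but the same decision can be illustrated against the SARIF output. The sketch below counts results in a SARIF 2.1.0 document (`runs[].results[]`) and derives an exit code. The helper names and the sample document, including its placeholder rule ID, are hypothetical, and the workflows are not known to post-process SARIF this way.

```javascript
// Count findings across all runs of a SARIF 2.1.0 log.
// `results` may be absent or null for a run, so default to an empty list.
function countSarifResults(sarif) {
  return (sarif.runs ?? []).reduce((total, run) => total + (run.results ?? []).length, 0);
}

// Regression-firewall semantics: any finding fails the job.
function exitCodeFor(sarif) {
  return countSarifResults(sarif) > 0 ? 1 : 0;
}

// Hypothetical sample document; the rule ID is a placeholder, not a real rule.
const sample = {
  version: "2.1.0",
  runs: [
    { tool: { driver: { name: "opengrep" } }, results: [] },
    {
      tool: { driver: { name: "opengrep" } },
      results: [{ ruleId: "ghsa-aaaa-bbbb-cccc.example-rule", message: { text: "example" } }],
    },
  ],
};

console.log(countSarifResults(sample), exitCodeFor(sample));
```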
## Editing, silencing, or removing rules

`precise.yml` is the checked-in compiled rulepack. Prefer editing source rule
YAML and recompiling instead of hand-editing compiled rules, because the compiler
normalizes rule IDs, metadata, duplicates, and OpenGrep validation. The compiler
appends new rule IDs by default; use `--replace-precise` only when intentionally
rebuilding the rulepack from a complete source folder.

To drop a noisy rule:

1. Delete the offending source rule from the local source-rule folder.
2. Re-run `node security/opengrep/compile-rules.mjs --rules-dir <folder-with-source-rule-yaml>`.
3. Commit the resulting `security/opengrep/precise.yml` diff.

To narrow a rule's path scope, edit the source rule's `paths.include` /
`paths.exclude` fields in the same local artifact location and recompile.
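The difference between the default append behavior and `--replace-precise` described earlier in this section can be sketched as a merge over rule lists. This is an assumed model of the semantics, not the code in `compile-rules.mjs`; `mergeRulepack` and the sample rule IDs are hypothetical.

```javascript
// Assumed model of the two compile modes:
//  - append (default): keep every existing rule, add only rules whose ID is new
//  - replace: the freshly compiled set becomes the whole rulepack
function mergeRulepack(existingRules, compiledRules, { replace = false } = {}) {
  if (replace) return [...compiledRules];

  const seen = new Set(existingRules.map((rule) => rule.id));
  const merged = [...existingRules];
  for (const rule of compiledRules) {
    if (!seen.has(rule.id)) {
      merged.push(rule);
      seen.add(rule.id); // also dedupes repeats within the compiled set
    }
  }
  return merged;
}

// Placeholder rule IDs for illustration only.
const existing = [{ id: "ghsa-aaaa-bbbb-cccc.rule-one" }];
const compiled = [{ id: "ghsa-aaaa-bbbb-cccc.rule-one" }, { id: "ghsa-dddd-eeee-ffff.rule-two" }];

console.log(mergeRulepack(existing, compiled).map((r) => r.id));
console.log(mergeRulepack(existing, [compiled[1]], { replace: true }).map((r) => r.id));
```

Under this append-only reading, deleting a rule from the source folder would not by itself remove it from the rulepack, which is presumably the situation `--replace-precise` is meant for. Check the compiler's actual behavior before relying on either mode.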
## Tracing a finding back to its source

Every compiled rule's `id` is `<source-id>.<original-id>`. For GHSA-backed rules,
`<source-id>` is the lower-case GHSA ID. For other source-backed rules, use a
stable source identifier without dots, such as a CVE, OSV ID, internal advisory ID, or other
review identifier. Rule `metadata` must include `advisory-url`,
`detector-bucket`, and `source-rule-id`, plus either `ghsa` or `advisory-id`.
New compilations also add `source-file` when available.
`pnpm check:opengrep-rule-metadata` enforces these durable source fields so each
committed rule is traceable without a separate committed manifest.
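Because the source identifier contains no dots, a compiled ID can be split at its first dot, and the required metadata fields can be checked mechanically. The sketch below illustrates both steps. It is not the implementation of `check-rule-metadata.mjs`; the function names, the placeholder GHSA ID, the `example.invalid` URL, and the `detector-bucket` value are all assumptions.

```javascript
// Split `<source-id>.<original-id>` at the first dot. The source ID has no
// dots by convention, so the original ID may safely contain them.
function parseCompiledRuleId(id) {
  const dot = id.indexOf(".");
  if (dot <= 0 || dot === id.length - 1) {
    throw new Error(`not a compiled rule id: ${id}`);
  }
  return { sourceId: id.slice(0, dot), originalId: id.slice(dot + 1) };
}

// Field names come from the paragraph above.
const REQUIRED_FIELDS = ["advisory-url", "detector-bucket", "source-rule-id"];

// Return the names of any provenance fields the rule is missing.
function missingProvenance(rule) {
  const metadata = rule.metadata ?? {};
  const missing = REQUIRED_FIELDS.filter((field) => !metadata[field]);
  if (!metadata.ghsa && !metadata["advisory-id"]) missing.push("ghsa or advisory-id");
  return missing;
}

// Placeholder rule: every value here is made up for illustration.
const rule = {
  id: "ghsa-aaaa-bbbb-cccc.unsafe-exec.v2",
  metadata: {
    "advisory-url": "https://example.invalid/advisories/GHSA-aaaa-bbbb-cccc",
    "detector-bucket": "precise",
    "source-rule-id": "unsafe-exec.v2",
  },
};

console.log(parseCompiledRuleId(rule.id));
console.log(missingProvenance(rule));
```

Here the sample rule parses cleanly but lacks both `ghsa` and `advisory-id`, so a check along these lines would reject it.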