feat(security): add GHSA detector-review pipeline and OpenGrep CI workflows (#69483)

* feat(security): add GHSA detector-review pipeline and OpenGrep CI workflows [AI-assisted]

Stand up an end-to-end pipeline that turns every published openclaw GitHub
Security Advisory into a reusable OpenGrep rule, and wire the compiled rules
into manual-dispatch GitHub Actions workflows that publish SARIF to GitHub
Code Scanning.

The pipeline is harness-agnostic: any coding-agent CLI (Rovo Dev, Claude
Code, Codex, OpenCode, or anything you can shell out to) can drive it via
the runner script's --harness flag. Built-in adapters cover the four common
harnesses; --harness-cmd '<template>' supports anything else with shell-style
{prompt}/{model}/{output_file} substitution.
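
To make the --harness-cmd template idea concrete, here is a minimal bash sketch of the placeholder expansion. It is not the runner's actual code (the runner is a .mjs script); the function name `render_harness_cmd`, the example CLI `my-agent`, and the choice to shell-quote each value with `printf %q` are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not taken from the runner script: expand the
# {prompt}, {model} and {output_file} placeholders in a command template.
# Each value is shell-quoted (assumption) so that a prompt containing
# spaces survives a later `bash -c "$cmd"`.
render_harness_cmd() {
  local cmd="$1" prompt="$2" model="$3" output_file="$4"
  local key placeholder quoted
  for key in prompt model output_file; do
    # ${!key} is indirect expansion: the value of the variable named by $key.
    printf -v quoted '%q' "${!key}"
    placeholder="{$key}"
    # Quoting both sides makes the match and the replacement literal.
    cmd=${cmd//"$placeholder"/"$quoted"}
  done
  printf '%s\n' "$cmd"
}

# Example with a made-up harness CLI and model name.
render_harness_cmd 'my-agent --model {model} --out {output_file} {prompt}' \
  'review this advisory' 'example-model' '/tmp/case-1/report.md'
```

Whether the real runner quotes substituted values is not stated in this commit, so treat the quoting step as one possible design rather than documented behavior.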

Pipeline pieces:

- scripts/run-ghsa-detector-review-batch.mjs runs your chosen coding harness
  in parallel against every advisory using the agent-agnostic detector-review
  spec at security/detector-review/detector-review-spec.md. Each case
  produces an OpenGrep general-rule.yml (precise) and broad-rule.yml
  (review aid), plus a report that validates coverage against the files
  changed in the vulnerable commit.
- scripts/compile-opengrep-rules.mjs walks a run directory, rewrites each
  rule's id to ghsa-detector.<ghsa>.<orig-id>, injects ghsa/advisory-url/
  detector-bucket/source-rule-id metadata, and uses opengrep itself to drop
  rules with InvalidRuleSchemaError so the published super-configs load
  cleanly.
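
As a rough illustration of the id rewrite and metadata injection described above, the following bash sketch builds the compiled id and prints the four metadata keys. The function names, the placeholder advisory id `GHSA-xxxx-xxxx-xxxx`, the example URL, and the YAML layout are assumptions; only the `ghsa-detector.<ghsa>.<orig-id>` pattern and the key names come from the commit message.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the naming scheme; the real logic lives in the
# .mjs compiler and may differ in detail.
compiled_rule_id() {
  local ghsa="$1" orig_id="$2"
  printf 'ghsa-detector.%s.%s\n' "$ghsa" "$orig_id"
}

# Print an id plus the metadata keys named in the commit message.
# The advisory URL is passed in by the caller; its format is not assumed here.
emit_compiled_header() {
  local ghsa="$1" advisory_url="$2" bucket="$3" orig_id="$4"
  printf 'id: %s\n' "$(compiled_rule_id "$ghsa" "$orig_id")"
  printf 'metadata:\n'
  printf '  ghsa: %s\n' "$ghsa"
  printf '  advisory-url: %s\n' "$advisory_url"
  printf '  detector-bucket: %s\n' "$bucket"
  printf '  source-rule-id: %s\n' "$orig_id"
}

# Placeholder values for illustration only.
emit_compiled_header 'GHSA-xxxx-xxxx-xxxx' 'https://example.invalid/advisory' \
  'precise' 'unsafe-path-join'
```

Prefixing every id with the advisory keeps ids unique when many per-case rule files are merged into one super-config, and the `source-rule-id` key preserves the name the rule had before compilation.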

Compiled outputs:

- security/opengrep/precise.yml     (336 rules)
- security/opengrep/broad.yml       (459 rules)
- security/opengrep/compile-manifest.json    (per-rule provenance map)

CI workflows (manual workflow_dispatch only):

- .github/workflows/opengrep-precise.yml
- .github/workflows/opengrep-broad.yml

Both install a pinned opengrep, run opengrep scan against src/, upload SARIF
to Code Scanning under categories opengrep-precise / opengrep-broad, and use
continue-on-error: true so findings never block the workflow.

Detector-review spec and assets:

- security/detector-review/detector-review-spec.md   the agent-agnostic spec
  the runner injects into each per-case prompt
- security/detector-review/references/{detector-rubric,report-template}.md
- security/detector-review/scripts/init_case.py
- security/prompt-suffix-coverage-first.md   mandatory prompt addendum that
  enforces coverage-first validation (the rule must catch the original
  vulnerability, not just pass synthetic fixtures)
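
One way to picture the coverage-first requirement is as a set intersection: a rule counts as covering the advisory only if at least one of its findings lands in a file the vulnerable commit changed. The bash sketch below is an assumption about how such a check could look, not the pipeline's implementation; `covers_original_vuln` and the two input list files are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical coverage check.
#   $1: file listing paths changed by the vulnerable commit, one per line
#   $2: file listing paths where the rule reported a finding, one per line
# Succeeds (exit 0) only when the two lists share at least one path.
covers_original_vuln() {
  local changed_list="$1" hits_list="$2" overlap
  # comm -12 prints only the lines common to both sorted inputs.
  overlap="$(comm -12 <(sort -u "$changed_list") <(sort -u "$hits_list"))"
  [[ -n "$overlap" ]]
}

# Example with throwaway data.
tmpdir="$(mktemp -d)"
printf '%s\n' src/agents/run.ts src/util/path.ts > "$tmpdir/changed.txt"
printf '%s\n' src/util/path.ts test/fixtures/bad.ts > "$tmpdir/hits.txt"
if covers_original_vuln "$tmpdir/changed.txt" "$tmpdir/hits.txt"; then
  echo "rule covers the original change set"
else
  echo "rule only matches synthetic fixtures"
fi
rm -rf "$tmpdir"
```

A rule that matches only synthetic fixtures fails this check, which is the situation the prompt addendum is meant to rule out.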

Docs:

- security/README.md          end-to-end flow, supported harnesses, regen recipe
- security/opengrep/README.md compiled-config details + recompile recipe

* security: tighten GHSA OpenGrep detector workflow

* chore: refine precise opengrep workflow

* chore: remove stale opengrep metadata

* fix: harden GHSA OpenGrep workflow

* ci: split OpenGrep diff and full scans

* chore: remove performance-only opengrep rule

* ci: use OpenGrep installer path

* chore: enforce opengrep rule metadata provenance

* chore: generalize opengrep rule compilation

* docs: align opengrep rulepack guidance

* chore: support generic opengrep rule sources

* fix: validate opengrep rulepack-only changes

---------

Co-authored-by: Jesse Merhi <security-engineering@atlassian.com>
Author: Jesse Merhi
Date: 2026-04-30 02:42:20 +10:00
Committed by: GitHub
Parent: c7aaa40848
Commit: 6de9d71bfb
16 changed files with 6488 additions and 0 deletions


@@ -42,6 +42,7 @@ export async function main(argv = process.argv.slice(2)) {
{ name: "runtime sidecar loader guard", args: ["check:runtime-sidecar-loaders"] },
{ name: "tool display", args: ["tool-display:check"] },
{ name: "host env policy", args: ["check:host-env-policy:swift"] },
{ name: "opengrep rule metadata", args: ["check:opengrep-rule-metadata"] },
],
},
{

scripts/run-opengrep.sh (executable file, 154 lines added)

@@ -0,0 +1,154 @@
#!/usr/bin/env bash
# scripts/run-opengrep.sh
#
# Run the OpenClaw precise OpenGrep rulepack against the local working tree
# using the same paths and exclusions as CI. The .semgrepignore at the repo root
# is the single source of truth for skipped paths.
#
# Usage:
# scripts/run-opengrep.sh # precise, human output
# scripts/run-opengrep.sh precise # same
# scripts/run-opengrep.sh --sarif # write SARIF for upload/triage
# scripts/run-opengrep.sh --json # write JSON for ad-hoc parsing
# scripts/run-opengrep.sh --changed # scan changed first-party paths
# scripts/run-opengrep.sh --error # fail non-zero on findings
#
# Optional positional path overrides come last:
# scripts/run-opengrep.sh -- src/agents/ # scan a single dir
#
# Exit code: non-zero on scan errors, and on findings when --error is passed.
set -euo pipefail
BUCKET="precise"
if [[ "${1:-}" == "precise" ]]; then
  shift
elif [[ "${1:-}" == "-h" || "${1:-}" == "--help" ]]; then
  sed -n '2,22p' "$0"
  exit 0
elif [[ "${1:-}" == "broad" ]]; then
  echo "error: broad OpenGrep rulepacks are not supported in this repo workflow" >&2
  exit 64
fi
# Resolve repo root from this script's location so the command works from any cwd.
REPO_ROOT="$(cd "$(dirname "$0")/.." && pwd)"
CONFIG="$REPO_ROOT/security/opengrep/precise.yml"
if [[ ! -f "$CONFIG" ]]; then
  echo "error: rulepack not found at $CONFIG" >&2
  echo "Recompile with: node security/opengrep/compile-rules.mjs --rules-dir <rules-dir> --out-dir security/opengrep" >&2
  exit 66
fi
if ! command -v opengrep >/dev/null 2>&1; then
  cat >&2 <<'EOF'
error: 'opengrep' not found on PATH.
Install with one of:
curl -fsSL https://raw.githubusercontent.com/opengrep/opengrep/v1.19.0/install.sh | bash -s -- -v v1.19.0
brew install opengrep/tap/opengrep
pipx install opengrep
(See https://opengrep.dev for other options.)
EOF
  exit 127
fi
# Pull off our own flags from the remaining args; pass everything else through to opengrep.
EXTRA_ARGS=()
PATHS_PASSED=0
SAW_DOUBLE_DASH=0
CHANGED_ONLY=0
FAIL_ON_FINDINGS=0
while (( $# > 0 )); do
  case "$1" in
    --sarif)
      mkdir -p "$REPO_ROOT/.opengrep-out"
      EXTRA_ARGS+=( "--sarif-output=$REPO_ROOT/.opengrep-out/$BUCKET.sarif" )
      shift
      ;;
    --json)
      mkdir -p "$REPO_ROOT/.opengrep-out"
      EXTRA_ARGS+=( "--json" "--output=$REPO_ROOT/.opengrep-out/$BUCKET.json" )
      shift
      ;;
    --changed)
      CHANGED_ONLY=1
      shift
      ;;
    --error)
      FAIL_ON_FINDINGS=1
      shift
      ;;
    --)
      SAW_DOUBLE_DASH=1
      shift
      ;;
    *)
      # Anything after `--` is a path-positional override; everything else
      # is passed through to opengrep unchanged.
      if (( SAW_DOUBLE_DASH )); then
        PATHS_PASSED=1
      fi
      EXTRA_ARGS+=( "$1" )
      shift
      ;;
  esac
done
cd "$REPO_ROOT"
if (( CHANGED_ONLY && PATHS_PASSED )); then
  echo "error: --changed cannot be combined with explicit path overrides" >&2
  exit 64
fi
# Default scan paths match CI. Override by passing `-- <paths...>`.
if (( PATHS_PASSED == 0 )); then
  if (( CHANGED_ONLY )); then
    mapfile -t SCAN_PATHS < <(
      {
        git diff --name-only --diff-filter=ACMRTUXB "${OPENCLAW_OPENGREP_BASE_REF:-origin/main...HEAD}" 2>/dev/null || true
        git diff --name-only --diff-filter=ACMRTUXB -- 2>/dev/null || true
        git ls-files --others --exclude-standard
      } | awk '/^(src|extensions|apps|packages|scripts)\// { print }' | sort -u
    )
    mapfile -t RULEPACK_CHANGED_PATHS < <(
      {
        git diff --name-only --diff-filter=ACMRTUXB "${OPENCLAW_OPENGREP_BASE_REF:-origin/main...HEAD}" 2>/dev/null || true
        git diff --name-only --diff-filter=ACMRTUXB -- 2>/dev/null || true
        git ls-files --others --exclude-standard
      } | awk '/^(security\/opengrep\/|scripts\/run-opengrep\.sh$|\.semgrepignore$|\.github\/workflows\/opengrep-)/ { print }' | sort -u
    )
    if (( ${#SCAN_PATHS[@]} == 0 && ${#RULEPACK_CHANGED_PATHS[@]} > 0 )); then
      SCAN_PATHS=( "security/opengrep/precise.yml" )
    fi
    if (( ${#SCAN_PATHS[@]} == 0 )); then
      echo "→ No changed first-party paths for opengrep." >&2
      exit 0
    fi
  else
    SCAN_PATHS=( "src/" "extensions/" "apps/" "packages/" "scripts/" )
  fi
else
  SCAN_PATHS=()
fi
if (( FAIL_ON_FINDINGS )); then
  EXTRA_ARGS+=( "--error" )
fi
echo "→ Running opengrep ($BUCKET) against $(IFS=' '; echo "${SCAN_PATHS[*]:-overridden}")" >&2
echo " Using exclusions from .semgrepignore" >&2
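# The ${arr[@]+"${arr[@]}"} form expands to nothing when the array is empty.
# A plain "${arr[@]}" on an empty array is an unbound-variable error under
# `set -u` in bash older than 4.4.
exec opengrep scan \
  --no-strict \
  --config "$CONFIG" \
  --no-git-ignore \
  ${EXTRA_ARGS[@]+"${EXTRA_ARGS[@]}"} \
  ${SCAN_PATHS[@]+"${SCAN_PATHS[@]}"}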
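
The --changed mode above boils down to filtering a list of changed paths to first-party directories and de-duplicating it. The sketch below isolates that step so it can be tried without a git checkout; the wrapper name `filter_first_party` and the sample paths are assumptions, while the awk pattern is copied from the script.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the same awk filter the script uses:
# keep only paths under first-party directories, then sort and de-duplicate.
filter_first_party() {
  awk '/^(src|extensions|apps|packages|scripts)\// { print }' | sort -u
}

# Sample input standing in for the combined `git diff` / `git ls-files` output.
printf '%s\n' \
  src/agents/run.ts \
  docs/readme.md \
  scripts/run-opengrep.sh \
  src/agents/run.ts \
  | filter_first_party
# Prints:
#   scripts/run-opengrep.sh
#   src/agents/run.ts
```

Because the pattern is anchored with `^` and requires a trailing slash after the directory name, a path such as `docs/readme.md` or a top-level file named `scripts.md` is dropped.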