Commit 88ca227

Add sbom-diff-and-risk v0.1.0 release surface
0 parents  commit 88ca227

50 files changed, +3020 -0 lines changed
Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
name: sbom-diff-and-risk-ci

on:
  workflow_dispatch:
  push:
    paths:
      - ".github/workflows/sbom-diff-and-risk-ci.yml"
      - "tools/sbom-diff-and-risk/**"
  pull_request:
    paths:
      - ".github/workflows/sbom-diff-and-risk-ci.yml"
      - "tools/sbom-diff-and-risk/**"

jobs:
  test:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: tools/sbom-diff-and-risk
    steps:
      - name: Check out repository
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Upgrade pip
        run: python -m pip install --upgrade pip

      - name: Install project
        run: python -m pip install -e .[dev]

      - name: Run test suite
        run: python -m pytest

      - name: CLI smoke test
        shell: bash
        run: |
          tmpdir="$(mktemp -d)"
          python -m sbom_diff_risk.cli compare \
            --before examples/cdx_before.json \
            --after examples/cdx_after.json \
            --format auto \
            --out-json "$tmpdir/report.json" \
            --out-md "$tmpdir/report.md"
          test -f "$tmpdir/report.json"
          test -f "$tmpdir/report.md"
          diff -u examples/sample-report.json "$tmpdir/report.json"
          diff -u examples/sample-report.md "$tmpdir/report.md"
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
.pytest_cache/
outputs/
src/*.egg-info/
src/**/*.egg-info/
*.pyc
*.pyo
*.pyd
__pycache__/

tools/sbom-diff-and-risk/README.md

Lines changed: 164 additions & 0 deletions
@@ -0,0 +1,164 @@
# sbom-diff-and-risk

`sbom-diff-and-risk` is a local, deterministic CLI for comparing two SBOMs or dependency manifests and producing JSON plus Markdown reports.

It uses conservative heuristics for change intelligence. By default it does not resolve CVEs, does not act as a reputation oracle, and does not perform hidden network enrichment.

## Scope

- Normalize two local inputs into a shared component schema.
- Diff components as `added`, `removed`, and `changed`.
- Apply conservative, heuristic risk buckets to newly added and changed components.
- Produce machine-friendly JSON and reviewer-friendly Markdown reports.
- Stay fully local-file based by default.

## v0.1 Internal Component Model

The normalized schema is the core design choice for the project:

- `name: str`
- `version: str | None`
- `ecosystem: str`
- `purl: str | None`
- `license_id: str | None`
- `supplier: str | None`
- `source_url: str | None`
- `bom_ref: str | None`
- `raw_type: str | None`
- `evidence: dict`

Diff identity is intentionally conservative and uses this precedence:

1. `purl`
2. `bom_ref`
3. `(ecosystem, name)`

When a `purl` includes a version, the tool keeps the full value in `Component.purl` for auditability but uses the versionless package coordinate for identity so upgrades still diff as `changed`.

## Non-goals

- No vulnerability database integration in v0.1.
- No CVE, advisory, or exploit resolution in v0.1.
- No reputation scoring or malware verdicts.
- No hidden enrichment or implicit network access.
- No web UI.

## Supported Formats

- CycloneDX JSON
- SPDX JSON
- `requirements.txt`
- `pyproject.toml`

## Risk Bucket Semantics

The current heuristic buckets are:

- `new_package`
- `major_upgrade`
- `version_change_unclassified`
- `unknown_license`
- `stale_package`
- `suspicious_source`
- `not_evaluated`

Offline `stale_package` evaluation is intentionally deferred. When enrichment is disabled, the tool emits `not_evaluated` findings instead of guessing.

## Output Formats

- `report.json`
- `report.md`

## Install

```bash
python -m pip install -e .[dev]
```

## Usage

Generate reports from the bundled CycloneDX example inputs:

```bash
sbom-diff-risk compare \
  --before examples/cdx_before.json \
  --after examples/cdx_after.json \
  --format auto \
  --out-json outputs/report.json \
  --out-md outputs/report.md
```

Generate reports from the `requirements.txt` examples:

```bash
sbom-diff-risk compare \
  --before examples/requirements_before.txt \
  --after examples/requirements_after.txt \
  --format auto \
  --out-json outputs/requirements-report.json \
  --out-md outputs/requirements-report.md
```

Use explicit format flags when you do not want auto-detection:

```bash
sbom-diff-risk compare \
  --before examples/spdx_before.json \
  --after examples/spdx_after.json \
  --before-format spdx-json \
  --after-format spdx-json \
  --out-json outputs/spdx-report.json \
  --out-md outputs/spdx-report.md
```

Generate reports from PEP 621 `pyproject.toml` examples:

```bash
sbom-diff-risk compare \
  --before examples/pyproject_before.toml \
  --after examples/pyproject_after.toml \
  --format auto \
  --out-json outputs/pyproject-report.json \
  --out-md outputs/pyproject-report.md
```

## CLI Flags

- `--before path`
- `--after path`
- `--format auto|cyclonedx-json|spdx-json|requirements-txt|pyproject-toml`
- `--before-format cyclonedx-json|spdx-json|requirements-txt|pyproject-toml`
- `--after-format cyclonedx-json|spdx-json|requirements-txt|pyproject-toml`
- `--out-json path`
- `--out-md path`
- `--strict`
- `--enrich-pypi`
- `--source-allowlist pypi.org,files.pythonhosted.org,github.com`

`--enrich-pypi` is reserved for future work and currently returns a clear error.

## Examples

The [`examples/`](examples/) directory includes:

- before/after inputs for CycloneDX JSON, SPDX JSON, `requirements.txt`, and `pyproject.toml`
- a sample CycloneDX-based JSON report at [`sample-report.json`](examples/sample-report.json)
- a sample CycloneDX-based Markdown report at [`sample-report.md`](examples/sample-report.md)
- requirements-based sample reports at [`sample-requirements-report.json`](examples/sample-requirements-report.json) and [`sample-requirements-report.md`](examples/sample-requirements-report.md)

## Limitations

- v0.1 is local-file based only.
- `generated_at` remains `null` to preserve deterministic report output.
- `stale_package` is not resolved offline. The report emits `not_evaluated` instead.
- No vulnerability database integration, CVE matching, or advisory enrichment.
- `requirements.txt` support intentionally covers a conservative subset: plain PEP 508 requirement entries, comments, direct URL requirements, and line continuations.
- `requirements.txt` intentionally does not support pip include/constraint directives such as `-r`, `-c`, or arbitrary install flags in v0.1.
- `pyproject.toml` support intentionally covers a conservative subset: PEP 621 `[project.dependencies]` and `[project.optional-dependencies]`.
- `pyproject.toml` intentionally does not support tool-specific layouts such as Poetry, Hatch, or PDM sections in v0.1.
- Risk buckets are heuristics, not security verdicts.
- Runtime-generated `outputs/` artifacts are ignored; tracked examples live in `examples/`.

## Current Status

The project now normalizes local CycloneDX JSON, SPDX JSON, `requirements.txt`, and PEP 621 `pyproject.toml` inputs into the shared component model, diffs them deterministically, and generates stable JSON/Markdown reports with golden tests.
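Byte-stable output is what makes golden-file comparison workable. A minimal sketch of how a report could be serialized deterministically follows; only `generated_at` comes from the README, while `render_report_json` and the `findings`, `component`, and `bucket` keys are hypothetical names chosen for illustration.

```python
import json


def render_report_json(findings: list[dict]) -> str:
    """Serialize a report so that equal inputs always give identical bytes."""
    report = {
        # Left as null so the output does not depend on the wall clock.
        "generated_at": None,
        # Sort findings so the order of discovery does not leak into the file.
        "findings": sorted(findings, key=lambda f: (f["component"], f["bucket"])),
    }
    # sort_keys fixes key order; the trailing newline keeps text diffs clean.
    return json.dumps(report, indent=2, sort_keys=True) + "\n"
```

Two runs that discover the same findings in a different order then produce the same file, so a plain `diff -u` against a tracked sample is a meaningful check.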
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
# v0.1.0

- Added deterministic diffing for CycloneDX JSON, SPDX JSON, requirements.txt, and pyproject.toml
- Added conservative risk buckets for new packages, major upgrades, unknown licenses, suspicious sources, and opt-in future stale evaluation
- Added stable JSON/Markdown reporting with golden tests
- Clarified scope: no CVE matching, no hidden enrichment, no reputation scoring by default
Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
# Dependency risk heuristics

`sbom-diff-and-risk` classifies change-related heuristics. It does not claim vulnerability truth.

## Implemented buckets

The current rules are intentionally conservative:

- `new_package`: a component appears only in the after input.
- `major_upgrade`: strict SemVer `x.y.z` major version increase.
- `version_change_unclassified`: version changed, but not a clear SemVer major bump.
- `unknown_license`: license metadata is missing or explicitly unknown.
- `suspicious_source`: provenance fields are missing or use suspicious schemes or hosts.
- `stale_package`: reserved for future enrichment work. When enrichment is disabled, the tool emits `not_evaluated` instead of guessing.

## Conservative rule notes

- `new_package` is a change signal, not a vulnerability claim.
- `major_upgrade` fires only when both versions look reliably parseable as strict SemVer.
- uncertain version changes fall back to `version_change_unclassified`.
- suspicious source is a provenance-quality heuristic, not a malware verdict.
- missing metadata is reported as unknown rather than silently treated as safe.
- `not_evaluated` means the stale-package question was intentionally left unanswered offline.

## Deferred work

- real `stale_package` evaluation behind explicit enrichment
- ecosystem-specific trust rules
- advisory and CVE enrichment
- configurable risk policy profiles
Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
# SBOM basics

This project treats SBOMs as one possible source of dependency inventory data.

For v0.1, the tool is intentionally limited to local-file parsing, normalization, diffing, and heuristic reporting.

## Supported local inputs

- CycloneDX JSON
- SPDX JSON
- `requirements.txt`
- `pyproject.toml`

## Intentional parser boundaries

`requirements.txt` support is intentionally conservative in v0.1:

- supported: plain PEP 508 requirement entries
- supported: comments and blank lines
- supported: direct URL requirements
- supported: line continuations
- not supported: `-r`, `--requirement`, `-c`, `--constraint`, or arbitrary pip install flags

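A rough sketch of a reader for this `requirements.txt` subset is shown below. It is an assumption-laden illustration: `read_requirement_entries` is a hypothetical name, the entries are returned as raw strings without PEP 508 validation, and raising `ValueError` on option lines is one possible policy (the real tool may report unsupported lines differently).

```python
import re

# A '#' starts a comment only at line start or after whitespace, so URL
# fragments such as '#sha256=...' in direct references are left intact.
COMMENT = re.compile(r"(^|\s+)#.*$")


def read_requirement_entries(text: str) -> list[str]:
    """Return logical requirement lines from the conservative subset."""
    entries: list[str] = []
    pending = ""
    # The trailing "" flushes a continuation left open on the last line.
    for raw in text.splitlines() + [""]:
        line = pending + raw.strip()
        if line.endswith("\\"):
            # Join a backslash continuation with the next physical line.
            pending = line[:-1].rstrip() + " "
            continue
        pending = ""
        line = COMMENT.sub("", line).strip()
        if not line:
            continue  # blank line or pure comment
        if line.startswith("-"):
            # Assumed policy: -r, -c, and other pip flags are rejected outright.
            raise ValueError(f"unsupported pip option: {line!r}")
        entries.append(line)
    return entries
```

Rejecting option lines outright, rather than skipping them, keeps the parser explicit about what it did not read.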
`pyproject.toml` support is intentionally conservative in v0.1:

- supported: PEP 621 `[project.dependencies]`
- supported: PEP 621 `[project.optional-dependencies]`
- not supported: Poetry, Hatch, PDM, or other tool-specific dependency sections

These boundaries are deliberate so the tool can stay deterministic and explicit about what it does and does not parse.

## Normalization goals

- keep one internal `Component` model
- preserve source evidence for auditability
- prefer purl identity when available
- stay deterministic and local-file based

## Diff identity precedence

1. `purl`
2. `bom_ref`
3. `(ecosystem, name)`

When a purl includes a version, the full purl is retained for auditability, but the diff identity uses the versionless package coordinate so upgrades still classify as `changed`.

## Outputs

- `report.json` for machine consumption
- `report.md` for human review

## What this tool is not

- not a vulnerability scanner
- not a package resolver
- not a provenance verifier
- not a web service
Lines changed: 50 additions & 0 deletions
@@ -0,0 +1,50 @@
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.5",
  "version": 1,
  "components": [
    {
      "bom-ref": "pkg:pypi/requests@2.32.0",
      "type": "library",
      "name": "requests",
      "version": "2.32.0",
      "purl": "pkg:pypi/requests@2.32.0",
      "supplier": {
        "name": "Python Software Foundation"
      },
      "licenses": [
        {
          "license": {
            "id": "Apache-2.0"
          }
        }
      ],
      "externalReferences": [
        {
          "type": "website",
          "url": "https://pypi.org/project/requests/"
        }
      ]
    },
    {
      "bom-ref": "pkg:pypi/urllib3@2.2.1",
      "type": "library",
      "name": "urllib3",
      "version": "2.2.1",
      "purl": "pkg:pypi/urllib3@2.2.1",
      "licenses": [
        {
          "license": {
            "id": "MIT"
          }
        }
      ],
      "externalReferences": [
        {
          "type": "website",
          "url": "https://pypi.org/project/urllib3/"
        }
      ]
    }
  ]
}
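As an illustration of how a CycloneDX component like the ones above might map onto the normalized fields, here is a hedged sketch. The mapping choices are assumptions, not the project's confirmed behaviour: the ecosystem is read from the purl type, the first license `id` becomes `license_id`, and the first external reference URL becomes `source_url`. The function name `normalize_cyclonedx_component` is hypothetical.

```python
def normalize_cyclonedx_component(raw: dict) -> dict:
    """Map one CycloneDX component dict onto the normalized field names."""
    purl = raw.get("purl")
    # Assumption: the purl type ("pypi" in "pkg:pypi/...") names the ecosystem.
    if purl and purl.startswith("pkg:"):
        ecosystem = purl.split(":", 1)[1].split("/", 1)[0]
    else:
        ecosystem = "unknown"

    license_id = None
    for entry in raw.get("licenses") or []:
        candidate = (entry.get("license") or {}).get("id")
        if candidate:
            license_id = candidate  # assumption: first SPDX id wins
            break

    references = raw.get("externalReferences") or []
    return {
        "name": raw["name"],
        "version": raw.get("version"),
        "ecosystem": ecosystem,
        "purl": purl,
        "license_id": license_id,
        "supplier": (raw.get("supplier") or {}).get("name"),
        "source_url": references[0].get("url") if references else None,
        "bom_ref": raw.get("bom-ref"),
        "raw_type": raw.get("type"),
        # Keep the untouched input so reviewers can audit the mapping.
        "evidence": {"format": "cyclonedx-json", "raw": raw},
    }
```

For the `urllib3` entry, which has no `supplier` object, this sketch yields `supplier` as `None` instead of inventing a value.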
