@Jayant-kernel
Implements PipCacheWheelHandler to parse origin.json files created by pip when caching built wheels. Extracts package metadata including name, version, download URL, and SHA256 hash.

Fixes #4220


Summary

Adds support for parsing pip wheel cache origin.json files to detect cached Python packages.

Implementation

  • Added PipCacheWheelHandler in src/packagedcode/pypi.py
  • Parses origin.json files from ~/.cache/pip/wheels/ directories
  • Extracts package name, version, download URL, and SHA256 hash
  • Registered handler in APPLICATION_PACKAGE_DATAFILE_HANDLERS
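As a rough illustration of the discovery step, the sketch below walks a pip cache directory and collects every origin.json under its wheels/ subdirectory. It is not the handler code from this PR: the function name `find_origin_files` is hypothetical, and the cache root is passed in explicitly because its location varies by platform (the path above is the Linux default).

```python
from pathlib import Path


def find_origin_files(cache_root):
    """Return sorted paths of origin.json files under <cache_root>/wheels.

    Hypothetical helper for illustration only; the actual handler in this PR
    is matched against files by the scanner rather than walking the cache.
    """
    wheels_dir = Path(cache_root) / "wheels"
    if not wheels_dir.is_dir():
        # No wheel cache at this location, so there is nothing to report.
        return []
    # pip nests cached wheels several hashed directories deep, so search recursively.
    return sorted(wheels_dir.rglob("origin.json"))
```

Calling `find_origin_files(Path.home() / ".cache" / "pip")` would list candidate files on a typical Linux setup; other platforms keep the cache elsewhere.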

Testing

  • Added test case test_parse_pip_cache_wheel in tests/packagedcode/test_pypi.py
  • Includes a real origin.json from the construct package as a test fixture
  • Validates correct extraction of all package metadata

Example

Input: ~/.cache/pip/wheels/.../origin.json

{
  "archive_info": {"hash": "sha256=7b2a3fd8e5f597a5aa1d614c3bd516fa065db01704c72a1efaaeec6ef23d8b45"},
  "url": "https://files.pythonhosted.org/packages/.../construct-2.10.68.tar.gz"
}

Output:

  • Name: construct
  • Version: 2.10.68
  • PURL: pkg:pypi/construct@2.10.68
  • SHA256: 7b2a3fd8e5f597a5aa1d614c3bd516fa065db01704c72a1efaaeec6ef23d8b45 (the "sha256=" prefix is stripped)
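The mapping from input to output can be sketched with the standard library alone. This is a minimal approximation under stated assumptions, not the PipCacheWheelHandler implementation: the function name `parse_origin` and the returned dict shape are hypothetical, the name and version are assumed to come from an sdist-style filename at the end of the URL, and the PURL name is lowercased with underscores replaced by dashes.

```python
import json
from urllib.parse import urlparse

# Assumption: the cached wheel was built from an sdist archive with one of these suffixes.
SDIST_EXTENSIONS = (".tar.gz", ".tar.bz2", ".tar.xz", ".tgz", ".zip")


def parse_origin(text):
    """Extract name, version, download URL, SHA256 and PURL from origin.json text."""
    data = json.loads(text)
    url = data.get("url") or ""
    # The last path segment is expected to look like "<name>-<version><ext>".
    filename = urlparse(url).path.rsplit("/", 1)[-1]

    name = version = None
    for ext in SDIST_EXTENSIONS:
        if filename.endswith(ext):
            stem = filename[: -len(ext)]
            if "-" in stem:
                # Split on the last dash so dashes inside the name are preserved.
                name, version = stem.rsplit("-", 1)
            break

    # archive_info.hash has the form "<algorithm>=<hex digest>".
    hash_value = (data.get("archive_info") or {}).get("hash") or ""
    algorithm, _, digest = hash_value.partition("=")
    sha256 = digest if algorithm == "sha256" and digest else None

    purl = None
    if name and version:
        normalized = name.lower().replace("_", "-")
        purl = f"pkg:pypi/{normalized}@{version}"

    return {
        "name": name,
        "version": version,
        "download_url": url or None,
        "sha256": sha256,
        "purl": purl,
    }
```

Feeding it the example input above should yield the name construct, version 2.10.68, and the PURL pkg:pypi/construct@2.10.68. URLs that do not end in a recognized sdist filename (for example VCS origins) leave name, version, and PURL as None in this sketch.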

Tasks

  • Reviewed contribution guidelines
  • PR is descriptively titled 📑 and links the original issue above 🔗
  • Tests pass -- look for a green checkbox ✔️ a few minutes after opening your PR
  • Commits are in a uniquely named feature branch and have no merge conflicts 📁
  • Updated documentation pages (not applicable - internal handler)
  • Updated CHANGELOG.rst (not applicable - maintainers handle this)

---

## **CHECKLIST EXPLANATION:**

- [x] Reviewed contribution guidelines
- [x] PR is descriptively titled  
- [x] Tests pass (run locally)
- [x] Commits in feature branch (fix-4220-pip-cache-parser)

Signed-off-by: Jayant <jayantmcom@gmail.com>

Fixes aboutcode-org#4220

Signed-off-by: Jayant <jayantmcom@gmail.com>
Development

Successfully merging this pull request may close these issues.

Parse pip cache dir
