@Kaushik-Kumar-CEG Kaushik-Kumar-CEG commented Jan 29, 2026

Fixes #2985

Summary

Fixes the strip_root parameter so that it strips the root directory from Resource paths when cli.run_scan() is called programmatically.

Problem

  • The CLI --strip-root option worked correctly
  • API mode with return_codebase=True did not strip the root from Resource paths
  • This is the programmatic mode that returns a Codebase object instead of JSON

Solution

  • Rebuilt resources_by_path with stripped paths after Codebase creation
  • Used strip_first_path_segment() to remove the root prefix from each Resource path
  • When strip_root is enabled, monkey-patched Resource.parent() so that it returns the root Resource (not an empty string) for direct children of the root
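
The rebuild step above can be sketched in isolation. This is a minimal illustration, not the code from this PR: `Resource` is a hypothetical stand-in dataclass, and `strip_first_segment` is a hypothetical helper written for the example rather than commoncode's `strip_first_path_segment()`. The sketch assumes a single-segment path (the root) is left unchanged.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical stand-in for a scanned resource; only the path matters here."""
    path: str


def strip_first_segment(path: str) -> str:
    """
    Drop the leading path segment.
    Assumption for this sketch: a single-segment path is returned unchanged.
    """
    head, sep, tail = path.strip("/").partition("/")
    return tail if sep else head


def rebuild_index(resources_by_path: dict) -> dict:
    """Return a new path-keyed index whose keys and Resource paths are stripped."""
    stripped = {}
    for resource in resources_by_path.values():
        resource.path = strip_first_segment(resource.path)
        stripped[resource.path] = resource
    return stripped


# Build an index keyed by the original, root-prefixed paths, then rebuild it.
index = {p: Resource(p) for p in ("project", "project/README", "project/src/main.py")}
index = rebuild_index(index)
```

After the rebuild, lookups use the stripped keys (`"README"`, `"src/main.py"`). One caveat of this sketch: a child named like the root (`project/project`) would collide with the root's key, so a real implementation would need to handle the root separately.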

Changes

  • src/scancode/cli.py: Added path stripping logic and the parent() fix to the return_codebase branch

Verification

  • All existing tests pass
  • Manual testing confirms root directory is now stripped from paths
  • Behavior matches CLI --strip-root flag
  • Resource.parent(codebase) correctly returns root Resource for direct children

Screenshots of Fix

Before fix:
[Image: Screenshot 2026-01-29 212317]

After fix:
[Image: Screenshot 2026-01-29 211836]

Tasks

  • Reviewed contribution guidelines
  • PR is descriptively titled 📑 and links the original issue above 🔗
  • Tests pass -- look for a green checkbox ✔️ a few minutes after opening your PR
    Run tests locally to check for errors.
  • Commits are in uniquely-named feature branch and has no merge conflicts 📁
  • Updated documentation pages (if applicable)
  • Updated CHANGELOG.rst (if applicable)

Signed-off-by: Kaushik <kaushikrjpm10@gmail.com>
@Kaushik-Kumar-CEG Kaushik-Kumar-CEG force-pushed the fix-strip-root-api-issue-2985 branch from 447a642 to 5c4dfb8 Compare January 30, 2026 18:27
Signed-off-by: Kaushik <kaushikrjpm10@gmail.com>

Kaushik-Kumar-CEG commented Jan 31, 2026

@JonoYang This PR is ready for review

The issue was that strip_root=True had no effect when using return_codebase=True, because commoncode's Codebase ignores the parameter. Stripping was applied only during JSON serialization, not to the actual Resource objects.

The fix rebuilds resources_by_path with stripped paths after Codebase creation. It also patches Resource.parent(), because commoncode's implementation uses `return parent_path and get_resource(parent_path)`. After stripping, the parent path of a direct child of the root becomes '', so the `and` short-circuits and parent() returns the empty string instead of the root Resource.
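
The parent() behavior described above can be reproduced with plain Python. This is a rough sketch under stated assumptions, not the patch itself: `Codebase`, `parent_path`, `parent_short_circuit`, and `parent_with_root_fallback` are hypothetical names, and resources are represented as strings to keep the example small.

```python
class Codebase:
    """Hypothetical stand-in holding a root and a path-keyed index."""

    def __init__(self, root, resources_by_path):
        self.root = root
        self.resources_by_path = resources_by_path

    def get_resource(self, path):
        return self.resources_by_path.get(path)


def parent_path(path):
    """Return everything before the last '/', or '' if there is no '/'."""
    head, _sep, _name = path.rpartition("/")
    return head


def parent_short_circuit(path, codebase):
    """Mirrors the `parent_path and get_resource(parent_path)` pattern."""
    p = parent_path(path)
    # When p is '', `and` returns '' without ever calling get_resource().
    return p and codebase.get_resource(p)


def parent_with_root_fallback(path, codebase):
    """Fall back to the root when the stripped parent path is empty."""
    p = parent_path(path)
    if p:
        return codebase.get_resource(p)
    # Simplification: this sketch does not special-case the root itself.
    return codebase.root


codebase = Codebase(
    root="<root>",
    resources_by_path={"src": "<src>", "src/main.py": "<main>", "README": "<readme>"},
)
```

With stripped paths, `parent_short_circuit("README", codebase)` evaluates to the empty string, while `parent_with_root_fallback("README", codebase)` returns the root. Both functions agree for nested paths such as `"src/main.py"`.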

Kindly review my fix and let me know if you have any feedback :)

Development

Successfully merging this pull request may close these issues.

Setting strip_root=True when calling cli.run_scan() does not strip the root from Resource paths
