Use relative paths in tree-sitter extractor diagnostics #21844

Open
redsun82 wants to merge 4 commits into main from redsun82/issue-21802-ruby-absolute-paths-in-sarif-diagnostics-a02887

@redsun82 redsun82 commented May 13, 2026

Diagnostic location.file entries from the shared tree-sitter extractor were using absolute paths (e.g. /home/runner/work/repo/src/foo.rb), which caused the GitHub Code Scanning UI to generate broken file links.

This adds relativize_for_diagnostic() to the shared extractor, which strips the source root prefix (CWD during extraction) from diagnostic paths. If the path happens to be outside the source root (shouldn't happen in practice, since the CLI invokes extractors with --working-dir=.), it falls back to the absolute path as-is and lets the CLI's SARIF generator handle it downstream.
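
As a rough illustration of the prefix-stripping behaviour described above, here is a minimal sketch. It is not the PR's actual `relativize_for_diagnostic()`: the function name `relativize_sketch` and its explicit `source_root` parameter are hypothetical, chosen so the example does not depend on the process's working directory.

```rust
use std::path::Path;

/// Hypothetical sketch, not the PR's implementation.
/// Returns `path` relative to `source_root` when it lies inside that root;
/// otherwise returns the path unchanged so a downstream consumer can handle it.
fn relativize_sketch(path: &Path, source_root: &Path) -> String {
    match path.strip_prefix(source_root) {
        // Inside the source root: keep only the part after the root.
        Ok(relative) => relative.to_string_lossy().into_owned(),
        // Outside the source root: fall back to the path as given.
        Err(_) => path.to_string_lossy().into_owned(),
    }
}
```

`Path::strip_prefix` compares whole path components, so a sibling directory such as `/home/runner/work/repo2` is not mistaken for being inside `/home/runner/work/repo`.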

The fix covers:

  • All parse error diagnostics in the shared tree-sitter extractor (Ruby, QL, Swift/Unified)
  • Ruby-specific diagnostics (unknown-character-encoding, character-decoding-error) that were emitting paths directly
  • Windows path handling: canonicalizes current_dir() to match canonicalized file paths (avoids \\?\ prefix mismatch), and normalizes backslashes to forward slashes in relative paths
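
The Windows handling in the last bullet can be sketched roughly as follows. This assumes only the standard library; the helper names (`canonical_source_root`, `to_forward_slashes`, `relativize_with_root`) are hypothetical and are not taken from the PR. Replacing backslashes unconditionally is also an assumption of this sketch: the PR may restrict that step to Windows builds.

```rust
use std::env;
use std::path::{Path, PathBuf};

/// Hypothetical helper: the current directory in canonical form, so that it
/// can be compared against canonicalized file paths (on Windows both then
/// carry the same `\\?\` verbatim prefix).
fn canonical_source_root() -> std::io::Result<PathBuf> {
    let cwd = env::current_dir()?;
    // If canonicalization fails, fall back to the raw current directory.
    Ok(cwd.canonicalize().unwrap_or(cwd))
}

/// Hypothetical helper: render a relative path with forward slashes only.
/// Assumption: applied on every platform, which would also rewrite a
/// backslash that is part of a file name on Unix.
fn to_forward_slashes(relative: &Path) -> String {
    relative.to_string_lossy().replace('\\', "/")
}

/// Hypothetical helper: strip `root` from `path` and normalize separators,
/// or return the path unchanged when it is outside `root`.
fn relativize_with_root(path: &Path, root: &Path) -> String {
    match path.strip_prefix(root) {
        Ok(relative) => to_forward_slashes(relative),
        Err(_) => path.to_string_lossy().into_owned(),
    }
}
```

Canonicalizing both sides matters because `strip_prefix` compares components literally: a verbatim-prefixed file path would never match a plain `C:\...` working directory.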

The Python extractor has the same bug but lives in a separate codebase and will need a separate fix.

Fixes #21802.

Diagnostic `location.file` entries were using absolute paths (e.g.
`/home/runner/work/...`), causing broken links in the GitHub UI.
Now relativize against CWD (the source root during extraction), falling
back to a properly percent-encoded `file:` URI for paths outside it.

Fixes #21802

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 13, 2026 07:46
@redsun82 redsun82 requested review from a team as code owners May 13, 2026 07:46
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI left a comment

Pull request overview

This PR updates shared tree-sitter extractor diagnostics to emit repository-relative file paths where possible, improving Code Scanning file links for parse and Ruby encoding diagnostics.

Changes:

  • Adds a shared diagnostic path relativization helper using file: URI fallback.
  • Routes shared parse diagnostics and Ruby-specific encoding diagnostics through the new helper.
  • Updates dependency metadata and Ruby diagnostics integration expectations.
| File | Description |
| --- | --- |
| `shared/tree-sitter-extractor/src/file_paths.rs` | Adds diagnostic path relativization and tests. |
| `shared/tree-sitter-extractor/src/extractor/mod.rs` | Uses relative diagnostic paths for shared parse errors. |
| `shared/tree-sitter-extractor/Cargo.toml` | Adds the `url` dependency. |
| `ruby/extractor/src/extractor.rs` | Uses relative diagnostic paths for Ruby encoding diagnostics. |
| `ruby/ql/integration-tests/diagnostics/unknown-encoding/diagnostics.expected` | Updates expected diagnostic file path. |
| `ruby/ql/integration-tests/diagnostics/syntax-error/diagnostics.expected` | Updates expected parse diagnostic file paths. |
| `Cargo.lock` | Records the new shared extractor dependency edge. |
| `ql/Cargo.lock` | Updates QL extractor lockfile dependencies. |

Copilot's findings

  • Files reviewed: 6/8 changed files
  • Comments generated: 3

Comment thread shared/tree-sitter-extractor/src/file_paths.rs Outdated
Comment thread shared/tree-sitter-extractor/src/extractor/mod.rs Outdated
Comment thread ruby/extractor/src/extractor.rs Outdated
redsun82 and others added 2 commits May 13, 2026 10:28
Drop the `url` crate dependency. When a path can't be relativized
against the source root, emit it as a bare absolute path and let the
CLI's SARIF generator handle URI conversion downstream.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Canonicalize `current_dir()` to match canonicalized file paths (avoids
`\\?\` prefix mismatch on Windows), and normalize backslashes to
forward slashes in relative diagnostic paths.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

GitHub Advanced Security Ruby File Issues Generated Using Absolute Path
