fix(node): enforce path-scoped visibility on the REST read API by beardthelion · Pull Request #52 · Gitlawb/node

beardthelion · 2026-06-13T16:39:46Z

The git smart-HTTP and replication paths enforce path-scoped visibility, but the REST read endpoints never did. They look up the repo and read straight from disk with no visibility check, so a private repository, and a private subtree of a public repository, are served to anyone.

The clearest case:

GET /api/v1/repos/{owner}/{repo}/blob/secret/.env   ->  200, cleartext

get_pr_diff is the same story for diffs. This predates the visibility work (get_blob goes back to the initial release), so it is a live hole on main today, not something introduced by the in-flight subtree work. It does, however, make that work moot, which is why it is worth fixing on its own.

What this does

Adds a single authorize_repo_read gate (resolve repo, load rules, visibility_check, 404 on deny) and routes every repo-scoped read handler through it: blob, tree, commits, refs, changelog, pulls, PR diff, and repo metadata. The 404 matches what the git path returns on a denied read, so existence is not confirmed either.

get_blob gates at the requested file path, so a withheld subtree file and any file in a fully-private repo both return 404, while the owner and listed readers still get them.
get_pr_diff withholds the whole diff if it touches any path the caller cannot read, via a new store::branch_diff_names. Whole-diff denial rather than per-hunk redaction keeps it simple and safe; finer redaction can come later if wanted.
Listings (tree, commits, refs) are gated at the repo root only, which keeps them consistent with mode B: the git protocol already exposes tree and commit SHAs, only blob content is withheld.
The read routes get the optional_signature layer so an authorized reader can authenticate over REST, the same way the git read routes already do. Anonymous reads of public repos are unaffected.

Not covered here

list_repos / list_federated_repos enumerate repos and need per-row filtering, a different mechanism. They leak private-repo existence and should be a separate change.

Testing

cargo build and cargo test -p gitlawb-node --bin gitlawb-node pass. store::branch_diff_names has a unit test; the gate decision itself is covered by the existing visibility unit tests. There is no HTTP-level test harness in this crate, so endpoint-level integration tests would be a follow-up.

Closes #51

Summary by CodeRabbit

New Features
- Optional HTTP-signature authentication for public repository read endpoints.
- Branch-diff path detection to determine which file changes are visible.
Security
- Repository read operations now enforce per-path visibility before returning data.
- Pull request diffs and changelogs hide changes that include inaccessible paths; forking checks source repo readability.
Tests
- Added unit test for branch-diff path detection.

The repo-scoped REST read endpoints (blob, tree, commits, refs, changelog, pulls, pr diff, repo metadata) read straight from disk after a bare repo lookup, with no visibility check, so private repositories and private subtrees of public repositories were served to unauthenticated callers, bypassing the withholding the git and replication paths enforce. Add a shared authorize_repo_read gate (resolve repo, load rules, visibility_check, 404 on deny) and call it from every repo-scoped read handler. get_blob gates at the requested file path; get_pr_diff withholds the whole diff when it touches a path the caller cannot read. Attach optional_signature to the read routes so authorized readers authenticate. Listings stay gated at the repo root, consistent with mode B exposing tree and commit SHAs over git. Closes Gitlawb#51

coderabbitai · 2026-06-13T16:39:56Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 03569094-bd6a-4e8f-b507-b8bed9441940

📥 Commits

Reviewing files that changed from the base of the PR and between f3c5d33 and af8408f.

📒 Files selected for processing (1)

crates/gitlawb-node/src/git/store.rs

🚧 Files skipped from review as they are similar to previous changes (1)

crates/gitlawb-node/src/git/store.rs

📝 Walkthrough

Walkthrough

Adds repository- and path-scoped read authorization across REST read endpoints: introduces authorize_repo_read, applies optional-signature middleware to read routes, wires optional-auth through handlers, enforces repo-level checks for metadata/blob/PR/changelog endpoints, and gates PR diffs per changed file path.

Changes

REST read authorization enforcement

Layer / File(s)	Summary
Middleware and authorization helper foundation `crates/gitlawb-node/src/server.rs`, `crates/gitlawb-node/src/api/mod.rs`	Router applies optional-signature middleware to read routes; new `authorize_repo_read(state, owner, repo, caller, path)` loads repo and visibility rules, runs `visibility_check`, and returns `RepoNotFound` when denied.
Repository metadata endpoints authorization `crates/gitlawb-node/src/api/repos.rs`	`get_repo`, `list_commits`, `get_tree_root`, `get_tree`, `list_refs`, and `fork_repo` now call `authorize_repo_read` (path `"/"`) to enforce repo-level read visibility before returning metadata or allowing forks.
Blob endpoint with path-scoped authorization `crates/gitlawb-node/src/api/repos.rs`	`get_blob` computes a `gate_path` from the requested file path and calls `authorize_repo_read` with that path to enforce subtree-level visibility before serving file bytes.
Pull request endpoints with diff visibility `crates/gitlawb-node/src/api/pulls.rs`	`list_prs`, `get_pr`, `list_reviews`, `list_comments` use optional auth and `authorize_repo_read` for repo-level checks. `get_pr_diff` computes touched branch diff paths and runs `visibility_check` per-path, returning `NotFound` if any path is denied.
Changelog endpoint authorization `crates/gitlawb-node/src/api/changelog.rs`	`get_changelog` accepts optional authenticated DID, derives caller, and uses `authorize_repo_read` to enforce repo-level read authorization before building the changelog response.
Git diff paths utility `crates/gitlawb-node/src/git/store.rs`	Adds `branch_diff_names(repo_path, target_branch, source_branch)` which shells out to `git diff --name-only target...source` and returns changed repo-relative paths; includes a unit test.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

#51: Matches the PR's objective to close the REST read API bypass by adding repo-level and path-level visibility checks across REST handlers.
#18: Similar visibility enforcement work; this PR extends path-scoped visibility wiring which aligns with objectives described in issue #18.

Possibly related PRs

Gitlawb/node#25: Introduced the visibility rules and visibility_check patterns that this PR reuses and extends across REST endpoints.

Suggested reviewers

kevincodex1

Poem

🐰 I hopped through routes both wide and narrow,
I sniffed for secrets hidden in a sparrow.
A helper stands guard and checks each trail,
Quiet bytes tucked safe beyond the veil.
🥕✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and concisely summarizes the main change: enforcing path-scoped visibility on REST read API endpoints, which matches the core objective of the PR.
Linked Issues check	✅ Passed	The PR comprehensively addresses all primary objectives from issue `#51`: repo-level read gating for all read endpoints, path-level gating for blob/diff endpoints, optional_signature middleware for authenticated access, and correct 404 behavior on denial.
Out of Scope Changes check	✅ Passed	All changes directly implement the visibility enforcement requirements specified in issue `#51`. The new authorize_repo_read function, path-specific gating logic, branch_diff_names helper, and optional_signature middleware are all within scope.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

crates/gitlawb-node/src/api/repos.rs (1)
896-906: ⚠️ Potential issue | 🔴 Critical

Enforce source-repo read visibility in fork_repo

fork_repo authenticates the caller, but it never calls authorize_repo_read/visibility_check for the source repo before RepoStore::acquire(...) and git clone --mirror. Since RepoStore::acquire only ensures the repo is present locally (and can download from Tigris) without visibility checks, an authenticated user can fork/clone a private repo they don’t have read access to. Add crate::api::authorize_repo_read(&state, &owner, &name, Some(&auth.0), "/").await? (and return the opaque RepoNotFound behavior on deny) before acquiring/cloning.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/gitlawb-node/src/api/repos.rs` around lines 896 - 906, In fork_repo,
before calling RepoStore::acquire or doing the git clone, call
crate::api::authorize_repo_read(&state, &owner, &name, Some(&auth.0),
"/").await? to enforce read visibility for the source repo; place this
immediately after resolving source via state.db.get_repo(...) and before any
RepoStore::acquire or clone logic so denied access returns the same
AppError::RepoNotFound behavior, ensuring only authorized readers can fork/clone
private repos.
crates/gitlawb-node/src/api/pulls.rs (1)
340-358: ⚠️ Potential issue | 🟠 Major

Gate repo visibility in list_reviews / list_comments to prevent private repo data leakage

GET /api/v1/repos/:owner/:repo/pulls/:number/reviews and /comments are routed under read_routes with auth::optional_signature, but the handlers don’t accept/use AuthenticatedDid and don’t call crate::api::authorize_repo_read. Since the DB queries for PRs/reviews/comments don’t perform visibility checks either, an unauthenticated caller can enumerate {owner}/{repo}/{number} and read private repo review/comment bodies.

Apply the same authorization pattern as list_prs/get_pr (add auth: Option<Extension<AuthenticatedDid>> and call authorize_repo_read before querying PR/reviews/comments; pass the appropriate gate path for these endpoints).
🔒 Suggested fix for list_reviews
 pub async fn list_reviews(
     State(state): State<AppState>,
     Path((owner, name, number)): Path<(String, String, i64)>,
+    auth: Option<Extension<AuthenticatedDid>>,
 ) -> Result<Json<serde_json::Value>> {
-    let record = state
-        .db
-        .get_repo(&owner, &name)
-        .await?
-        .ok_or_else(|| AppError::RepoNotFound(format!("{owner}/{name}")))?;
+    let caller = auth.as_ref().map(|e| e.0 .0.as_str());
+    let (record, _rules) =
+        crate::api::authorize_repo_read(&state, &owner, &name, caller, "/").await?;

     let pr = state
         .db
         .get_pr(&record.id, number)
         .await?
         .ok_or_else(|| AppError::NotFound(format!("PR #{number} not found")))?;

     let reviews = state.db.list_pr_reviews(&pr.id).await?;
     Ok(Json(serde_json::json!({ "reviews": reviews })))
 }
🔒 Suggested fix for list_comments
 pub async fn list_comments(
     State(state): State<AppState>,
     Path((owner, name, number)): Path<(String, String, i64)>,
+    auth: Option<Extension<AuthenticatedDid>>,
 ) -> Result<Json<serde_json::Value>> {
-    let record = state
-        .db
-        .get_repo(&owner, &name)
-        .await?
-        .ok_or_else(|| AppError::RepoNotFound(format!("{owner}/{name}")))?;
+    let caller = auth.as_ref().map(|e| e.0 .0.as_str());
+    let (record, _rules) =
+        crate::api::authorize_repo_read(&state, &owner, &name, caller, "/").await?;

     let pr = state
         .db
         .get_pr(&record.id, number)
         .await?
         .ok_or_else(|| AppError::NotFound(format!("PR #{number} not found")))?;

     let comments = state.db.list_pr_comments(&pr.id).await?;
     Ok(Json(serde_json::json!({ "comments": comments })))
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/gitlawb-node/src/api/pulls.rs` around lines 340 - 358, list_reviews
and list_comments lack repo visibility checks and must accept the optional auth
extension and call authorize_repo_read before DB access: add a new parameter
auth: Option<Extension<AuthenticatedDid>> to each handler signature, then before
calling state.db.get_pr / state.db.list_pr_reviews or list_pr_comments invoke
crate::api::authorize_repo_read(&state, auth.as_ref().map(|e| &**e), &record,
"<appropriate_gate_path>") following the same pattern used in list_prs/get_pr
(use the same gate-path constant used for pull read endpoints), and only then
perform the DB queries; this ensures unauthenticated callers cannot read private
PR reviews/comments.

🧹 Nitpick comments (1)

crates/gitlawb-node/src/git/store.rs (1)

317-339: 💤 Low value

Consider checking output.status.success() for consistency and explicit error handling.

Unlike list_refs, head_commit, and ls_tree which check the exit status and report failures, this function silently returns an empty Vec if the git command fails (e.g., invalid branch refs). While the caller's subsequent branch_diff call would also fail with the same inputs—mitigating practical impact—explicit status checking would provide clearer diagnostics and align with the pattern used elsewhere in this file.

♻️ Suggested fix for exit status check

 pub fn branch_diff_names(
     repo_path: &Path,
     target_branch: &str,
     source_branch: &str,
 ) -> Result<Vec<String>> {
     let output = Command::new("git")
         .args([
             "diff",
             "--name-only",
             &format!("{target_branch}...{source_branch}"),
         ])
         .current_dir(repo_path)
         .output()
         .context("failed to run git diff --name-only")?;
+
+    if !output.status.success() {
+        let stderr = String::from_utf8_lossy(&output.stderr);
+        bail!("git diff --name-only failed: {stderr}");
+    }
+
     Ok(String::from_utf8_lossy(&output.stdout)
         .lines()
         .filter(|l| !l.is_empty())
         .map(|l| l.to_string())
         .collect())
 }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/gitlawb-node/src/git/store.rs` around lines 317 - 339, The
branch_diff_names function currently ignores the git exit status and can
silently return an empty Vec on failure; update branch_diff_names to check
output.status.success() after running the Command (using the existing output
variable) and return a contextual error (e.g., via .context or anyhow::bail)
including the git exit code and stderr if the command failed, so failures for
the given repo_path, target_branch, and source_branch are reported consistently
with list_refs/head_commit/ls_tree.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/gitlawb-node/src/api/changelog.rs`:
- Around line 7-8: Reorder the crate imports so they are alphabetized: move the
crate::auth::AuthenticatedDid import above crate::error::{AppError, Result}
(i.e., change the import block that currently contains crate::error::{AppError,
Result} and crate::auth::AuthenticatedDid so that crate::auth comes first) to
satisfy cargo fmt/CI.

---

Outside diff comments:
In `@crates/gitlawb-node/src/api/pulls.rs`:
- Around line 340-358: list_reviews and list_comments lack repo visibility
checks and must accept the optional auth extension and call authorize_repo_read
before DB access: add a new parameter auth: Option<Extension<AuthenticatedDid>>
to each handler signature, then before calling state.db.get_pr /
state.db.list_pr_reviews or list_pr_comments invoke
crate::api::authorize_repo_read(&state, auth.as_ref().map(|e| &**e), &record,
"<appropriate_gate_path>") following the same pattern used in list_prs/get_pr
(use the same gate-path constant used for pull read endpoints), and only then
perform the DB queries; this ensures unauthenticated callers cannot read private
PR reviews/comments.

In `@crates/gitlawb-node/src/api/repos.rs`:
- Around line 896-906: In fork_repo, before calling RepoStore::acquire or doing
the git clone, call crate::api::authorize_repo_read(&state, &owner, &name,
Some(&auth.0), "/").await? to enforce read visibility for the source repo; place
this immediately after resolving source via state.db.get_repo(...) and before
any RepoStore::acquire or clone logic so denied access returns the same
AppError::RepoNotFound behavior, ensuring only authorized readers can fork/clone
private repos.

---

Nitpick comments:
In `@crates/gitlawb-node/src/git/store.rs`:
- Around line 317-339: The branch_diff_names function currently ignores the git
exit status and can silently return an empty Vec on failure; update
branch_diff_names to check output.status.success() after running the Command
(using the existing output variable) and return a contextual error (e.g., via
.context or anyhow::bail) including the git exit code and stderr if the command
failed, so failures for the given repo_path, target_branch, and source_branch
are reported consistently with list_refs/head_commit/ls_tree.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: b43a4f35-e0ba-4ea5-8b31-7b2bb7f056d7

📥 Commits

Reviewing files that changed from the base of the PR and between a2217bf and 86b6418.

📒 Files selected for processing (6)

crates/gitlawb-node/src/api/changelog.rs
crates/gitlawb-node/src/api/mod.rs
crates/gitlawb-node/src/api/pulls.rs
crates/gitlawb-node/src/api/repos.rs
crates/gitlawb-node/src/git/store.rs
crates/gitlawb-node/src/server.rs

The REST visibility gate added in this PR missed three read surfaces: list_reviews and list_comments served a private repo's PR reviews and comments to unauthenticated callers, and fork_repo cloned a full mirror of the source without checking the caller could read it. Route all three through authorize_repo_read at the repo root, matching the git-protocol gate. Also make branch_diff_names fail closed: it ignored git's exit status and returned an empty path list on error, which would let get_pr_diff serve a diff it should have withheld. Fix changelog.rs import order to satisfy cargo fmt.

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

crates/gitlawb-node/src/git/store.rs (1)

326-342: ⚠️ Potential issue | 🔴 Critical

Use NUL-delimited git diff names for branch_diff_names path-gate enforcement

branch_diff_names parses git diff --name-only output with .lines() (repo-relative touched paths). For filenames containing newlines, git diff --name-only emits quoted/escaped text (e.g. "secret/line\\nbreak.txt"), which will not match the real repo path strings expected by get_pr_diff’s gate = format!("/{}", p.trim_start_matches('/')) against visibility_check globs like /secret/**. In a public repo where / is allowed but /secret/** is denied, this mismatch can cause the deny check to miss restricted touched paths and disclose the whole PR diff.

Switch to git diff --name-only -z and split on NUL to preserve exact path bytes.

Suggested patch

 pub fn branch_diff_names(
     repo_path: &Path,
     target_branch: &str,
     source_branch: &str,
 ) -> Result<Vec<String>> {
     let output = Command::new("git")
         .args([
             "diff",
             "--name-only",
+            "-z",
             &format!("{target_branch}...{source_branch}"),
         ])
         .current_dir(repo_path)
         .output()
         .context("failed to run git diff --name-only")?;
     if !output.status.success() {
         let stderr = String::from_utf8_lossy(&output.stderr);
         bail!("git diff --name-only failed: {stderr}");
     }
-    Ok(String::from_utf8_lossy(&output.stdout)
-        .lines()
-        .filter(|l| !l.is_empty())
-        .map(|l| l.to_string())
-        .collect())
+    Ok(output
+        .stdout
+        .split(|b| *b == b'\0')
+        .filter(|s| !s.is_empty())
+        .map(|s| String::from_utf8_lossy(s).into_owned())
+        .collect())
 }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/gitlawb-node/src/git/store.rs` around lines 326 - 342, The
branch_diff_names implementation currently parses git diff output with .lines()
which misparses filenames containing newlines; update branch_diff_names to call
git with args
["diff","--name-only","-z",&format!("{target_branch}...{source_branch}")] and
then split the output on '\0' (e.g.,
String::from_utf8_lossy(&output.stdout).split('\0')) filtering out empty entries
and collecting to Vec<String>, leaving the existing error handling for
output.status in place; this preserves exact repo-relative path bytes so the
gate construction in get_pr_diff (gate = format!("/{}",
p.trim_start_matches('/'))) and visibility_check globs (e.g., /secret/**) will
match correctly.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@crates/gitlawb-node/src/git/store.rs`:
- Around line 326-342: The branch_diff_names implementation currently parses git
diff output with .lines() which misparses filenames containing newlines; update
branch_diff_names to call git with args
["diff","--name-only","-z",&format!("{target_branch}...{source_branch}")] and
then split the output on '\0' (e.g.,
String::from_utf8_lossy(&output.stdout).split('\0')) filtering out empty entries
and collecting to Vec<String>, leaving the existing error handling for
output.status in place; this preserves exact repo-relative path bytes so the
gate construction in get_pr_diff (gate = format!("/{}",
p.trim_start_matches('/'))) and visibility_check globs (e.g., /secret/**) will
match correctly.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 3b10a787-781a-4f6a-82e0-c28be4ddaee8

📥 Commits

Reviewing files that changed from the base of the PR and between 86b6418 and f3c5d33.

📒 Files selected for processing (4)

crates/gitlawb-node/src/api/changelog.rs
crates/gitlawb-node/src/api/pulls.rs
crates/gitlawb-node/src/api/repos.rs
crates/gitlawb-node/src/git/store.rs

🚧 Files skipped from review as they are similar to previous changes (2)

crates/gitlawb-node/src/api/changelog.rs
crates/gitlawb-node/src/api/pulls.rs

…t exact git diff --name-only quotes and escapes paths containing newlines, so they no longer match the visibility globs in get_pr_diff. On a public repo where / is allowed but a subtree is denied, a touched restricted path could be missed by the deny check and the full PR diff disclosed. Use -z and split on NUL to preserve exact path bytes.

coderabbitai Bot reviewed Jun 13, 2026

View reviewed changes

Comment thread crates/gitlawb-node/src/api/changelog.rs Outdated

coderabbitai Bot reviewed Jun 13, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(node): enforce path-scoped visibility on the REST read API#52

fix(node): enforce path-scoped visibility on the REST read API#52
beardthelion wants to merge 3 commits into
Gitlawb:mainfrom
beardthelion:fix/rest-visibility-gate

beardthelion commented Jun 13, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 13, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Possibly related PRs

Suggested reviewers

Poem

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

beardthelion commented Jun 13, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What this does

Not covered here

Testing

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Possibly related PRs

Suggested reviewers

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

beardthelion commented Jun 13, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 13, 2026 •

edited

Loading