-
Notifications
You must be signed in to change notification settings - Fork 243
Add git reference support #1698
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: maxcleme <maxime.clement@docker.com>
Signed-off-by: maxcleme <maxime.clement@docker.com>
Signed-off-by: maxcleme <maxime.clement@docker.com>
9c619e0 to
a596430
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review Summary
I've reviewed the git reference support implementation. The code is generally well-structured with good test coverage. I found one potential concurrency issue worth addressing.
Findings
1 MEDIUM severity issue identified - See inline comment below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review Summary
Found 3 issues in the git reference handling code:
-
Git fetch doesn't specify branch for shallow clones - When updating an already-cloned repository, the fetch command doesn't specify which branch to fetch, which can cause failures when resetting to non-default branches.
-
Git reset assumes ref was fetched - The reset command can fail if the target ref wasn't included in the shallow fetch.
-
Incomplete error handling for repository detection - The code doesn't properly distinguish between "repository not found" and other errors like permission issues.
All three issues are in the new git source implementation and could cause runtime failures when working with git references.
Signed-off-by: maxcleme <maxime.clement@docker.com>
c10ecb9 to
e2b86ac
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Security Review Summary
This PR introduces git reference support, but the implementation has two critical security vulnerabilities that must be addressed:
- Git Option Injection (HIGH): The ref parameter (after
#) is not validated, allowing attackers to inject arbitrary git options like--upload-packwhich could enable remote code execution - Race Condition (HIGH): Concurrent access to the same git repository can cause corrupted clones or failed operations
Recommendations
For Git Option Injection:
- Add validation to ensure ref only contains safe characters (alphanumeric,
/,-,_,.) - Reject refs starting with
--to prevent option injection - Consider using a whitelist approach for allowed characters
For Race Condition:
- Use
sync.Onceor a mutex to ensure only one goroutine clones a given repository - Consider using file-based locking for cross-process protection
- Add tests for concurrent access scenarios
Please address these security issues before merging.
| } | ||
|
|
||
| // IsGitReference checks if the input is a valid Git repository reference. | ||
| func IsGitReference(input string) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🔴 CRITICAL: Git Option Injection Vulnerability
The g.ref parameter is passed to git commands without validation. An attacker can inject arbitrary git options by crafting a malicious reference:
git@github.com:user/repo.git#--upload-pack=/tmp/evil
This would execute:
git clone --depth 1 --branch --upload-pack=/tmp/evil ...
Git interprets --upload-pack=/tmp/evil as a valid option (not a branch name), potentially enabling remote code execution.
Fix: Add validation to reject refs containing:
- Options starting with
-- - Newlines or null bytes
- Other unsafe characters
Recommended validation:
func isValidRef(ref string) bool {
if ref == "" {
return true
}
if strings.HasPrefix(ref, "--") || strings.HasPrefix(ref, "-") {
return false
}
// Allow only safe characters: alphanumeric, /, -, _, .
for _, r := range ref {
if !((r >= 'a' && r <= 'z') || (r >= 'A' && r <= 'Z') ||
(r >= '0' && r <= '9') || r == '/' || r == '-' ||
r == '_' || r == '.') {
return false
}
}
return true
}Apply this validation in ParseGitReference() before returning the ref.
| ref: ref, | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🔴 CRITICAL: Race Condition in Cache Directory Access
Multiple concurrent calls to Read() with the same repo URL can both pass the .git directory check and both attempt to clone, causing:
- Failed git clone operations
- Corrupted repository state
- Partial/incomplete clones
The check at line 391 is non-atomic with the clone operation at line 397.
Example race:
Goroutine 1: os.Stat(.git) -> not found -> starts clone
Goroutine 2: os.Stat(.git) -> not found -> starts clone
Both goroutines now execute git clone to the same directory simultaneously.
Fix: Use synchronization to ensure only one goroutine clones a repository:
var (
cloneLocks sync.Map // map[string]*sync.Mutex
)
func (g *gitSource) Read(ctx context.Context) ([]byte, error) {
cacheDir := g.getCacheDir()
// Get or create a lock for this cache directory
lockI, _ := cloneLocks.LoadOrStore(cacheDir, &sync.Mutex{})
lock := lockI.(*sync.Mutex)
lock.Lock()
defer lock.Unlock()
// Now proceed with .git check and clone/fetch
gitPath := filepath.Join(cacheDir, ".git")
// ... rest of the function
}Alternatively, use sync.Once per cache directory to ensure clone happens exactly once.
Summary
This PR adds git reference support to cagent, enabling the tool to work with specific git references (branches, tags, commits) when interacting with repositories.
eg.
Changes
Type of Change