-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Fix Go extractor silent failures and improve error recovery #20676
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Thank you for this contribution. I will try it out. Can you give any examples of projects which benefit from it? |
Hello, a multi module, massive scale OSS project would benefit from this. Good examples: Observe that on old version, the old code would stop the entire scan. With this change, the tool will log the error for the problematic file, and move on, allowing analysis to continue despite OOM errors. Find go.mod then run "go build -mod=vendor ./..." or -mod=mod depending on project's handling of vendor dir (as the codeql database build trace command). Monitor build tracer logs for evidence that extractor continues (or exits) depending on this fix being present. |
|
Are you sure it matters if we use autobuild? I believe this code is run the same either way. |
The extractor was calling log.Fatal() when file extraction failed, causing the entire extraction process to terminate silently on the first error. This was particularly problematic for OOM errors which would only appear in build-tracer logs, not in the extractor output. Changes: - Replace log.Fatal() with log.Printf() to log errors without terminating - Add panic recovery in file extraction goroutines to catch OOM errors - Move cleanup (semaphore release, WaitGroup.Done) into defer block to ensure proper cleanup even when panics occur This allows extraction to continue processing remaining files when individual files fail, and ensures errors are visible in extractor logs.
3e54fad to
6e4dc04
Compare
You're correct, the extractPackage() code path runs the same way regardless of whether autobuild is used. My suggestion to skip autobuild was about having more control over the test environment, not because the code behaves differently. To clarify the reproduction steps:
cc @owen-mc |
|
The code changes seem reasonable. My main question is whether the resulting databases are missing too much information to be usable or not. Are you able to share a database created with this patch where some files have OOMed? (Having the name of the files which OOMed would be helpful.) |
Summary
Fixes silent failures in the Go extractor where OOM errors and file extraction failures would cause the entire extraction process to terminate without proper error logging.
Problem
The extractor was using
log.Fatal()when file extraction failed, which caused:This was particularly problematic for:
Solution
Minimal changes to improve error resilience:
log.Fatal()withlog.Printf()- log errors but continue extractionBenefits
Testing
Tested on large codebase (700+ packages) with memory constraints. Before fix: extraction would fail silently on OOM. After fix: extraction continues and logs specific files that failed.
Impact
Files Changed
go/extractor/extractor.go: UpdatedextractPackage()error handling (lines 694-707)