feat(cpp-phase2): PR-06 validation-driven bug fixes#718
Open
shivasurya wants to merge 3 commits into
Open
Conversation
IsPrivateSymbol indexes name[1] after a HasPrefix("_") check, but the
length check only excluded the empty string. A single-underscore symbol
name (length 1) passed the empty check, passed the underscore-prefix
check, then panicked on the index access.
This surfaced when generating Windows registries from real Ubuntu mingw
headers — one of the Win32 SDK headers declares a symbol literally named
"_" via a #define, and the extractor crashed the whole generator run.
Fix: tighten the early return to len(name) < 2 so the index access is
always safe.
Validation: Linux C registry grew from ~9,205 → ~9,970 functions after
this fix (the panic was silently aborting some glibc header extractions
mid-walk; the recovery surfaces ~750 more symbols).
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
PR-03 probed for the mingw libstdc++ tree under
/usr/x86_64-w64-mingw32/include/c++/<version> — the upstream mingw
sysroot layout. Debian/Ubuntu's apt-packaged g++-mingw-w64 splits the
C++ headers under /usr/lib/gcc/x86_64-w64-mingw32/<ver>-<thread>/
include/c++/ instead. Result: every Windows C++ generation on a Ubuntu
runner failed with "no mingw libstdc++ headers" even when g++-mingw-w64
was installed.
windowsCppSource now probes both layouts via a new findWindowsMingwCppDir
helper:
1. Upstream sysroot layout (preserved for compatibility).
2. Ubuntu split layout. Scans /usr/lib/gcc/x86_64-w64-mingw32/ for
<ver>-<thread> subdirs, prefers the posix-threading variant over
win32-threading.
SystemTag gains a layout tag suffix (-upstream / -<thread>-ubuntu) so
downstream tools can distinguish the source. Two new tests pin both
layouts; existing tests updated to reflect the suffix.
Validated against real Ubuntu 24.04 mingw install: windows/cpp manifest
generated successfully (121 headers, 504 classes, tag
mingw-w64-libstdc++-13-posix-ubuntu).
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
The PR-03 HTTP loader's headerURL preferred the manifest's per-entry URL field over the loader's baseURL. Combined with the generator's default --base-url=https://assets.codepathfinder.dev/registries, this meant every manifest baked in production CDN URLs — so any test, local HTTP server, staging deploy, or operator-provided --stdlib-base-url override silently hit the production CDN instead. Symptom found during local validation: pathfinder against a local http://127.0.0.1:8765 server hung indefinitely because the loader fetched the manifest from localhost (200 OK), then read entry.URL from the manifest, then tried to fetch each per-header JSON from https://assets.codepathfinder.dev/... (which doesn't serve them). Fix: headerURL on both the C and C++ loaders always constructs the URL from the loader's baseURL + entry.File. The entry.URL field is preserved in the schema for forward compatibility (mirrors, CDN inspection) but the loader doesn't consume it. Tests updated: TestC*StdlibRegistry_HTTP_ManifestEmbeddedURL renamed to ManifestEmbeddedURLIgnored, fixture URL pointed at a non-routable host so a regression would manifest as a hang instead of false success. Validation: HTTP loader now matches file:// loader within ≤0.6pp on redis (67.9% vs 67.3%) and proxygen (28.0% vs 28.2%) — wire- equivalent behavior proven against the same registries served via both transports. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
SafeDep Report SummaryNo dependency changes detected. Nothing to scan. This report is generated by SafeDep Github App |
Code Pathfinder Security ScanNo security issues detected.
Powered by Code Pathfinder |
Pathfinder Report✅ No security findings on the changed files. This pull request is clean. Powered by Code Pathfinder. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## shiva/cpp-phase2-pr05-c-transitive-fallback #718 +/- ##
============================================================================
Coverage 85.69% 85.69%
============================================================================
Files 203 203
Lines 29212 29237 +25
============================================================================
+ Hits 25032 25054 +22
- Misses 3207 3209 +2
- Partials 973 974 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.



Summary
Three independent bug fixes uncovered while running PR-01..05 end-to-end against real Ubuntu mingw + a local HTTP server (the validation flow we set up before pushing anything to the production CDN). Each commit is one fix, self-contained.
Stacked on PR-05.
What's in here
1.
fix(generator)—IsPrivateSymbolpanic on length-1 namesIsPrivateSymbolindexedname[1]after checkinglen(name) > 0— a single-underscore symbol ("_", length 1) crashed the generator mid-walk. Surfaced on real Win32 SDK headers; tightened the early-return guard.Side effect: Linux C registry grew 9,205 → 9,970 functions because the panic was silently aborting some glibc header extractions.
2.
fix(generator)— Ubuntu mingw libstdc++ layoutThe PR-03 Windows C++ walker probed
/usr/x86_64-w64-mingw32/include/c++/<version>(upstream mingw sysroot layout). Debian/Ubuntu'sapt install g++-mingw-w64instead puts libstdc++ at/usr/lib/gcc/x86_64-w64-mingw32/<ver>-<thread>/include/c++/. Every Windows C++ generation on a Ubuntu runner was failing with "no mingw libstdc++ headers" even with the toolchain installed.findWindowsMingwCppDirnow probes both layouts, preferring the posix-threading variant. SystemTag carries a layout suffix so downstream can tell them apart.Validated against real Ubuntu 24.04 mingw install:
windows/cpp/v1manifest generated successfully — 121 headers, 504 classes, tagmingw-w64-libstdc++-13-posix-ubuntu.3.
fix(registry)—--stdlib-base-urlis now the single source of truthThe PR-03 HTTP loader's
headerURLpreferred the manifest's per-entryURLfield over the loader'sbaseURL. The generator's default--base-urlis the production CDN URL, so every manifest baked in production CDN URLs. Result: any--stdlib-base-urloverride (test, local server, staging deploy) silently bypassed the override and hit the production CDN.Symptom: pathfinder against a local
http://127.0.0.1:8765server hung indefinitely because the loader fetched the manifest from localhost, then tried to fetch each per-header JSON fromhttps://assets.codepathfinder.dev/...(which doesn't serve them yet).Fix:
headerURLalways constructs frombaseURL+entry.File.entry.URLstays in the schema for forward compatibility (mirrors, CDN inspection) but the loader doesn't consume it. Two tests pinning the old behavior renamed + repurposed to assert the new contract.Validation results
HTTP loader vs file:// loader on the same registries, served via Python
http.serverfrom$HOME/.test/cpf-clike-registries/:≤0.6 pp delta is within run-to-run noise. The HTTP loader is wire-equivalent to file://.
Why this matters
This entire PR exists because we ran the validation flow locally (mingw install + local HTTP server) before pushing anything to R2/CDN. The three bugs would have surfaced in production otherwise:
Local validation paid for itself three times over.
Verification
gradle buildGo— cleango test ./...— all packages passgolangci-lint run ./...— 0 issuesTest plan
🤖 Generated with Claude Code