[package-extractor]: fix issue #5719, preventing duplicate-copy conflicts across npm-packlist versions#5720
Conversation
…m-packlist versions
ed0977b to
bce3791
Compare
c9ede61 to
316cd18
Compare
| // in the AssetHandler.ts includeAssetAsync method. | ||
| // To avoid this issue, we will normalize the file paths to be relative to the package root and remove duplicates. | ||
| const normalizedFiles: string[] = Array.from(new Set(npmPackFiles.map((file) => path.normalize(file)))); | ||
| return normalizedFiles; |
There was a problem hiding this comment.
Key questions for this change:
-
Is this a perf concern? We are synchronously calling
path.normalize()potentially thousands of times, and allocating multiple data structures containing thousands of elements. How many thousands? It is not the full--create-archivelist fromrush deploy, but more like the total number of files under a single Rush project. An imperative loop here could be more efficient for time/memory, but the costs are probably already tiny compared to the preceding work required to constructnpmPackFiles. -
Is it the right fix? The problem did not exist with
npm-packlist@2, it is apparentlynpm-packlist@5that started returning duplicate paths. Why should a packlist contain duplicates? Aren't duplicate paths a bug? If so, then this "fix" is reversing the effects of a bug, which wouldn't be necessary if we simply fix the bug. (More evidence:npm-packlistis maintained by NPM, Inc., who are known for "how can this have no unit test???" type regressions.)If it's a bug, then other tools that use
npm-packlistshould have the same bug. And here we go, PNPM seems to have encountered the same issue:fix: packing the same file twice during publish pnpm/pnpm#7250
That diff interestingly has a similar deduplication line:
const files = Array.from(new Set((await packlist(src)).map((f) => path.join(f))))
...but if you look closely, it is not calling
path.normalize(). It seems to be deduplicating for some other reason. PNPM's real fix was to upgradenpm-packlist:The fix was to update
npm-packlistto the latest version....which at the time, they accomplished by introducing a wrapper package
@pnpm/fs.packlist.
There was a problem hiding this comment.
We can't downgrade, because that would undo the benefit of PR #5575.
So, one alternative fix is to upgrade to npm-packlist (whose latest version is now already 10.x).
BUT, consider that PNPM's @pnpm/fs.packlist has eliminated its dependency on arborist, so today it is merely:
- A hardwired dependency on known-good
"npm-packlist": "5.1.3" - A small amount of normalizing/deduplicating logic in index.ts.
This seems like exactly what rush deploy needs. And Rush primarily uses PNPM, I think we would get more consistent/stable results by replacing our npm-packlist with @pnpm/fs.packlist. That seems like the best solution.
There was a problem hiding this comment.
|
🚀 Rush 5.172.1 has been published with this fix. |
Summary
Fixes #5719 .
Details
glob@7 with leading dot-slash pattern won't return the same path files.
But, glob@8 with leading dot-slash pattern will return a file path prefixed with "./".
Than, we will got two file paths reference to the same file.
This will cause
EEXIST: file already existserror when running withCOPYFILE_EXCLmode.This PR uses
path.normalizeto dedup the file paths.How it was tested