[Crossgen2] Add --dedup-il-bodies to deduplicate identical IL method bodies in R2R composite image #126047
kotlarmilos wants to merge 2 commits into dotnet:main
…bodies in R2R composite images

Add a --dedup-il-bodies flag to crossgen2 that enables content-based deduplication of identical IL method bodies when emitting R2R composite component assemblies.

When enabled, crossgen2 uses a content-keyed dictionary per component factory. When a method's IL body bytes match an already-seen body, the existing CopiedMethodILNode is reused, causing multiple MethodDef RVA entries to point to the same IL body blob.

Enable by default on Apple mobile platforms (ios, tvos, iossimulator, tvossimulator, maccatalyst) via MSBuild targets, matching the pattern used for --strip-inlining-info and --strip-debug-info.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
This PR adds an opt-in crossgen2 optimization (--dedup-il-bodies) to content-deduplicate identical IL method bodies when producing R2R composite component assemblies, and wires it through the CLI and Apple mobile MSBuild defaults to reduce output size.
Changes:
- Add --dedup-il-bodies CLI option and plumb it into NodeFactoryOptimizationFlags.
- Implement IL-body content deduplication in the ReadyToRun node factory using a content-keyed cache.
- Enable the flag by default for Apple mobile RIDs via Microsoft.NET.CrossGen.targets.
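The content-keyed cache described in the list above can be illustrated with a small standalone sketch. This is not the PR's code: IlBodyNode, ByteContentComparer, and IlBodyCache are hypothetical stand-ins (the PR itself refers to CopiedMethodILNode and ByteArrayComparer), and the byte arrays are arbitrary placeholders rather than real method bodies. The point it tries to show is that a byte[] key needs a content-based comparer, because arrays compare by reference by default.

```csharp
#nullable enable
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical stand-in for CopiedMethodILNode: it only records the body bytes.
public sealed class IlBodyNode
{
    public byte[] Body { get; }
    public IlBodyNode(byte[] body) => Body = body;
}

// Hypothetical comparer: treats two byte arrays as equal when their contents match.
public sealed class ByteContentComparer : IEqualityComparer<byte[]>
{
    public static readonly ByteContentComparer Instance = new ByteContentComparer();

    public bool Equals(byte[]? x, byte[]? y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.AsSpan().SequenceEqual(y);
    }

    public int GetHashCode(byte[] obj)
    {
        unchecked
        {
            // Simple multiplicative hash over the contents; equal contents give equal hashes.
            int hash = 17;
            foreach (byte b in obj) hash = hash * 31 + b;
            return hash;
        }
    }
}

// Hypothetical cache: one node per distinct body content.
public sealed class IlBodyCache
{
    private readonly ConcurrentDictionary<byte[], IlBodyNode> _byContent =
        new ConcurrentDictionary<byte[], IlBodyNode>(ByteContentComparer.Instance);

    public IlBodyNode GetOrCreate(byte[] body) =>
        _byContent.GetOrAdd(body, b => new IlBodyNode(b));

    public int DistinctBodies => _byContent.Count;
}

public static class DedupDemo
{
    public static void Main()
    {
        var cache = new IlBodyCache();

        // Two separate arrays with identical contents, plus one different body.
        IlBodyNode first = cache.GetOrCreate(new byte[] { 1, 2, 3 });
        IlBodyNode second = cache.GetOrCreate(new byte[] { 1, 2, 3 });
        IlBodyNode third = cache.GetOrCreate(new byte[] { 1, 2, 4 });

        Console.WriteLine(ReferenceEquals(first, second)); // True: same node reused
        Console.WriteLine(ReferenceEquals(first, third));  // False: different content
        Console.WriteLine(cache.DistinctBodies);           // 2
    }
}
```

In this toy model, "multiple MethodDef RVA entries pointing to the same IL body blob" corresponds to several callers receiving the same IlBodyNode instance.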
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/tasks/Crossgen2Tasks/Microsoft.NET.CrossGen.targets | Enables --dedup-il-bodies by default for iOS/tvOS simulators and maccatalyst publish R2R. |
| src/coreclr/tools/aot/crossgen2/Properties/Resources.resx | Adds localized description string for the new CLI option. |
| src/coreclr/tools/aot/crossgen2/Program.cs | Wires the CLI option into NodeFactoryOptimizationFlags.DedupILBodies. |
| src/coreclr/tools/aot/crossgen2/Crossgen2RootCommand.cs | Declares and registers the new --dedup-il-bodies option. |
| src/coreclr/tools/aot/ILCompiler.ReadyToRun/Compiler/DependencyAnalysis/ReadyToRunCodegenNodeFactory.cs | Adds the dedup cache and gates dedup behavior behind DedupILBodies. |
| src/coreclr/tools/aot/ILCompiler.ReadyToRun/Compiler/DependencyAnalysis/ReadyToRun/CopiedMethodILNode.cs | Adds helper to read raw method body bytes for content-based dedup keys. |
Can this be on by default? I do not think it needs to be an option.
jkoritzinsky left a comment
Could we instead use/enhance ObjectDataInterner from ILC (and enable it in crossgen2) to do the deduplicating for us during emit?
Remove the DedupILBodies flag from NodeFactoryOptimizationFlags, the --dedup-il-bodies CLI option, the MSBuild PublishReadyToRunDedupILBodies properties, and the resource string. Deduplication of identical copied IL method bodies is now unconditional.

Restructure CopiedMethodIL to use a factory function in the dedup dictionary to avoid creating unused per-method nodes when bodies deduplicate.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
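The difference between passing a ready-made node and passing a factory function to the dedup dictionary can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: Node, CountEager, and CountLazy are hypothetical names, and the key is a hex string built with Convert.ToHexString (.NET 5 or later) purely to keep the sketch short, whereas the PR keys on byte[] with a comparer.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical node type standing in for a per-body IL node.
public sealed class Node
{
    public string Key { get; }
    public Node(string key) => Key = key;
}

public static class FactoryDemo
{
    // Eager variant: a node is constructed for every method,
    // even when an equal body is already in the dictionary.
    public static int CountEager(byte[][] bodies)
    {
        int constructed = 0;
        var cache = new ConcurrentDictionary<string, Node>();
        foreach (byte[] body in bodies)
        {
            string key = Convert.ToHexString(body);
            var candidate = new Node(key);   // allocated up front
            constructed++;
            cache.GetOrAdd(key, candidate);  // candidate is discarded on a hit
        }
        return constructed;
    }

    // Lazy variant: the factory runs only when the key is missing.
    public static int CountLazy(byte[][] bodies)
    {
        int constructed = 0;
        var cache = new ConcurrentDictionary<string, Node>();
        foreach (byte[] body in bodies)
        {
            string key = Convert.ToHexString(body);
            cache.GetOrAdd(key, k =>
            {
                constructed++;
                return new Node(k);
            });
        }
        return constructed;
    }

    public static void Main()
    {
        byte[][] bodies =
        {
            new byte[] { 1, 2 },
            new byte[] { 1, 2 },  // duplicate of the first body
            new byte[] { 3 },
        };

        Console.WriteLine(CountEager(bodies)); // 3
        Console.WriteLine(CountLazy(bodies));  // 2
    }
}
```

One caveat worth keeping in mind: under concurrent access, ConcurrentDictionary.GetOrAdd may invoke the factory more than once for the same key, although only one result is stored. The single-threaded loop above does not exercise that case.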
/azp run runtime-coreclr crossgen2,runtime-coreclr crossgen2-composite
Azure Pipelines successfully started running 2 pipeline(s).
My understanding of ObjectDataInterner: it is designed for folding compiled native method bodies, where it compares code, relocations, and other info, and it runs in iterations because folding can enable further folds. None of that applies to this dedup. Enabling it in crossgen2 would require more changes. I think it can be moved, I just want to check if that was the intention.
}

private NodeCache<MethodDesc, CopiedMethodILNode> _copiedMethodIL;
private readonly ConcurrentDictionary<byte[], CopiedMethodILNode> _copiedMethodILDedup = new(ByteArrayComparer.Instance);
Do we need both _copiedMethodIL and _copiedMethodILDedup?
If I am reading this correctly, it should be enough to have byte[] -> Node mapping.
I think it would be overkill for what we need here.
Description
This PR adds a --dedup-il-bodies flag to crossgen2 that enables content-based deduplication of identical IL method bodies when emitting R2R composite component assemblies.
When --dedup-il-bodies is passed, crossgen2 uses a content-keyed ConcurrentDictionary<byte[], CopiedMethodILNode> per component factory. When a method's IL body bytes match an already-seen body, the existing CopiedMethodILNode is reused, causing multiple MethodDef RVA entries to point to the same IL body blob.
Impact (MAUI HelloWorld iOS arm64)