[Crossgen2] Add --dedup-il-bodies to deduplicate identical IL method bodies in R2R composite image #126047

Draft
kotlarmilos wants to merge 2 commits into dotnet:main from kotlarmilos:r2r-dedup-il-bodies-measure

kotlarmilos (Member) commented Mar 24, 2026

Description

This PR adds a --dedup-il-bodies flag to crossgen2 that enables content-based deduplication of identical IL method bodies when emitting R2R composite component assemblies.

When --dedup-il-bodies is passed, crossgen2 uses a content-keyed ConcurrentDictionary<byte[], CopiedMethodILNode> per component factory. When a method's IL body bytes match an already-seen body, the existing CopiedMethodILNode is reused, causing multiple MethodDef RVA entries to point to the same IL body blob.
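
The content-keyed lookup described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `IlBodyNode`, `ByteContentComparer`, and `IlBodyCache` are hypothetical stand-ins (the PR uses `CopiedMethodILNode` and a `ByteArrayComparer` whose implementation is not shown here), and the FNV-1a hash is just one reasonable choice.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical stand-in for CopiedMethodILNode: owns one copy of an IL body blob.
public sealed class IlBodyNode
{
    public byte[] Body { get; }
    public IlBodyNode(byte[] body) => Body = body;
}

// Compares arrays by content, so two distinct arrays holding the same bytes
// are treated as the same dictionary key. (Assumed shape; the real
// ByteArrayComparer may differ.)
public sealed class ByteContentComparer : IEqualityComparer<byte[]>
{
    public static readonly ByteContentComparer Instance = new ByteContentComparer();

    public bool Equals(byte[] x, byte[] y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.AsSpan().SequenceEqual(y);
    }

    public int GetHashCode(byte[] obj)
    {
        // FNV-1a over the bytes; any content-based hash would do.
        unchecked
        {
            uint hash = 2166136261;
            foreach (byte b in obj)
                hash = (hash ^ b) * 16777619;
            return (int)hash;
        }
    }
}

// One cache per component factory in the PR's description; here a single cache.
public sealed class IlBodyCache
{
    private readonly ConcurrentDictionary<byte[], IlBodyNode> _byContent =
        new ConcurrentDictionary<byte[], IlBodyNode>(ByteContentComparer.Instance);

    // Returns the node already registered for these bytes, or registers a new one.
    public IlBodyNode GetOrAdd(byte[] ilBytes) =>
        _byContent.GetOrAdd(ilBytes, bytes => new IlBodyNode(bytes));

    public int UniqueBodies => _byContent.Count;
}
```

With this shape, two methods whose body bytes are equal receive the same node instance, which is what lets several MethodDef RVA entries resolve to one blob. The default `byte[]` equality is by reference, so the custom comparer is what makes the lookup content-based.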

Impact (MAUI HelloWorld iOS arm64)

| Metric | Value |
| --- | --- |
| Total IL bodies | 105,152 |
| Deduplicated | 12,874 (12.2%) |
| Bytes saved | ~155 KB |

…bodies in R2R composite images

Add a --dedup-il-bodies flag to crossgen2 that enables content-based
deduplication of identical IL method bodies when emitting R2R composite
component assemblies.

When enabled, crossgen2 uses a content-keyed dictionary per component
factory. When a method's IL body bytes match an already-seen body, the
existing CopiedMethodILNode is reused, causing multiple MethodDef RVA
entries to point to the same IL body blob.

Enable by default on Apple mobile platforms (ios, tvos, iossimulator,
tvossimulator, maccatalyst) via MSBuild targets, matching the pattern
used for --strip-inlining-info and --strip-debug-info.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 24, 2026 18:04
kotlarmilos changed the title from "[Crossgen2] Add --dedup-il-bodies to deduplicate identical IL method bodies in R2R composite images" to "[Crossgen2] Add --dedup-il-bodies to deduplicate identical IL method bodies in R2R composite image" Mar 24, 2026

Copilot AI (Contributor) left a comment

Pull request overview

This PR adds an opt-in crossgen2 optimization (--dedup-il-bodies) to content-deduplicate identical IL method bodies when producing R2R composite component assemblies, and wires it through the CLI and Apple mobile MSBuild defaults to reduce output size.

Changes:

  • Add --dedup-il-bodies CLI option and plumb it into NodeFactoryOptimizationFlags.
  • Implement IL-body content deduplication in the ReadyToRun node factory using a content-keyed cache.
  • Enable the flag by default for Apple mobile RIDs via Microsoft.NET.CrossGen.targets.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/tasks/Crossgen2Tasks/Microsoft.NET.CrossGen.targets | Enables --dedup-il-bodies by default for iOS/tvOS simulators and maccatalyst publish R2R. |
| src/coreclr/tools/aot/crossgen2/Properties/Resources.resx | Adds localized description string for the new CLI option. |
| src/coreclr/tools/aot/crossgen2/Program.cs | Wires the CLI option into NodeFactoryOptimizationFlags.DedupILBodies. |
| src/coreclr/tools/aot/crossgen2/Crossgen2RootCommand.cs | Declares and registers the new --dedup-il-bodies option. |
| src/coreclr/tools/aot/ILCompiler.ReadyToRun/Compiler/DependencyAnalysis/ReadyToRunCodegenNodeFactory.cs | Adds the dedup cache and gates dedup behavior behind DedupILBodies. |
| src/coreclr/tools/aot/ILCompiler.ReadyToRun/Compiler/DependencyAnalysis/ReadyToRun/CopiedMethodILNode.cs | Adds helper to read raw method body bytes for content-based dedup keys. |

@kotlarmilos kotlarmilos added this to the 11.0.0 milestone Mar 24, 2026

jkotas (Member) commented Mar 24, 2026

Can this be on by default? I do not think it needs to be an option.

jkoritzinsky (Member) left a comment

Could we instead use/enhance ObjectDataInterner from ILC (and enable it in crossgen2) to do the deduplicating for us during emit?

Remove the DedupILBodies flag from NodeFactoryOptimizationFlags, the
--dedup-il-bodies CLI option, the MSBuild PublishReadyToRunDedupILBodies
properties, and the resource string. Deduplication of identical copied
IL method bodies is now unconditional.

Restructure CopiedMethodIL to use a factory function in the dedup
dictionary to avoid creating unused per-method nodes when bodies
deduplicate.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
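
The difference between passing a ready-made node and passing a factory function to the dedup dictionary can be sketched as below. This is an illustration under assumptions, not the PR's implementation: `CountingNode`, `BytesComparer`, and `DedupDemo` are hypothetical names, and the allocation counter exists only to make the effect visible.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

// Hypothetical node type that counts how many instances were constructed.
public sealed class CountingNode
{
    public static int Created;
    public byte[] Body { get; }

    public CountingNode(byte[] body)
    {
        Body = body;
        Interlocked.Increment(ref Created);
    }
}

// Minimal content-based comparer (assumed; stands in for ByteArrayComparer).
public sealed class BytesComparer : IEqualityComparer<byte[]>
{
    public bool Equals(byte[] x, byte[] y) => x.AsSpan().SequenceEqual(y);

    public int GetHashCode(byte[] obj)
    {
        int hash = obj.Length;
        foreach (byte b in obj)
            hash = unchecked(hash * 31 + b);
        return hash;
    }
}

public static class DedupDemo
{
    // Eager: a node is constructed for every method, then discarded on a cache hit.
    public static int CountEager(IEnumerable<byte[]> bodies)
    {
        CountingNode.Created = 0;
        var map = new ConcurrentDictionary<byte[], CountingNode>(new BytesComparer());
        foreach (byte[] body in bodies)
            map.GetOrAdd(body, new CountingNode(body));
        return CountingNode.Created;
    }

    // Factory: the delegate runs only when the key is not already present.
    public static int CountWithFactory(IEnumerable<byte[]> bodies)
    {
        CountingNode.Created = 0;
        var map = new ConcurrentDictionary<byte[], CountingNode>(new BytesComparer());
        foreach (byte[] body in bodies)
            map.GetOrAdd(body, b => new CountingNode(b));
        return CountingNode.Created;
    }
}
```

For a single-threaded run over three bodies where two are byte-identical, the eager variant constructs three nodes and the factory variant constructs two. Under contention, `ConcurrentDictionary.GetOrAdd` may invoke the factory more than once for the same key, but only one result is stored, so the occasional extra node is short-lived.
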
kotlarmilos (Member, Author) commented

/azp run runtime-coreclr crossgen2,runtime-coreclr crossgen2-composite

azure-pipelines commented

Azure Pipelines successfully started running 2 pipeline(s).

kotlarmilos (Member, Author) commented

> Could we instead use/enhance ObjectDataInterner from ILC (and enable it in crossgen2) to do the deduplicating for us during emit?

My understanding of ObjectDataInterner is that it is designed for folding compiled native method bodies: it compares code, relocations, and other info, and it runs in iterations because one fold can enable further folds. None of that applies to IL body dedup.

Enabling it in crossgen2 would require more changes. I think it can be moved; I just want to check whether that was the intention.

}

private NodeCache<MethodDesc, CopiedMethodILNode> _copiedMethodIL;
private readonly ConcurrentDictionary<byte[], CopiedMethodILNode> _copiedMethodILDedup = new(ByteArrayComparer.Instance);

Review comment (Member):

Do we need both _copiedMethodIL and _copiedMethodILDedup?

If I am reading this correctly, it should be enough to have byte[] -> Node mapping.

jkotas (Member) commented Mar 25, 2026

> ObjectDataInterner

I think it would be overkill for what we need here.

4 participants