Signal when SPIR-V legalization needs loop unroll #15

Draft
AnastaZIuk wants to merge 2 commits into main from unroll

@AnastaZIuk AnastaZIuk commented Mar 20, 2026

Summary

  • add dedicated SpirvEmitter bits for legalization-time loop unroll and legalization-time SSA rewrite
  • set those bits only for the known lowering paths that still need those expensive transforms for correctness
  • pass the bits into the narrowed SPIR-V legalization overload
  • update a few CodeGenSPIRV checks so they validate the semantically relevant SPIR-V shape instead of the older inflated IR shape
  • update the nested SPIR-V submodule pointers so this branch builds against the validated companion optimizer state
  • use the DXC trunk Godbolt reproducer, which is the preprocessed output of our path tracer at about 58k LoC
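
A minimal sketch of how the dedicated bits could flow into the legalization call. The member names and `OpaqueOnly` come from this PR's diff; `EmitterBits`, `LegalizationArgs`, `makeLegalizationArgs`, and the `None` enumerator are hypothetical stand-ins so the example compiles without DXC or SPIRV-Tools. The real code has an earlier branch ahead of the `OpaqueOnly` case that the diff does not show, so it is left out here.

```cpp
#include <cassert>

// Hypothetical stand-in for spvtools::SSARewriteMode. Only OpaqueOnly appears
// in the diff; None is an assumed "nothing requested" value.
enum class SsaRewriteMode { None, OpaqueOnly };

// Hypothetical mirror of the SpirvEmitter members named in the diff.
struct EmitterBits {
  bool needsLegalization = false;
  bool needsLegalizationLoopUnroll = false;
  bool needsLegalizationSsaRewrite = false;
};

// Hypothetical bundle of the three arguments the diff passes to
// RegisterLegalizationPasses.
struct LegalizationArgs {
  bool preserveInterface;
  bool loopUnroll;
  SsaRewriteMode ssaRewriteMode;
};

// Translate the emitter bits into arguments for the legalization overload.
LegalizationArgs makeLegalizationArgs(const EmitterBits &bits,
                                      bool preserveInterface) {
  SsaRewriteMode mode = SsaRewriteMode::None;
  if (bits.needsLegalizationSsaRewrite) {
    mode = SsaRewriteMode::OpaqueOnly;
  }
  return {preserveInterface, bits.needsLegalizationLoopUnroll, mode};
}

// Usage: a shader that only needs SSA rewrite cleanup does not request unroll.
void exampleUsage() {
  EmitterBits bits;
  bits.needsLegalization = true;
  bits.needsLegalizationSsaRewrite = true;
  LegalizationArgs args = makeLegalizationArgs(bits, false);
  assert(!args.loopUnroll);
  assert(args.ssaRewriteMode == SsaRewriteMode::OpaqueOnly);
}
```

The point of the shape is that each expensive transform is requested independently, so a shader that needs neither pays for neither.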

Root cause

The expensive blanket policy of always running loop unroll and SSA rewrite during legalization lives in the SPIR-V optimizer recipe, not in HLSL parsing or the DXC CLI front-end.

DXC is the producer, so it knows when a specific lowering path genuinely needs those expensive legalization transforms for correctness. The optimizer should not have to guess that globally.

Two confirmed correctness-sensitive cases are:

  • variable image sample offsets, which still need materialized loop unroll to legalize into forms such as ConstOffset
  • combined image sampler conversion, which does not need loop unroll but still needs legalization-time SSA rewrite cleanup to avoid invalid SPIR-V such as storing sampled image types

This patch provides that producer-side signal. The companion optimizer change in KhronosGroup/SPIRV-Tools#6612 then uses it to avoid the broader blanket defaults.
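
The two cases above can be sketched as a producer-side decision. This is an illustration only: `LoweringPath`, `LegalizationNeeds`, and `collectNeeds` are hypothetical names, not DXC code. The description does not say whether variable sample offsets also need SSA rewrite, so the sketch sets only the loop unroll bit for that path.

```cpp
#include <cassert>
#include <vector>

// Hypothetical labels for the lowering paths discussed in this PR.
enum class LoweringPath {
  VariableImageSampleOffset,
  CombinedImageSampler,
  Other
};

// Hypothetical pair of signals handed to the optimizer.
struct LegalizationNeeds {
  bool loopUnroll = false;
  bool ssaRewrite = false;
};

// Accumulate the signals over every lowering path hit while emitting a module.
// Bits are sticky: once a path sets one, later paths cannot clear it.
LegalizationNeeds collectNeeds(const std::vector<LoweringPath> &paths) {
  LegalizationNeeds needs;
  for (LoweringPath path : paths) {
    switch (path) {
    case LoweringPath::VariableImageSampleOffset:
      // Needs materialized unroll to reach forms such as ConstOffset.
      needs.loopUnroll = true;
      break;
    case LoweringPath::CombinedImageSampler:
      // No unroll, but SSA rewrite cleanup is still required.
      needs.ssaRewrite = true;
      break;
    case LoweringPath::Other:
      break;
    }
  }
  return needs;
}

// Usage: a module hitting only the combined image sampler path.
void exampleCombinedSampler() {
  LegalizationNeeds needs = collectNeeds({LoweringPath::CombinedImageSampler});
  assert(!needs.loopUnroll);
  assert(needs.ssaRewrite);
}
```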

Validation

  • reproducer: godbolt.org/z/o5xf1hq36 (note: Compiler Explorer cache can make repeated runs look much faster than a cold compile)
  • shader payload: preprocessed output of our path tracer at about 58k LoC
  • local machine: AMD Ryzen 5 5600G with Radeon Graphics, 6 physical cores, 12 logical processors, Windows-reported max clock 3901 MHz
  • on the same payload and the same machine, SPIRV-Tools@487ff843bd8a + DXC@bd9a8b1c5365 reduced the measured time from 19.161 s to 6.042 s
  • with SPIRV-Tools@57007cf46bb4 + DXC@b02b772e0b50, the same payload measured 4.702 s
  • with the current branch pair SPIRV-Tools@7134be5024ff + DXC@55112e338fd2, the same payload now measures 2.464 s
  • full local CodeGenSPIRV lit/FileCheck run passes: 1403 expected passes, 2 expected failures, 0 unexpected failures
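
For reference, the timings above work out to speedups of roughly 3.2x, 4.1x, and 7.8x, assuming the 19.161 s figure is the baseline for all three revision pairs (the description states it only for the first). The `speedup` helper below is a hypothetical name used to show the arithmetic.

```cpp
#include <cassert>

// Ratio of baseline time to measured time, both in seconds.
double speedup(double baselineSeconds, double measuredSeconds) {
  assert(measuredSeconds > 0.0);
  return baselineSeconds / measuredSeconds;
}

// Timings quoted in the Validation list, in seconds.
const double kBaseline = 19.161;
const double kFirstPair = 6.042;   // SPIRV-Tools@487ff843bd8a + DXC@bd9a8b1c5365
const double kSecondPair = 4.702;  // SPIRV-Tools@57007cf46bb4 + DXC@b02b772e0b50
const double kCurrentPair = 2.464; // SPIRV-Tools@7134be5024ff + DXC@55112e338fd2
```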

Companion SPIR-V optimizer PR:
KhronosGroup/SPIRV-Tools#6612

github-actions bot commented Mar 20, 2026

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 54afd2c6a5f1a5bcbef66967515fa69f5a3650fd 9ac57982725c4264348b20ad3fb271acc0993754 -- tools/clang/lib/Frontend/Rewrite/RewriteObjC.cpp tools/clang/lib/SPIRV/SpirvEmitter.cpp tools/clang/lib/SPIRV/SpirvEmitter.h
View the diff from clang-format here.
diff --git a/tools/clang/lib/SPIRV/SpirvEmitter.cpp b/tools/clang/lib/SPIRV/SpirvEmitter.cpp
index 4a616a00..0920fc1c 100644
--- a/tools/clang/lib/SPIRV/SpirvEmitter.cpp
+++ b/tools/clang/lib/SPIRV/SpirvEmitter.cpp
@@ -593,8 +593,7 @@ SpirvEmitter::SpirvEmitter(CompilerInstance &ci)
       constEvaluator(astContext, spvBuilder), entryFunction(nullptr),
       curFunction(nullptr), curThis(nullptr), seenPushConstantAt(),
       isSpecConstantMode(false), needsLegalization(false),
-      needsLegalizationLoopUnroll(false),
-      needsLegalizationSsaRewrite(false),
+      needsLegalizationLoopUnroll(false), needsLegalizationSsaRewrite(false),
       beforeHlslLegalization(false), mainSourceFile(nullptr) {
 
   // Get ShaderModel from command line hlsl profile option.
@@ -957,8 +956,7 @@ void SpirvEmitter::HandleTranslationUnit(ASTContext &context) {
       !dsetbindingsToCombineImageSampler.empty() ||
       spirvOptions.signaturePacking;
   needsLegalizationSsaRewrite =
-      needsLegalizationSsaRewrite ||
-      !dsetbindingsToCombineImageSampler.empty();
+      needsLegalizationSsaRewrite || !dsetbindingsToCombineImageSampler.empty();
 
   // Run legalization passes
   if (spirvOptions.codeGenHighLevel) {
@@ -16679,9 +16677,9 @@ bool SpirvEmitter::spirvToolsLegalize(std::vector<uint32_t> *mod,
   } else if (needsLegalizationSsaRewrite) {
     legalizationSsaRewriteMode = spvtools::SSARewriteMode::OpaqueOnly;
   }
-  optimizer.RegisterLegalizationPasses(
-      spirvOptions.preserveInterface, needsLegalizationLoopUnroll,
-      legalizationSsaRewriteMode);
+  optimizer.RegisterLegalizationPasses(spirvOptions.preserveInterface,
+                                       needsLegalizationLoopUnroll,
+                                       legalizationSsaRewriteMode);
   // Add flattening of resources if needed.
   if (spirvOptions.flattenResourceArrays) {
     optimizer.RegisterPass(