[Metal] Fix top level argument buffer indexing when creating descriptors by kcloudy0717 · Pull Request #998 · llvm/offload-test-suite

kcloudy0717 · 2026-03-20T14:58:38Z

I looked deeper into why some tests fails on Metal and I think found the reason why.

The Metal backend generates a automatic linear layout of the resources into the top-level argument buffer, but the actual heap index of the resources seems to be sorted by the resource type in the metal shader reflection. Given the following declarations:

StructuredBuffer<int16_t4> In: register(t0);
RWStructuredBuffer<int16_t4> Out1 : register(u1);
RWStructuredBuffer<int16_t4> Out2 : register(u2);
RWStructuredBuffer<int16_t4> Out3 : register(u3);
RWStructuredBuffer<int16_t4> Out4 : register(u4);

StructuredBuffer<uint16_t4> UIn: register(t5);
RWStructuredBuffer<uint16_t4> UOut1 : register(u6);
RWStructuredBuffer<uint16_t4> UOut2 : register(u7);
RWStructuredBuffer<uint16_t4> UOut3 : register(u8);
RWStructuredBuffer<uint16_t4> UOut4 : register(u9);

when compiled with -metal and -Fre options, we get the following in the metal json reflection:

{
  ...
  "TopLevelArgumentBuffer": [
    { "EltOffset": 0, "Name": "", "Size": 24, "Slot": 0, "Space": 0, "Type": "SRV" },
    { "EltOffset": 24, "Name": "", "Size": 24, "Slot": 5, "Space": 0, "Type": "SRV" },
    { "EltOffset": 48, "Name": "", "Size": 24, "Slot": 1, "Space": 0, "Type": "UAV" },
    { "EltOffset": 72, "Name": "", "Size": 24, "Slot": 2, "Space": 0, "Type": "UAV" },
    { "EltOffset": 96, "Name": "", "Size": 24, "Slot": 3, "Space": 0, "Type": "UAV" },
    { "EltOffset": 120, "Name": "", "Size": 24, "Slot": 4, "Space": 0, "Type": "UAV" },
    { "EltOffset": 144, "Name": "", "Size": 24, "Slot": 6, "Space": 0, "Type": "UAV" },
    { "EltOffset": 168, "Name": "", "Size": 24, "Slot": 7, "Space": 0, "Type": "UAV" },
    { "EltOffset": 192, "Name": "", "Size": 24, "Slot": 8, "Space": 0, "Type": "UAV" },
    { "EltOffset": 216, "Name": "", "Size": 24, "Slot": 9, "Space": 0, "Type": "UAV" }
  ],
  ...
}

From the reflection data we can see In and UIn SRV resource descriptors should be created at index 0 and 1, but this is not the case in the backend because it assumes heap index is 1:1 to the order they are declared in the DescriptorSets -> Resources field.

The PR fixes this by adding a hash map using Slot and Space values in the reflection as keys that maps to the actual index of the top level argument buffer when creating descriptors.

With the above changes, I get the following on my local machine:

Unexpectedly Passed Tests (16):
  OffloadTest-mtl :: Feature/ResourcesInStructs/res-in-struct-mix.test
  OffloadTest-mtl :: WaveOps/QuadReadAcrossX.32.test
  OffloadTest-mtl :: WaveOps/QuadReadAcrossX.int16.test
  OffloadTest-mtl :: WaveOps/WaveActiveMax.fp16.test
  OffloadTest-mtl :: WaveOps/WaveActiveMax.fp32.test
  OffloadTest-mtl :: WaveOps/WaveActiveMax.int16.test
  OffloadTest-mtl :: WaveOps/WaveActiveMax.int32.test
  OffloadTest-mtl :: WaveOps/WaveActiveSum.int16.test
  OffloadTest-mtl :: WaveOps/WaveActiveSum.int32.test
  OffloadTest-mtl :: WaveOps/WavePrefixSum.int16.test
  OffloadTest-mtl :: WaveOps/WaveReadLaneAt.16.test
  OffloadTest-mtl :: WaveOps/WaveReadLaneAt.32.test
  OffloadTest-mtl :: WaveOps/WaveReadLaneFirst.fp16.test
  OffloadTest-mtl :: WaveOps/WaveReadLaneFirst.fp32.test
  OffloadTest-mtl :: WaveOps/WaveReadLaneFirst.int16.test
  OffloadTest-mtl :: WaveOps/WaveReadLaneFirst.int32.test

kcloudy0717 · 2026-03-20T15:04:32Z

This should resolve: #642, #989, #393, #960, #351

kcloudy0717 · 2026-03-20T16:07:38Z

Pinging @bob80905, @farzonl, @llvm-beanz for review.

kcloudy0717 · 2026-03-20T16:09:46Z

Would it also be possible to integrate metal_irconverter in third-party so their reflection API can be used? Now there are two occasions (this PR, as well as reading tg_size) that we need to manually parse the json which could be prone to errors.

kcloudy0717 · 2026-03-21T05:26:09Z

Thinking about this more, I think we should consider using the explicit root signature binding model from the metal shader converter as that will give us most flexibility when configuring resources. Using the explicit model will make future tests such as unbound resource arrays or dynamic resource indexing via ResourceDescriptorHeap/SamplerDescriptorHeap easier.

Metal's explicit root signature binding model is exactly the same as D3D12's root signature model, the only caveat is that we need to convert DXIL -> AIR on the fly because setting the root signature is only available during shader conversion through either the API or command line. See https://developer.apple.com/metal/shader-converter/#binding-model for more info.

damyanp · 2026-03-23T17:58:27Z

@kcloudy0717 - this PR is marked as draft. If you're ready for it to be reviewed, can you click the "Ready for review" button please?

manon-traverse · 2026-03-31T18:24:42Z

+// From metal_irconverter.h which is not part of the third-party
+enum IRResourceType {


Should we include metal_irconvert.h as a dependency in third-party?

That would be preferred. This task was originally for looking into mac issues, adding a whole 3rd party library probably should in a separate task?

manon-traverse · 2026-03-31T18:34:18Z

-          if (auto Err = createDescriptor(R, IS, HeapIndex++))
-            return Err;
+
+      if (P.isCompute()) {


Is this only applicable to compute pipelines? I would imagine the same issue would pop up for graphics and ray tracing pipelines.

No, it's applicable to other stages too, but that was something I didn't look into because the reflection generated is separate. Ideally we use explicit root signature binding model when using the Metal Shader Converter as suggested in #998 (comment).

…s for different types Tests that are failing have same register but they are separate resource types (SRV vs UAV), the TLABIndexMap didn't account for that which caused resources with same register & space to be created in the same location.

bogner · 2026-04-02T16:51:49Z

Would it also be possible to integrate metal_irconverter in third-party so their reflection API can be used? Now there are two occasions (this PR, as well as reading tg_size) that we need to manually parse the json which could be prone to errors.

Yes, we can and should integrate metal_irconverter

Thinking about this more, I think we should consider using the explicit root signature binding model from the metal shader converter as that will give us most flexibility when configuring resources. Using the explicit model will make future tests such as unbound resource arrays or dynamic resource indexing via ResourceDescriptorHeap/SamplerDescriptorHeap easier.

I agree here too. Explicit root signatures is a better mapping of how these tests are written than trying to remap the bindings like we're doing here, and I would prefer that approach. Are you willing to take that on? Should we abandon this PR in preference of using that approach, or does it make sense to go with this in the short term and follow up with that later?

kcloudy0717 · 2026-04-02T17:16:16Z

Would it also be possible to integrate metal_irconverter in third-party so their reflection API can be used? Now there are two occasions (this PR, as well as reading tg_size) that we need to manually parse the json which could be prone to errors.

Yes, we can and should integrate metal_irconverter

Thinking about this more, I think we should consider using the explicit root signature binding model from the metal shader converter as that will give us most flexibility when configuring resources. Using the explicit model will make future tests such as unbound resource arrays or dynamic resource indexing via ResourceDescriptorHeap/SamplerDescriptorHeap easier.

I agree here too. Explicit root signatures is a better mapping of how these tests are written than trying to remap the bindings like we're doing here, and I would prefer that approach. Are you willing to take that on? Should we abandon this PR in preference of using that approach, or does it make sense to go with this in the short term and follow up with that later?

Sure, I can take on implementing the explicit root signature approach. I think we can abandon this PR, this PR served it's purpose on figuring out why most of the metal tests are failing. I don't think it make sense for this PR to be merged in the short term since once we integrated the RS approach, the code would most likely be removed.

kcloudy0717 · 2026-04-02T17:18:15Z

Closing this PR in favor of implementing the explicit RS approach for Metal shader converter that I'll be working on.

bogner · 2026-04-02T17:32:30Z

I've gone ahead and reworded #351 to explain what's actually happening here and duplicated #642, #989, #393, and #960 to it.

kcloudy0717 marked this pull request as ready for review March 23, 2026 18:03

kcloudy0717 force-pushed the kcloudy0717/metal/TopLevelArgumentBuffer branch from 0784078 to 7fa395b Compare March 31, 2026 16:50

manon-traverse reviewed Mar 31, 2026

View reviewed changes

kcloudy0717 added 4 commits April 1, 2026 23:14

[Metal] Fix top level argument buffer indexing when creating descriptors

ee20060

Removed XFAILs for passing tests

c6b9c52

Removed XFAILs for passing tests

d807b29

kcloudy0717 force-pushed the kcloudy0717/metal/TopLevelArgumentBuffer branch from ae9446d to d807b29 Compare April 1, 2026 15:14

kcloudy0717 closed this Apr 2, 2026

		// From metal_irconverter.h which is not part of the third-party
		enum IRResourceType {

Conversation

kcloudy0717 commented Mar 20, 2026

Uh oh!

kcloudy0717 commented Mar 20, 2026

Uh oh!

kcloudy0717 commented Mar 20, 2026

Uh oh!

kcloudy0717 commented Mar 20, 2026

Uh oh!

kcloudy0717 commented Mar 21, 2026

Uh oh!

damyanp commented Mar 23, 2026

Uh oh!

manon-traverse Mar 31, 2026

Choose a reason for hiding this comment

Uh oh!

kcloudy0717 Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

manon-traverse Mar 31, 2026

Choose a reason for hiding this comment

Uh oh!

kcloudy0717 Apr 1, 2026

Choose a reason for hiding this comment

Uh oh!

bogner commented Apr 2, 2026

Uh oh!

kcloudy0717 commented Apr 2, 2026

Uh oh!

kcloudy0717 commented Apr 2, 2026

Uh oh!

bogner commented Apr 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants