Skip to content

Add WaveActiveBitXor tests#971

Merged
bob80905 merged 22 commits intollvm:mainfrom
bob80905:add_waveactivebitxor_tests
Apr 3, 2026
Merged

Add WaveActiveBitXor tests#971
bob80905 merged 22 commits intollvm:mainfrom
bob80905:add_waveactivebitxor_tests

Conversation

@bob80905
Copy link
Copy Markdown
Contributor

This PR adds tests for WaveActiveBitXor.
WaveActiveBitXor only accepts uint and uint64 types.
The PR also adds a control flow test, since this operation is convergent.
Fixes #896

Copy link
Copy Markdown
Contributor

@tex3d tex3d left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine, but had a suggestion to make the constant case less of a no-op test, and suggest trying 3 active threads for a non-constant scalar.

Out4[TID.x] = WaveActiveBitXor(V);

// constant folding uint4
Out5[TID.x] = WaveActiveBitXor(uint4(1,2,3,4));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even number of identical values XOR'd together will always be zero, and expecting all zero doesn't really test much.

There are also no tests that include an odd number of active lanes, and ensure that a bit set in each lane is set after that.

You could do something more interesting, like:

Suggested change
Out5[TID.x] = WaveActiveBitXor(uint4(1,2,3,4));
Out5[TID.x] = WaveActiveBitXor(uint4(1,2,3,4));
if (TID.x != 1)
Out5[TID.x + 4] = WaveActiveBitXor(uint4(1,2,3,4));
if (TID.x % 2)
Out5[TID.x + 4 * 2] = WaveActiveBitXor(uint4(1,2,3,4));
if (TID.x == 1)
Out5[4 * 3] = WaveActiveBitXor(uint4(1,2,3,4));

Which should result in:

    Data: [
        0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, // 4 threads (0,1,2,3)
        0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
        0x1, 0x2, 0x3, 0x4, 0x0, 0x0, 0x0, 0x0, // 3 threads (0,2,3)
        0x1, 0x2, 0x3, 0x4, 0x1, 0x2, 0x3, 0x4,
        0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, // 2 threads (0,2)
        0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
        0x1, 0x2, 0x3, 0x4 // 1 thread (1)
    ]

It would also be more interesting if you did something like the 3-thread case with the non-constant inputs (it could be just scalar at that point).

Copy link
Copy Markdown
Contributor

@bogner bogner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tex's suggestions for slightly better test coverage are good, otherwise LGTM

@bob80905 bob80905 added the test-all When applied to a PR this will opt-in to additional pre-merge test configurations.. label Apr 1, 2026
@bob80905 bob80905 merged commit 28e49e8 into llvm:main Apr 3, 2026
16 of 23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

test-all When applied to a PR this will opt-in to additional pre-merge test configurations..

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add test for WaveActiveBitXor

3 participants