Do not use AVX2 instructions if the CPU doesn't support it#67
Closed
edwintorok wants to merge 2 commits intoOpenVisualCloud:masterfrom
Closed
Do not use AVX2 instructions if the CPU doesn't support it#67edwintorok wants to merge 2 commits intoOpenVisualCloud:masterfrom
edwintorok wants to merge 2 commits intoOpenVisualCloud:masterfrom
Conversation
Otherwise we crash with SIGILL on a CPU without AVX (e.g. AMD Opteron 6172). Signed-off-by: Edwin Török <edvin.torok@citrix.com>
To avoid a SIGILL on non-AVX2 CPU use the correct codepath. `ASM_TYPES & PREAVX2_MASK` is always `1`, which chooses the avx2 path and crashes. Use `ASM_TYPES & AVX2_MASK` which should be `0` if the CPU doesn't support AVX2, choosing the non-AVX2 path and avoiding a SIGILL. Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Author
|
Is there a test suite I could run to check whether it also works correctly? |
Contributor
|
Hi @edwintorok, thank you for this PR. |
Author
|
The appveyor CI error seems unrelated: |
Contributor
|
Hi @edwintorok, yes, it is not related. |
Contributor
|
Hi @edwintorok, fyi, before merging this PR, there is some effort to evaluate performance on legacy CPU. With the scarcity of legacy HW, it will take longer for us to move forward with this PR. |
Author
|
Closing this as obsolete, my AMD CPU has AVX2 too now. |
MartinPulec
added a commit
to MartinPulec/UltraGrid
that referenced
this pull request
Mar 11, 2026
It doesn't seem that libsvt_vp9 is better even for AVX2+ processors. From testing with i9-9820X and 9950X with `testcard2:noise:fps=200` we got 60 FPS for SVT and 72 for VPX in case of Intel and 153 and 128 for AMD. In the SVT case, the frame was significatly (250 framees) delaysed with default setting. Also, while libsvt_vp9 supports just yuv420p, libvpx-vp9 offers up to 12 bit 4:4:4 subsampled both YUV and RGB pixel formats. For the record, there has been a [patch](OpenVisualCloud/SVT-VP9#67) for libsvt_vp9 to support pre-AVX2, but it doesn't seem to make much sense to use.
MartinPulec
added a commit
to MartinPulec/UltraGrid
that referenced
this pull request
Mar 11, 2026
It doesn't seem that libsvt_vp9 is better even for AVX2+ processors. From testing with i9-9820X and 9950X with `testcard2:noise:fps=200` we got 60 FPS for SVT and 72 for VPX in case of Intel and 153 and 128 for AMD. In the SVT case, the frame was significatly (250 framees) delaysed with default setting. Also, while libsvt_vp9 supports just yuv420p, libvpx-vp9 offers up to 12 bit 4:4:4 subsampled both YUV and RGB pixel formats. For the record, there has been a [patch](OpenVisualCloud/SVT-VP9#67) for libsvt_vp9 to support pre-AVX2, but it doesn't seem to make much sense to use.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The code for checking CPUID and alternate non-avx2 implementations were all mostly in place, except:
xgetbvwas used before checking CPUID, which caused a SIGILLFixes #9
This code is part of the Phoronix test suite, and I noticed that it crashed on AMD Opteron Processor 6172, now it no longer crashes:
Note: other SVT projects suffer from same problem: