I am planning to implement runtime detection of SIMD extensions. Is it such that if I find out that the processor has AVX2 support, it is also guaranteed to have SSE4.2 and AVX support?
AVX2 implies AVX, and AVX implies that the VEX encoding of SSE4. 2 instructions like vpcmpistri are available.
CPUs with AVX Generally, CPUs with the commercial denomination Core i3/i5/i7/i9 support them, whereas Pentium and Celeron CPUs before Tiger Lake do not.
Intel has had SSE 4.2 for about 12 years now as it was introduced in their Nehalem architecture, AMD has had it for about 9 years. Basically every Intel Core i3, i5, i7, and AMD FX and Ryzen CPU supports this as well. The i7-10700 requires an LGA1200 motherboard, I would recommend an H470 motherboard for your use case.
Support for a more-recent Intel SIMD ISA extension implies support for previous SIMD ones.
AVX2 definitely implies AVX1.
I think AVX1 implies all of SSE/SSE2/SSE3/SSSE3/SSE4.1/SSE4.2 feature bits must also be set in CPUID. If not formally guaranteed, many things make this assumption and a CPU that violated it would probably not be commercially viable for general use.
Note that popcnt
has its own feature bit, so in theory you could have a CPU with AVX2 and SSE4.2, but not popcnt
, but many things treat SSE4.2 as implying popcnt
. So it's more like you can advertize support for popcnt without SSE4.2.
In theory you could make a CPU (or virtual machine) with AVX but which didn't accept the non-VEX legacy-SSE encoding of SSE4.2 instructions like pcmpistri
, but I think you'd be violating Intel's guarantees about what the AVX feature bit implies. Not sure if that's formally written down in a manual, but most software will assume that.
But AVX1 does imply support for the VEX encoding of all SSE4.2 and earlier SIMD instructions, e.g. vpcmpistri
or vminss
gcc -mavx2
definitely implies AVX1 and previous extensions, but will only emit code that uses the VEX encoding. It will define the __SSE4_2__
macro and so on, though, so gcc does treat AVX2 as implying earlier SSE extensions and popcnt, but not FMA, AES-NI or PCLMUL. Those are separate features even for GCC.
(In practice you should use gcc -march=native
or gcc -march=znver1
or whatever to enable all the features your CPU has, and set tuning options for it. Not just -mavx2 -mfma
, that leaves tuning settings at bad defaults like splitting every possibly-unaligned 256-bit load/store into 128-bit halves.)
(Note that MSVC doesn't have as many SIMD ISA detection macros; it has one for AVX but not for all of the earlier SSE* extensions. MSVC's model is designed around the assumption that programs will do runtime CPU detection instead of being compiled for the local machine. Although MSVC does now have AVX and AVX2 options to use those as baselines.)
Note that AVX512 kind of breaks the traditions. AVX512F implies support for AVX2 and everything before it, but beyond that AVX512DQ doesn't come "before" or "after" AVX512ER, for example. You can (in theory) have either, both, or neither. (In practice, Skylake-X/Cannonlake/etc. has only a bit of overlap with Xeon Phi (Knight's Landing / Knight's Mill), beyond AVX512F. https://en.wikipedia.org/wiki/AVX-512#CPUs_with_AVX-512
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With