How to detect SSE/SSE2/AVX/AVX2/AVX-512/AVX-128-FMA/KCVI availability at compile-time?

Tags: gcc, avx, clang, sse, avx512

I'm trying to optimize some matrix computations, and I was wondering whether it is possible to detect at compile time if SSE/SSE2/AVX/AVX2/AVX-512/AVX-128-FMA/KCVI[1] is enabled by the compiler. Ideally for GCC and Clang, but I can manage with only one of them.

I'm not sure it is possible, and perhaps I will use my own macro, but I'd prefer detecting it rather than asking the user to select it.

[1] "KCVI" stands for Knights Corner Vector Instruction optimizations. Libraries like FFTW detect/utilize these newer instruction optimizations.

801

asked Mar 09 '15 10:03

Baptiste Wicht

2 Answers

Most compilers will automatically define:

    __SSE__
    __SSE2__
    __SSE3__
    __AVX__
    __AVX2__

etc, according to whatever command line switches you are passing. You can easily check this with gcc (or gcc-compatible compilers such as clang), like this:

    $ gcc -msse3 -dM -E - < /dev/null | egrep "SSE|AVX" | sort
    #define __SSE__ 1
    #define __SSE2__ 1
    #define __SSE2_MATH__ 1
    #define __SSE3__ 1
    #define __SSE_MATH__ 1

or:

    $ gcc -mavx2 -dM -E - < /dev/null | egrep "SSE|AVX" | sort
    #define __AVX__ 1
    #define __AVX2__ 1
    #define __SSE__ 1
    #define __SSE2__ 1
    #define __SSE2_MATH__ 1
    #define __SSE3__ 1
    #define __SSE4_1__ 1
    #define __SSE4_2__ 1
    #define __SSE_MATH__ 1
    #define __SSSE3__ 1

or to just check the pre-defined macros for a default build on your particular platform:

    $ gcc -dM -E - < /dev/null | egrep "SSE|AVX" | sort
    #define __SSE2_MATH__ 1
    #define __SSE2__ 1
    #define __SSE3__ 1
    #define __SSE_MATH__ 1
    #define __SSE__ 1
    #define __SSSE3__ 1
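Once you know which macros the compiler defines, you can branch on them in the preprocessor. Here is a minimal sketch, assuming GCC or Clang on x86 with `<immintrin.h>` available. The function name `add_arrays` is made up for illustration, and the code falls back to plain scalar C when neither `__AVX__` nor `__SSE__` is defined:

```c
#include <stddef.h>

#if defined(__AVX__) || defined(__SSE__)
#include <immintrin.h>  /* x86 SIMD intrinsics; assumed available on GCC/Clang */
#endif

/* Hypothetical helper: dst[i] = a[i] + b[i], using the widest vector
 * width that was enabled on the compiler command line. */
static void add_arrays(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
#if defined(__AVX__)
    /* 8 floats per iteration with 256-bit AVX registers. */
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);
        __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(dst + i, _mm256_add_ps(va, vb));
    }
#elif defined(__SSE__)
    /* 4 floats per iteration with 128-bit SSE registers. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(dst + i, _mm_add_ps(va, vb));
    }
#endif
    /* Scalar tail; also the full loop when no SIMD macro is defined. */
    for (; i < n; ++i)
        dst[i] = a[i] + b[i];
}
```

The choice is fixed at compile time, so a binary built with `-mavx` will fault on a CPU without AVX. If you need one binary for several CPUs, you would have to combine this with run-time dispatch.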

More recent Intel processors support AVX-512, which is not a monolithic instruction set but a family of subsets that vary from one processor to another. The two examples below show which subsets GCC (version 6.2) enables for two different targets.

Here is Knights Landing:

    $ gcc -march=knl -dM -E - < /dev/null | egrep "SSE|AVX" | sort
    #define __AVX__ 1
    #define __AVX2__ 1
    #define __AVX512CD__ 1
    #define __AVX512ER__ 1
    #define __AVX512F__ 1
    #define __AVX512PF__ 1
    #define __SSE__ 1
    #define __SSE2__ 1
    #define __SSE2_MATH__ 1
    #define __SSE3__ 1
    #define __SSE4_1__ 1
    #define __SSE4_2__ 1
    #define __SSE_MATH__ 1
    #define __SSSE3__ 1

Here is Skylake AVX-512:

    $ gcc -march=skylake-avx512 -dM -E - < /dev/null | egrep "SSE|AVX" | sort
    #define __AVX__ 1
    #define __AVX2__ 1
    #define __AVX512BW__ 1
    #define __AVX512CD__ 1
    #define __AVX512DQ__ 1
    #define __AVX512F__ 1
    #define __AVX512VL__ 1
    #define __SSE__ 1
    #define __SSE2__ 1
    #define __SSE2_MATH__ 1
    #define __SSE3__ 1
    #define __SSE4_1__ 1
    #define __SSE4_2__ 1
    #define __SSE_MATH__ 1
    #define __SSSE3__ 1

Intel has disclosed additional AVX-512 subsets (see ISA extensions). GCC (version 7) supports compiler flags and preprocessor symbols associated with the 4FMAPS, 4VNNIW, IFMA, VBMI and VPOPCNTDQ subsets of AVX-512:

    $ for i in 4fmaps 4vnniw ifma vbmi vpopcntdq ; do echo "==== $i ====" ; gcc -mavx512$i -dM -E - < /dev/null | egrep "AVX512" | sort ; done
    ==== 4fmaps ====
    #define __AVX5124FMAPS__ 1
    #define __AVX512F__ 1
    ==== 4vnniw ====
    #define __AVX5124VNNIW__ 1
    #define __AVX512F__ 1
    ==== ifma ====
    #define __AVX512F__ 1
    #define __AVX512IFMA__ 1
    ==== vbmi ====
    #define __AVX512BW__ 1
    #define __AVX512F__ 1
    #define __AVX512VBMI__ 1
    ==== vpopcntdq ====
    #define __AVX512F__ 1
    #define __AVX512VPOPCNTDQ__ 1
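Because AVX-512 is split into subsets, checking `__AVX512F__` alone may not be enough for a given kernel. As a rough sketch, you could collect the subset macros into a bit mask and gate a kernel on the exact combination it needs. The enum names, `avx512_compiled_subsets`, and `KERNEL_USES_AVX512_BW_VL` are all invented for this example. Only the `__AVX512*__` macros come from the compiler:

```c
/* Hypothetical flags, one per AVX-512 subset this sketch cares about. */
enum {
    AVX512_F  = 1 << 0,
    AVX512_CD = 1 << 1,
    AVX512_BW = 1 << 2,
    AVX512_DQ = 1 << 3,
    AVX512_VL = 1 << 4
};

/* Returns the AVX-512 subsets that were enabled when this file was compiled. */
static int avx512_compiled_subsets(void)
{
    int mask = 0;
#ifdef __AVX512F__
    mask |= AVX512_F;
#endif
#ifdef __AVX512CD__
    mask |= AVX512_CD;
#endif
#ifdef __AVX512BW__
    mask |= AVX512_BW;
#endif
#ifdef __AVX512DQ__
    mask |= AVX512_DQ;
#endif
#ifdef __AVX512VL__
    mask |= AVX512_VL;
#endif
    return mask;
}

/* Example gate: a kernel that needs byte/word operations on 128/256-bit
 * vectors would require F, BW and VL together. */
#if defined(__AVX512F__) && defined(__AVX512BW__) && defined(__AVX512VL__)
#define KERNEL_USES_AVX512_BW_VL 1
#else
#define KERNEL_USES_AVX512_BW_VL 0
#endif
```

With `-march=knl` the gate above would be 0 (no BW or VL in the output shown earlier), while with `-march=skylake-avx512` it would be 1.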

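Note that the SSE macros won't work with Visual C++, which does not define `__SSE__`, `__SSE2__`, etc. For 32-bit x86 builds you have to check `_M_IX86_FP` instead (0 for no SSE, 1 for `/arch:SSE`, 2 for `/arch:SSE2` or higher); x64 builds always have SSE2. Visual C++ does define `__AVX__` and `__AVX2__` when you compile with `/arch:AVX` or `/arch:AVX2`.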
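If you need one header that works across GCC, Clang and Visual C++, something along these lines might do. This is only a sketch: the `HAVE_*` names and `best_simd` are invented here, and it assumes that `_M_X64` without `_M_ARM64EC` means a real x64 target, where SSE2 is part of the baseline:

```c
/* Hypothetical portable feature macros; each HAVE_* is always 0 or 1. */

/* SSE2: GCC/Clang macro, any x64 MSVC build, or 32-bit MSVC with /arch:SSE2+. */
#if defined(__SSE2__) || \
    (defined(_M_X64) && !defined(_M_ARM64EC)) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#define HAVE_SSE2 1
#else
#define HAVE_SSE2 0
#endif

/* SSE: implied by SSE2, or 32-bit MSVC with /arch:SSE. */
#if defined(__SSE__) || HAVE_SSE2 || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#define HAVE_SSE 1
#else
#define HAVE_SSE 0
#endif

/* AVX and later use the same macro names on GCC, Clang and MSVC. */
#if defined(__AVX__)
#define HAVE_AVX 1
#else
#define HAVE_AVX 0
#endif

#if defined(__AVX2__)
#define HAVE_AVX2 1
#else
#define HAVE_AVX2 0
#endif

#if defined(__AVX512F__)
#define HAVE_AVX512F 1
#else
#define HAVE_AVX512F 0
#endif

/* Name of the widest instruction set enabled at compile time. */
static const char *best_simd(void)
{
    if (HAVE_AVX512F) return "AVX-512F";
    if (HAVE_AVX2)    return "AVX2";
    if (HAVE_AVX)     return "AVX";
    if (HAVE_SSE2)    return "SSE2";
    if (HAVE_SSE)     return "SSE";
    return "scalar";
}
```

Defining the macros as 0 or 1 rather than leaving them undefined lets you write `#if HAVE_AVX2` and also use them in ordinary `if` statements, as `best_simd` does.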

145

answered Sep 18 '22 07:09

Paul R

Take a look at archspec, a library which was built exactly for this purpose: https://github.com/archspec/archspec

answered Sep 18 '22 07:09

Kenneth Hoste
