With the GCC compiler, the -ftree-vectorize option turns on auto-vectorization, and this flag is automatically set when using -O3. To what level does it vectorize? I.e., will I get SSE2, SSE4.2, AVX, or AVX2 instructions? I know of the existence of the -mavx, -mavx2 flags, etc., but I want to know what the compiler is doing without those specific flags to force a particular type of vectorization.
All x86 64-bit processors have at least SSE2. The GCC compiler will default to SSE2 code in 64-bit mode unless you tell it to use other hardware options.
For 32-bit mode GCC may use x87 instructions, which are not SIMD instructions, so to enable vectorization make sure to enable at least SSE with -mfpmath=sse -msse2.
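One way to see what a given set of flags actually targets is to look at the macros the compiler predefines, such as __SSE2__ and __AVX2__. The sketch below is only an illustration: the function name compiled_simd_level is made up for this example, and it checks just a handful of the available macros.

```c
/* Returns the name of the highest SIMD extension (among those checked
 * here) that the compiler was told it may use for this translation unit.
 * compiled_simd_level is a hypothetical helper name, not a GCC API. */
const char *compiled_simd_level(void)
{
#if defined(__AVX512F__)
    return "AVX-512F";
#elif defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE4_2__)
    return "SSE4.2";
#elif defined(__SSE4_1__)
    return "SSE4.1";
#elif defined(__SSE2__)
    return "SSE2";
#else
    return "none of the checked extensions";
#endif
}
```

Printing the result from main (for example with puts) in a plain 64-bit build would typically report SSE2, and rebuilding with -mavx2 should change the answer to AVX2. Some distributions configure GCC with a higher default baseline, so the plain result can differ.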
If you enable higher SIMD options then the compiler may (and in many cases will) use those new instructions when vectorizing.
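To watch this happen, a simple loop like the one below is a reasonable candidate for the auto-vectorizer. This is a sketch, with add_arrays being a name chosen for illustration. Compiling it with gcc -O3 -S and then again with gcc -O3 -mavx2 -S, and comparing the assembly, should show 128-bit xmm registers in the first case and 256-bit ymm registers in the second. Adding -fopt-info-vec asks GCC to report which loops it vectorized.

```c
#include <stddef.h>

/* Element-wise sum of two float arrays.
 * restrict promises the compiler that the arrays do not overlap, which
 * removes the need for a runtime aliasing check before vectorizing.
 * add_arrays is a hypothetical name used only for this example. */
void add_arrays(float *restrict dst,
                const float *restrict a,
                const float *restrict b,
                size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = a[i] + b[i];
}
```

Whether the loop is actually vectorized, and with which instructions, depends on the GCC version and its cost model, so checking the generated assembly is the only reliable confirmation.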
I believe this is true as well with Clang. However, ICC and MSVC do things differently. ICC may create a CPU dispatcher to select the best hardware (or to veto AMD hardware). MSVC only has options for enabling AVX and AVX2 in 64-bit mode (SSE2 is assumed). There is no way to explicitly enable e.g. SSE4.1 with MSVC. Instead, in some cases the auto-vectorizer will add code to check for SSE4.1 (but not AVX) and use those instructions. GCC will only use SSE4.1 if you tell it to, e.g. with -msse4.1 or something higher such as -mavx.
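Since GCC does not add such runtime checks on its own, one option is to write the dispatch by hand. The following is a minimal sketch, not something the answer above prescribes: it uses GCC's target function attribute and the __builtin_cpu_supports builtin, while the function names scale, scale_avx2 and scale_baseline are invented for the example. The x86-specific parts are guarded so the file still builds on other targets.

```c
#include <stddef.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
/* Compiled as if -mavx2 were given, for this function only. */
__attribute__((target("avx2")))
static void scale_avx2(float *data, size_t n, float factor)
{
    for (size_t i = 0; i < n; ++i)
        data[i] *= factor;
}
#endif

/* Compiled with whatever the command-line flags allow (SSE2 by default
 * in 64-bit mode). */
static void scale_baseline(float *data, size_t n, float factor)
{
    for (size_t i = 0; i < n; ++i)
        data[i] *= factor;
}

/* Picks an implementation at run time based on what the CPU reports. */
void scale(float *data, size_t n, float factor)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) {
        scale_avx2(data, n, factor);
        return;
    }
#endif
    scale_baseline(data, n, factor);
}
```

Both versions share the same source loop; only the instruction set the compiler may use differs, so the binary stays runnable on CPUs without AVX2.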