
-ftree-vectorize option in GNU

With the GCC compiler, the -ftree-vectorize option turns on auto-vectorization, and this flag is automatically set when using -O3. To what level does it vectorize? That is, will I get SSE2, SSE4.2, AVX, or AVX2 instructions? I know about the -mavx and -mavx2 flags and the like, but I want to know what the compiler does without those specific flags to force a particular type of vectorization.

R_Kapp asked Nov 06 '15 15:11



1 Answer

All x86 64-bit processors have at least SSE2. The GCC compiler will default to SSE2 code in 64-bit mode unless you tell it to use other hardware options.

In 32-bit mode GCC may use x87 instructions, which are not SIMD instructions, so to get vectorization make sure to enable at least SSE2 with -mfpmath=sse -msse2.

If you enable higher SIMD options then the compiler may (and in many cases will) use those new instructions when vectorizing.
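One way to see this for yourself is to compile a small loop with and without a higher SIMD option and compare the generated assembly. The sketch below is only an illustration: the `saxpy` function and its file name are hypothetical and not part of the original question, and the exact instructions emitted depend on the GCC version and tuning defaults.

```c
#include <stddef.h>

/* Hypothetical example kernel (not from the original post): y[i] += a * x[i].
 * `restrict` promises the compiler that x and y do not overlap, which
 * removes one common obstacle to auto-vectorization. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; ++i) {
        y[i] += a * x[i];
    }
}
```

Assuming the code is saved as saxpy.c, `gcc -O3 -S saxpy.c` on x86-64 should produce packed SSE instructions on the 128-bit xmm registers, while `gcc -O3 -mavx2 -S saxpy.c` will typically switch to VEX-encoded instructions on the 256-bit ymm registers. Adding `-fopt-info-vec` asks GCC to report which loops it vectorized.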

I believe this is true with Clang as well. However, ICC and MSVC do things differently. ICC may create a CPU dispatcher to select the best hardware (or to veto AMD hardware). MSVC only has options for enabling AVX and AVX2 in 64-bit mode (SSE2 is assumed). There is no way to explicitly enable, e.g., SSE4.1 with MSVC. Instead, in some cases the auto-vectorizer will add code to check for SSE4.1 (but not AVX) at run time and use those instructions. GCC will only use SSE4.1 if you tell it to, e.g. with -msse4.1 or something higher such as -mavx.
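To check which level GCC or Clang is actually targeting for a given set of flags, you can test the macros these compilers predefine for each enabled instruction set. This is a minimal sketch; the function name `target_simd_level` is made up for illustration, and it only distinguishes the extensions mentioned in the question.

```c
/* Hypothetical helper (name chosen for illustration): report the highest
 * x86 SIMD extension, among those listed in the question, that this
 * translation unit was compiled for. GCC and Clang predefine these macros
 * according to the -m... / -march=... flags in effect. */
const char *target_simd_level(void)
{
#if defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE4_2__)
    return "SSE4.2";
#elif defined(__SSE4_1__)
    return "SSE4.1";
#elif defined(__SSE2__)
    return "SSE2";
#else
    return "no SSE2 (or not an x86 target)";
#endif
}
```

With a stock x86-64 GCC and no -m options this should report SSE2, and building with -mavx2 should report AVX2; a distribution that configures a different default architecture may differ. The same macros can be listed without writing any code via `gcc -O3 -dM -E - < /dev/null`.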

Z boson answered Sep 30 '22 00:09