I have written a program with AVX intrinsics that works well on Ubuntu 12.04 LTS with GCC 4.6, using the following compilation line: g++ -g -Wall -mavx ProgramName.cc -o ProgramName
The problem started when I updated the compiler to versions 4.7 and 4.8.1 to get the 16-bit AVX2 intrinsics, which are not supported in GCC 4.6.
The updated GCC compiles both the AVX and the AVX2 programs without complaint. However, when I run the program I get the following error: Illegal instruction (core dumped), although it worked with GCC 4.6.
My question is: what is the right way to compile and run both AVX and AVX2 intrinsics?
Comparing vector indexes: vector retrieval is consistently faster with the AVX-512 instruction set than with AVX2. This is because AVX-512 operates on 512-bit vectors, compared to 256-bit vectors on AVX2.
AVX2 (also known as Haswell New Instructions) extends most integer instructions to 256 bits and introduces new instructions.
If gcc -v shows that GCC was configured with a --with-arch option (or --with-arch-32 and/or --with-arch-64), then that is what the default will be. Without a --with-arch option (and if there isn't a custom specs file in use), the arch used will be the default for the target.
"Native" means that the code produced will run only on that type of CPU. The applications built with -march=native on an Intel Core CPU will not be able to run on an old Intel Atom CPU. Also available are the -mtune and -mcpu flags.
If you tell gcc to use AVX2, it will do so, regardless of whether your CPU supports it or not. That can be useful for cross-compiling or for examining gcc's code generation, but it's not particularly helpful for running programs. If your program crashes with an illegal instruction exception, the most likely cause is that your CPU does not support the AVX2 extension.
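To confirm whether the CPU is the cause, the program can ask the processor at run time. The following is a minimal sketch, assuming GCC 4.8 or later (or Clang) on x86/x86-64, where the __builtin_cpu_supports built-in is available. The helper name cpu_feature_report is made up for illustration.

```cpp
// Runtime CPU feature check; needs GCC 4.8+ (or Clang) on x86 / x86-64.
// This file itself can be compiled without -mavx or -mavx2.
#include <string>

// cpu_feature_report is a hypothetical helper, not part of any library.
// It returns a two-line report of what the running CPU supports.
std::string cpu_feature_report() {
    // Only strictly required when called before static constructors run;
    // calling it here is harmless.
    __builtin_cpu_init();

    std::string report;
    report += "avx:  ";
    report += __builtin_cpu_supports("avx") ? "yes" : "no";
    report += "\navx2: ";
    report += __builtin_cpu_supports("avx2") ? "yes" : "no";
    report += "\n";
    return report;
}
```

Printing the returned string from main shows at a glance whether an "avx2: no" machine is being asked to run code built with -mavx2.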
On i386 and x86-64 platforms (and in certain other circumstances), you can specify the gcc option -march=native to generate code for the host machine's instruction set. The compiled code might not work on another machine with fewer capabilities, but it should allow you to use all the features of your machine.
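One way to see what a given set of flags actually enabled is to look at the macros the compiler predefines: GCC defines __AVX__ when AVX code generation is on and __AVX2__ when AVX2 is on. The sketch below reports them for the current build; the function name compiled_for is an assumption for illustration only.

```cpp
// Reports which AVX-family extensions this translation unit was compiled for.
// The answer depends on the flags used (-mavx, -mavx2, -march=native, ...),
// not on the CPU the program later runs on.
#include <string>

// compiled_for is a hypothetical helper name.
std::string compiled_for() {
    std::string s;
#ifdef __AVX__
    s += "AVX ";    // set by -mavx, or by any -march value that implies AVX
#endif
#ifdef __AVX2__
    s += "AVX2 ";   // set by -mavx2, or by any -march value that implies AVX2
#endif
    if (s.empty()) {
        s = "baseline (no AVX)";
    }
    return s;
}
```

Building the same file once with -mavx and once with -march=native, then printing the result, shows whether -march=native picks up AVX2 on the host.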
While -march=native is a good solution for generating executables, it does not actually help much with writing code; you still need to tailor the intrinsics to the target's architecture, and writing code which can take advantage of CPU features without relying on them gets complicated. I don't know of a good C solution, but there are several C++ template frameworks available.
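For a small number of hot functions, a hand-written dispatcher is one possible alternative to a framework. The sketch below assumes GCC 4.9 or later (or Clang) on x86-64, where intrinsics can be used inside a function marked with the target attribute even if the file is compiled without -mavx2. The function names add_i16, add_i16_avx2 and add_i16_scalar are invented for this example.

```cpp
// Runtime dispatch between an AVX2 path and a portable fallback.
// Compile without -mavx2, e.g.: g++ -O2 -Wall file.cc
#include <cstddef>
#include <cstdint>
#include <immintrin.h>

// Portable fallback: element-wise sum of two arrays of 16-bit integers.
static void add_i16_scalar(const std::int16_t* a, const std::int16_t* b,
                           std::int16_t* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = static_cast<std::int16_t>(a[i] + b[i]);
    }
}

// AVX2 path. Only this function is compiled with AVX2 enabled.
__attribute__((target("avx2")))
static void add_i16_avx2(const std::int16_t* a, const std::int16_t* b,
                         std::int16_t* out, std::size_t n) {
    std::size_t i = 0;
    // Process 16 lanes of 16 bits (256 bits) per iteration.
    for (; i + 16 <= n; i += 16) {
        __m256i va = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(a + i));
        __m256i vb = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(b + i));
        _mm256_storeu_si256(reinterpret_cast<__m256i*>(out + i),
                            _mm256_add_epi16(va, vb));
    }
    // Handle the remaining elements one at a time.
    for (; i < n; ++i) {
        out[i] = static_cast<std::int16_t>(a[i] + b[i]);
    }
}

// Public entry point: picks the AVX2 path only if the running CPU has it.
void add_i16(const std::int16_t* a, const std::int16_t* b,
             std::int16_t* out, std::size_t n) {
    if (__builtin_cpu_supports("avx2")) {
        add_i16_avx2(a, b, out, n);
    } else {
        add_i16_scalar(a, b, out, n);
    }
}
```

Because the AVX2 instructions are confined to one function that is only reached after the check, the same binary should run on a CPU without AVX2 instead of dying with an illegal instruction.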
Upgrading to gcc 4.8 likely changed which instructions end up in the binary (for example through -mavx2), so you would have needed to limit the generated instruction mix to ONLY the extensions your machine actually supports.