What's the proper way to use different versions of SSE intrinsics in GCC?

I will ask my question by giving an example. Now I have a function called do_something().

It has three versions: do_something(), do_something_sse3(), and do_something_sse4(). When my program runs, it will detect the CPU feature (see if it supports SSE3 or SSE4) and call one of the three versions accordingly.

The problem is: When I build my program with GCC, I have to set -msse4 for do_something_sse4() to compile (e.g. for the header file <smmintrin.h> to be included).

However, if I set -msse4, then GCC is allowed to use SSE4 instructions everywhere, and some intrinsics in do_something_sse3() are also translated to SSE4 instructions. So if my program runs on a CPU that supports only SSE3 (but not SSE4), it crashes with "illegal instruction" when it calls do_something_sse3().

Maybe my approach is bad practice. Could you give some suggestions? Thanks.

shengbinmeng asked Mar 23 '13 08:03



1 Answer

I think Mystical's tip is fine, but if you really want to do it in one file, you can use the appropriate pragmas, for instance:

#pragma GCC target("sse4.1")

GCC 4.4 or later is needed, AFAIR.
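To make the single-file idea more concrete, here is a minimal sketch using the per-function form of the same mechanism, __attribute__((target(...))), so that each version is compiled for its own instruction set while the rest of the file keeps the default options. It rests on several assumptions: GCC 4.9 or later on x86 (so <immintrin.h> can be included without -msse4, and __builtin_cpu_supports() is available), a float-summing body as a hypothetical stand-in for whatever do_something() really does, and the name do_something_generic(), which is made up here for the plain C fallback.

```c
#include <stddef.h>
#include <immintrin.h>  /* declares the SSE3 and SSE4.1 intrinsics used below */

/* Fallback: plain C, compiled with the file's default target options.
   (Hypothetical name; the body is only a placeholder workload.) */
static float do_something_generic(const float *x, size_t n)
{
    float sum = 0.0f;
    size_t i;
    for (i = 0; i < n; i++)
        sum += x[i];
    return sum;
}

/* Only this function is compiled with SSE3 enabled. */
__attribute__((target("sse3")))
static float do_something_sse3(const float *x, size_t n)
{
    float sum = 0.0f;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128 v = _mm_loadu_ps(x + i);
        v = _mm_hadd_ps(v, v);          /* (a+b, c+d, a+b, c+d) */
        v = _mm_hadd_ps(v, v);          /* a+b+c+d in every lane */
        sum += _mm_cvtss_f32(v);
    }
    for (; i < n; i++)                  /* leftover elements */
        sum += x[i];
    return sum;
}

/* Only this function is compiled with SSE4.1 enabled. */
__attribute__((target("sse4.1")))
static float do_something_sse4(const float *x, size_t n)
{
    const __m128 ones = _mm_set1_ps(1.0f);
    float sum = 0.0f;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128 v = _mm_loadu_ps(x + i);
        /* 0xF1: multiply all four lanes, write their sum to lane 0 */
        sum += _mm_cvtss_f32(_mm_dp_ps(v, ones, 0xF1));
    }
    for (; i < n; i++)                  /* leftover elements */
        sum += x[i];
    return sum;
}

/* Runtime dispatch: pick the best version the current CPU supports. */
float do_something(const float *x, size_t n)
{
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse4.1"))
        return do_something_sse4(x, n);
    if (__builtin_cpu_supports("sse3"))
        return do_something_sse3(x, n);
    return do_something_generic(x, n);
}
```

Built without any -msse3 or -msse4 flag, only the two attributed functions should contain SSE3 or SSE4.1 instructions, so calling do_something() on an older CPU ends up in a version that CPU can run. With a GCC older than 4.9 the intrinsic headers refuse to compile unless the feature is already enabled, so there the #pragma GCC target line has to come before the #include.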

konrad.kruczynski answered Oct 19 '22 14:10
