Does Clang have something like #pragma GCC target?

Tags: avx, clang, pragma, intrinsics

I have some code written that uses AVX intrinsics when they are available on the current CPU. In GCC and Clang, unlike Visual C++, in order to use intrinsics, you must enable them on the command line.

The problem with GCC and Clang is that when you enable these options, you're giving the compiler free rein to use those instructions everywhere in your source file. This is very bad when you have header files containing inline functions or template functions, because the compiler will generate these functions with AVX instructions.

When linking, duplicate functions will be discarded. However, because some source files were compiled with -mavx and some were not, the various compilations of the inline/template functions will be different. If you're unlucky, the linker will randomly choose the version that has AVX instructions, causing the program to crash when run on a system without AVX.

GCC solves this with #pragma GCC target. You can turn off the special instructions for the header files, and the code generated will not use AVX:

#pragma GCC push_options
#pragma GCC target("no-avx")

#include "MyHeader.h"

#pragma GCC pop_options

Does Clang have anything like this? It seems to ignore these options and generates AVX code anyway.


asked Sep 11 '17 23:09

Myria

2 Answers

The Clang equivalents of GCC push_options / GCC target / GCC pop_options are the clang attribute push / clang attribute pop pragmas, used together with the target attribute:

#pragma clang attribute push (__attribute__((target("pclmul,sse4.1,ssse3"))), apply_to=function)
// ...
#pragma clang attribute pop

This is the equivalent of:

#pragma GCC push_options
#pragma GCC target("pclmul", "sse4.1", "ssse3")
// ...
#pragma GCC pop_options

Note that where the GCC target pragma takes a comma-delimited list of target options, the clang target attribute takes a single string, comma-delimited internally.

Clang supports negative target options (such as "no-avx"), but I prefer to use positive options to add to the feature set selected by command line options.
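As a rough sketch of the positive-option approach, the following keeps the translation unit at the command-line baseline and enables AVX only for one function, which is then selected at run time. The names sum_scalar, sum_avx and sum are made up for illustration, and the sketch assumes GCC or Clang; it uses the __builtin_cpu_supports builtin for the run-time check and falls back to the scalar path on non-x86 targets.

```cpp
#include <cstddef>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#define SKETCH_X86 1
#endif

// Baseline version: compiled with only the features given on the command line.
static float sum_scalar(const float* data, std::size_t n) {
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i) total += data[i];
    return total;
}

#ifdef SKETCH_X86
// Enable AVX for the functions in this region only.
// Clang also defines __GNUC__, so test __clang__ first.
#if defined(__clang__)
#pragma clang attribute push (__attribute__((target("avx"))), apply_to=function)
#else
#pragma GCC push_options
#pragma GCC target("avx")
#endif

static float sum_avx(const float* data, std::size_t n) {
    __m256 acc = _mm256_setzero_ps();
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8)
        acc = _mm256_add_ps(acc, _mm256_loadu_ps(data + i));

    // Horizontal add of the eight lanes.
    __m128 s = _mm_add_ps(_mm256_castps256_ps128(acc),
                          _mm256_extractf128_ps(acc, 1));
    s = _mm_add_ps(s, _mm_movehl_ps(s, s));
    s = _mm_add_ss(s, _mm_shuffle_ps(s, s, 1));
    float total = _mm_cvtss_f32(s);

    // Leftover elements that did not fill a whole vector.
    for (; i < n; ++i) total += data[i];
    return total;
}

#if defined(__clang__)
#pragma clang attribute pop
#else
#pragma GCC pop_options
#endif
#endif  // SKETCH_X86

// Dispatcher: compiled at the baseline, so it is safe to call on any CPU.
float sum(const float* data, std::size_t n) {
#ifdef SKETCH_X86
    if (__builtin_cpu_supports("avx")) return sum_avx(data, n);
#endif
    return sum_scalar(data, n);
}
```

Only sum_avx is allowed to contain AVX instructions here, so inline and template functions pulled in from headers outside the pragma region stay at the baseline feature set.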

answered Sep 20 '22 10:09

ecatmur

You should probably be using static inline instead of inline, so a version of a function compiled with -mavx will only be used by callers from that translation unit.

The linker will still merge actual duplicates, instead of just picking one non-inline definition by name.

This also has the advantage that the compiler doesn't waste time emitting a stand-alone definition for functions that it decides to inline into every caller in that translation unit.
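A minimal sketch of what that looks like in a header such as the MyHeader.h from the question (the dot function is a made-up example, not something from the original code):

```cpp
// MyHeader.h
#include <cstddef>

// static gives the function internal linkage, so every translation unit
// that includes this header gets its own private copy. A copy compiled
// with -mavx is therefore never chosen by the linker for a caller in a
// translation unit that was compiled without it.
static inline float dot(const float* a, const float* b, std::size_t n) {
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i) total += a[i] * b[i];
    return total;
}
```

The trade-off is that copies which do not get inlined can be duplicated in the binary, once per translation unit that needs a stand-alone definition.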

The gcc/clang way makes sense if you're used to it and design your code for it. And note that MSVC needs AVX enabled if you're compiling functions that use AVX. Otherwise it will mix VEX and non-VEX encodings, leading to large transition penalties, instead of using the VEX encoding for something like a 128-bit _mm_add_ps in a horizontal add at the end of a _mm256_add_ps loop.

So you basically have the same problem with MSVC: once AVX is enabled, compiling _mm_whatever will produce AVX-only machine code.

answered Sep 17 '22 10:09

Peter Cordes
