Can I make my compiler use fast-math on a per-function basis?

Suppose I have

template <bool UsesFastMath> void foo(float* data, size_t length);

and I want to compile one instantiation with -ffast-math (--use_fast_math for nvcc), and the other instantiation without it.

This can be achieved by instantiating each variant in a separate translation unit and compiling each with a different command line: one with the switch and one without.

My question is whether it's possible to indicate to popular compilers (*) to apply or not apply -ffast-math for individual functions - so that I'll be able to have my instantiations in the same translation unit.

Notes:

If the answer is "no", bonus points for explaining why not.
This is not the same question as this one, which is about turning fast-math on and off at runtime. I'm much more modest...

(*) By popular compilers I mean any of: gcc, clang, msvc, icc, nvcc (for GPU kernel code), whichever of them you have that information about.

asked Nov 19 '16 23:11

einpoklum

2 Answers

In GCC you can declare functions like the following:

__attribute__((optimize("-ffast-math")))
double
myfunc(double val)
{
    return val / 2;
}

This is a GCC-only feature.

See working example here -> https://gcc.gnu.org/ml/gcc/2009-10/msg00385.html

It seems that GCC does not verify the optimize() arguments, so typos like "-ffast-match" will be silently ignored.

answered Sep 30 '22 14:09

user2743554

As of CUDA 7.5 (the latest version I am familiar with, although CUDA 8.0 is currently shipping), nvcc does not support function attributes that allow programmers to apply specific compiler optimizations on a per-function basis.

Since optimization configurations set via command line switches apply to the entire compilation unit, one possible approach is to use as many different compilation units as there are different optimization configurations, as already noted in the question; source code may be shared and #include-ed from a common file.
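A minimal sketch of that multi-unit approach follows, assuming a single shared source file that is compiled twice with different switches. The file name foo_inst.cpp, the macro FOO_USES_FAST_MATH, the loop body, and the g++ command lines in the comments are all illustrative assumptions. With nvcc the same idea would use --use_fast_math in place of -ffast-math.

```cpp
// foo_inst.cpp (hypothetical file name), compiled twice, for example:
//   g++ -c -DFOO_USES_FAST_MATH=1 -ffast-math foo_inst.cpp -o foo_fast.o
//   g++ -c -DFOO_USES_FAST_MATH=0             foo_inst.cpp -o foo_precise.o
#include <cstddef>

#ifndef FOO_USES_FAST_MATH
#define FOO_USES_FAST_MATH 0  // default: build the precise variant
#endif

// Shared definition; in a larger project this could live in a common
// file that each translation unit #include-s.
template <bool UsesFastMath>
void foo(float* data, std::size_t length) {
    for (std::size_t i = 0; i < length; ++i) {
        data[i] = data[i] / 3.0f;
    }
}

// Explicit instantiation: each compilation emits exactly one variant,
// built with whatever switches that compilation was given.
template void foo<(FOO_USES_FAST_MATH != 0)>(float* data, std::size_t length);
```

Callers would see only the declaration of foo and link against both object files. One thing to watch for: inline helper functions shared by both units are merged by the linker, so a helper compiled with and without fast-math may end up with either version in the final binary.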

With nvcc, the command line switch --use_fast_math basically controls three areas of functionality:

Flush-to-zero mode is enabled (that is, denormal support is disabled)
Single-precision reciprocal, division, and square root are switched to approximate versions
Certain standard math functions are replaced by equivalent, lower-precision, intrinsics

You can apply some of these changes with per-operation granularity by using appropriate intrinsics, others by using PTX inline assembly.
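As a rough sketch of the per-operation route, the wrappers below choose between the ordinary operation and a CUDA fast intrinsic based on a template parameter. __fdividef and __sinf are existing CUDA device intrinsics; the wrapper names divide and sine, the macro FOO_HD, and the kernel foo_kernel are hypothetical. The sketch assumes the file is compiled without --use_fast_math, since that switch would also change the "precise" paths. When built as plain C++ or for the host, both variants fall back to the ordinary operations.

```cpp
#include <math.h>
#include <cstddef>

#ifdef __CUDACC__
#define FOO_HD __host__ __device__
#else
#define FOO_HD  // plain C++ build: no CUDA qualifiers
#endif

// Division: approximate intrinsic in device code when requested.
template <bool UsesFastMath>
FOO_HD inline float divide(float a, float b) {
#ifdef __CUDA_ARCH__
    if (UsesFastMath) return __fdividef(a, b);
#endif
    return a / b;
}

// Sine: lower-precision intrinsic in device code when requested.
template <bool UsesFastMath>
FOO_HD inline float sine(float x) {
#ifdef __CUDA_ARCH__
    if (UsesFastMath) return __sinf(x);
#endif
    return sinf(x);
}

#ifdef __CUDACC__
// Both instantiations can live in the same translation unit.
template <bool UsesFastMath>
__global__ void foo_kernel(float* data, std::size_t length) {
    std::size_t i = blockIdx.x * static_cast<std::size_t>(blockDim.x) + threadIdx.x;
    if (i < length) {
        data[i] = divide<UsesFastMath>(sine<UsesFastMath>(data[i]), 3.0f);
    }
}
#endif
```

This covers only the operations that have been wrapped explicitly. Flush-to-zero behavior is not affected by these intrinsics and would need the PTX inline assembly route mentioned above.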

answered Sep 30 '22 14:09

njuffa

Tags: floating-point, gcc, fast-math, nvcc, template-instantiation
