What does gcc's ffast-math actually do?

-ffast-math does a lot more than just break strict IEEE compliance.

First of all, of course, it does break strict IEEE compliance, allowing e.g. the reordering of instructions to something which is mathematically the same (ideally) but not exactly the same in floating point.
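To make the reordering point concrete, here is a minimal sketch showing that the grouping of a sum changes the result in double precision. The function names `sum_left` and `sum_right` are made up for illustration, and the `volatile` temporaries are only there so the compiler keeps the grouping as written.

```c
/* Computes (a + b) + c. The volatile temporary pins the grouping. */
double sum_left(double a, double b, double c) {
    volatile double t = a + b;
    return t + c;
}

/* Computes a + (b + c), which is the same sum mathematically. */
double sum_right(double a, double b, double c) {
    volatile double t = b + c;
    return a + t;
}
```

With a = 1e16, b = -1e16 and c = 1.0, `sum_left` returns 1.0 while `sum_right` returns 0.0, because -1e16 + 1.0 rounds back to -1e16. A compiler that is allowed to regroup the sum may produce either answer.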

Second, it disables setting errno after single-instruction math functions (this is the -fno-math-errno part), which means avoiding a write to a thread-local variable (this can make a 100% difference for those functions on some architectures).

Third, it makes the assumption that all math is finite (-ffinite-math-only), which means that no checks for NaN (or infinity) are made in places where they would have detrimental effects. It is simply assumed that this isn't going to happen.
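One practical consequence, sketched below with a made-up helper name `is_nan_by_compare`: a hand-written NaN test relies on NaN comparing unequal to itself, and a compiler that assumes NaN never occurs may fold that comparison to a constant 0.

```c
#include <math.h>

/* Classic NaN test: NaN is the only value that compares unequal to itself.
   Under a finite-math assumption the compiler may reduce this to "return 0". */
int is_nan_by_compare(double x) {
    return x != x;
}
```

In a strict build `is_nan_by_compare(NAN)` is 1. In a fast-math build it should not be relied upon, so input validation of this kind is best done in code compiled without the flag.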

Fourth, it enables reciprocal approximations for division and reciprocal square root.
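The simplest form of this is replacing x / y with x * (1 / y). The sketch below writes both forms out by hand so they can be compared; `divide` and `divide_via_reciprocal` are illustrative names, and the `volatile` temporary only stops the compiler from merging the two steps back into one division.

```c
/* Plain division, rounded once. */
double divide(double x, double y) {
    return x / y;
}

/* The rewritten form: one reciprocal, then a multiply.
   There are two roundings here, so the last bit may differ. */
double divide_via_reciprocal(double x, double y) {
    volatile double r = 1.0 / y;
    return x * r;
}
```

The two agree exactly when the reciprocal is representable (for example y = 4.0) and otherwise may differ in the last bit or so. The rewrite pays off mostly when the same divisor is reused, since the reciprocal is then computed once.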

Further, it disables signed zero (the generated code assumes signed zero does not exist, even if the target supports it) and rounding math, which among other things enables constant folding at compile time.
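A small sketch of why signed zero gets in the way of otherwise obvious simplifications: under strict rules x + 0.0 cannot be replaced by x, because adding +0.0 to -0.0 gives +0.0 in the default rounding mode. The helper name `add_zero` is made up, and the `volatile` variable keeps the addition from being folded at compile time.

```c
#include <math.h>

/* Adds +0.0 to x. In strict IEEE arithmetic (round-to-nearest),
   (-0.0) + (+0.0) is +0.0, so this is not the identity function. */
double add_zero(double x) {
    volatile double zero = 0.0;
    return x + zero;
}
```

In a strict build `signbit(add_zero(-0.0))` is 0 even though `signbit(-0.0)` is nonzero. Once the compiler may ignore the sign of zero, it is free to return x unchanged.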

Last, it generates code that assumes that no hardware interrupts can happen due to signalling/trapping math (that is, if these cannot be disabled on the target architecture and consequently do happen, they will not be handled).

As you mentioned, it allows optimizations that do not preserve strict IEEE compliance.

An example is rewriting this:

x = x*x*x*x*x*x*x*x;

to

x *= x;
x *= x;
x *= x;

Because floating-point arithmetic is not associative, the ordering and factoring of the operations will affect results due to round-off. Therefore, this optimization is not done under strict FP behavior.
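For anyone who wants to compare the two forms directly, here is a minimal sketch with both written out as functions. The names `pow8_chain` and `pow8_squaring` are made up for illustration. In a strict build the compiler has to keep each one as written.

```c
/* Seven multiplications, evaluated left to right. */
double pow8_chain(double x) {
    return x * x * x * x * x * x * x * x;
}

/* Three multiplications by repeated squaring. */
double pow8_squaring(double x) {
    x *= x;  /* x^2 */
    x *= x;  /* x^4 */
    x *= x;  /* x^8 */
    return x;
}
```

For inputs such as 2.0, where every intermediate product is exact, both return the same value (256.0). For an input like 1.1 the results can differ in the last few bits, which is exactly why the rewrite needs permission to reassociate.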

I haven't checked whether GCC actually performs this particular optimization, but the idea is the same.

Tags: performance, math, floating-point, gcc, fast-math
