Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is this floating-point optimization allowed?

I tried to check out where float loses the ability to exactly represent large integer numbers. So I wrote this little snippet:

int main() {     for (int i=0; ; i++) {         if ((float)i!=i) {             return i;         }     } } 

This code seems to work with all compilers, except clang. Clang generates a simple infinite loop. Godbolt.

Is this allowed? If yes, is it a QoI issue?

like image 403
geza Avatar asked Jul 12 '19 09:07

geza


People also ask

How can we avoid floating point?

1. Do not use floating-point numbers if integers will suffice. As a corollary, do not use floating-point numbers if exact computation is required, rather, scale if a fixed number of decimals will always be used. For example, if the values are dollars do the computations in cents and divide by 100 when done.

When would you use a floating point?

Floating point numbers are used to represent noninteger fractional numbers and are used in most engineering and technical calculations, for example, 3.256, 2.1, and 0.0036. The most commonly used floating point standard is the IEEE standard.

How accurate is floating point?

The data type float has 24 bits of precision. This is equivalent to only about 7 decimal places. (The rest of the 32 bits are used for the sign and size of the number.) The number of places of precision for float is the same no matter what the size of the number.

Why is floating point broken?

The main cause of the error in floating point division is the division algorithms used to calculate the quotient. Most computer systems calculate division using multiplication by an inverse, mainly in Z=X/Y , Z = X * (1/Y) .


2 Answers

Note that the built-in operator != requires its operands to be of the same type, and will achieve that using promotions and conversions if necessary. In other words, your condition is equivalent to:

(float)i != (float)i 

That should never fail, and so the code will eventually overflow i, giving your program Undefined Behaviour. Any behaviour is therefore possible.

To correctly check what you want to check, you should cast the result back to int:

if ((int)(float)i != i) 
like image 96
Angew is no longer proud of SO Avatar answered Sep 23 '22 04:09

Angew is no longer proud of SO


As @Angew pointed out, the != operator needs the same type on both sides. (float)i != i results in promotion of the RHS to float as well, so we have (float)i != (float)i.


g++ also generates an infinite loop, but it doesn't optimize away the work from inside it. You can see it converts int->float with cvtsi2ss and does ucomiss xmm0,xmm0 to compare (float)i with itself. (That was your first clue that your C++ source doesn't mean what you thought it did like @Angew's answer explains.)

x != x is only true when it's "unordered" because x was NaN. (INFINITY compares equal to itself in IEEE math, but NaN doesn't. NAN == NAN is false, NAN != NAN is true).

gcc7.4 and older correctly optimizes your code to jnp as the loop branch (https://godbolt.org/z/fyOhW1) : keep looping as long as the operands to x != x weren't NaN. (gcc8 and later also checks je to a break out of the loop, failing to optimize based on the fact that it will always be true for any non-NaN input). x86 FP compares set PF on unordered.


And BTW, that means clang's optimization is also safe: it just has to CSE (float)i != (implicit conversion to float)i as being the same, and prove that i -> float is never NaN for the possible range of int.

(Although given that this loop will hit signed-overflow UB, it's allowed to emit literally any asm it wants, including a ud2 illegal instruction, or an empty infinite loop regardless of what the loop body actually was.) But ignoring the signed-overflow UB, this optimization is still 100% legal.


GCC fails to optimize away the loop body even with -fwrapv to make signed-integer overflow well-defined (as 2's complement wraparound). https://godbolt.org/z/t9A8t_

Even enabling -fno-trapping-math doesn't help. (GCC's default is unfortunately to enable
-ftrapping-math even though GCC's implementation of it is broken/buggy.) int->float conversion can cause an FP inexact exception (for numbers too large to be represented exactly), so with exceptions possibly unmasked it's reasonable not to optimize away the loop body. (Because converting 16777217 to float could have an observable side-effect if the inexact exception is unmasked.)

But with -O3 -fwrapv -fno-trapping-math, it's 100% missed optimization not to compile this to an empty infinite loop. Without #pragma STDC FENV_ACCESS ON, the state of the sticky flags that record masked FP exceptions is not an observable side-effect of the code. No int->float conversion can result in NaN, so x != x can't be true.


These compilers are all optimizing for C++ implementations that use IEEE 754 single-precision (binary32) float and 32-bit int.

The bugfixed (int)(float)i != i loop would have UB on C++ implementations with narrow 16-bit int and/or wider float, because you'd hit signed-integer overflow UB before reaching the first integer that wasn't exactly representable as a float.

But UB under a different set of implementation-defined choices doesn't have any negative consequences when compiling for an implementation like gcc or clang with the x86-64 System V ABI.


BTW, you could statically calculate the result of this loop from FLT_RADIX and FLT_MANT_DIG, defined in <climits>. Or at least you can in theory, if float actually fits the model of an IEEE float rather than some other kind of real-number representation like a Posit / unum.

I'm not sure how much the ISO C++ standard nails down about float behaviour and whether a format that wasn't based on fixed-width exponent and significand fields would be standards compliant.


In comments:

@geza I would be interested to hear the resulting number!

@nada: it's 16777216

Are you claiming you got this loop to print / return 16777216?

Update: since that comment has been deleted, I think not. Probably the OP is just quoting the float before the first integer that can't be exactly represented as a 32-bit float. https://en.wikipedia.org/wiki/Single-precision_floating-point_format#Precision_limits_on_integer_values i.e. what they were hoping to verify with this buggy code.

The bugfixed version would of course print 16777217, the first integer that's not exactly representable, rather than the value before that.

(All the higher float values are exact integers, but they're multiples of 2, then 4, then 8, etc. for exponent values higher than the significand width. Many higher integer values can be represented, but 1 unit in the last place (of the significand) is greater than 1 so they're not contiguous integers. The largest finite float is just below 2^128, which is too large for even int64_t.)

If any compiler did exit the original loop and print that, it would be a compiler bug.

like image 32
Peter Cordes Avatar answered Sep 22 '22 04:09

Peter Cordes