
Precision of Floats to Ints vs Doubles to Ints, unexpected results

Tags: c++, casting

I am a computer engineering student and do tutoring for the introductory C++ classes at BYU-Idaho, and a student successfully stumped me.

If I write this code:

#include <iostream>
using namespace std;
int main()
{
   float y = .59;
   int x = (int)(y * 100.0);
   cout << x << endl;
   return 0;
}

Result = 58

#include <iostream>
using namespace std;
int main()
{
   double y = .59;
   int x = (int)(y * 100.0);
   cout << x << endl;
   return 0;
}

Result = 59

I told him it was a precision issue: because an int is more precise than a float, information gets lost, and because a double is more precise than a float, it works.

However, I am not sure that what I said is correct. I think it has something to do with the int getting padded with zeros, so that it gets "truncated" when it is cast, but I am not sure.

If any of you guys want to explain what is going on "underneath" all of this I would find it interesting!

njfife asked Dec 20 '22 10:12


2 Answers

The problem is that float isn't precise enough to hold the exact value 0.59. When you store such a value, it is rounded to the nearest value that the binary representation can hold (this already happens at compile time). In your case that is a value slightly less than 0.59 (for other numbers it can be slightly greater than the value you wanted). Multiplying it by 100 gives a value slightly less than 59, and converting that to an integer rounds it toward 0, which leads to 58.

Stored as a float, 0.59 becomes (written out as a human-readable decimal number):

0.589999973773956298828125

Now to the double type. It has essentially the same problem, but there are two possible reasons why you get the expected result: either double can hold the exact value you want (this is not the case for 0.59, but it is for some other values), or the stored value happens to be rounded up. In both cases, multiplying by 100 gives a value that is not less than 59, which is then rounded toward 0 to 59, as expected.

Note, however, that 0.59 as a double might still be rounded down by the compiler. Indeed, I just checked, and it is. 0.59 as a double is stored as:

0.58999999999999996891375531049561686813831329345703

However, you multiply this value by 100 before converting it to an integer, and this is where it gets interesting: the multiplication cancels the error the compiler introduced when storing 0.59, because the exact product cannot be stored in a double either. In fact, the processor calculates 0.58999999999999996891375531049561686813831329345703 * 100.0, and the result is rounded to the nearest double, which is exactly 59, a number that can be represented in double!

See this code for details: http://ideone.com/V0essb

Now you might wonder why the same doesn't hold for float, which should behave the same way, just with a different precision. The difference is that 0.589999973773956298828125 * 100.0 is not rounded up to 59 (which can also be represented in a float). Because 100.0 is a double literal, y is converted to double before the multiplication, and the product, about 58.9999974, is much farther below 59 than the spacing between neighboring doubles, so rounding the result cannot reach 59. More generally, the C++ standard does not fully define the rounding behavior after calculations.

Indeed, the C++ standard doesn't exactly specify operations on floating-point numbers, meaning that you can encounter different results on different machines. This makes it possible to implement performance tweaks that lead to slightly incorrect results, even when rounding isn't involved! On one machine you might end up with the expected result, while on another you might not.

leemes answered Dec 24 '22 02:12


0.59 is not exactly representable in binary floating-point. So y will actually hold a value very slightly above or below 0.59; which of the two may depend on whether you use float or double. This in turn determines whether your program prints 58 or 59.

Oliver Charlesworth answered Dec 24 '22 01:12