
Regarding float type precision

I can't understand why this

float f = Integer.MAX_VALUE;
System.out.println(Integer.MAX_VALUE);
System.out.println((int)f);

prints the same value on both lines,

or why this does as well:

Float f2 = (float) Integer.MAX_VALUE;
System.out.println(Integer.MAX_VALUE);
System.out.println(f2.intValue());

I mean, a float's mantissa is only 23 bits long, so the largest mantissa value is 2^23 - 1. How does it manage to keep max_value of integer, which is 2^31 - 1?

dhblah asked May 15 '13 06:05




1 Answer

How does it manage to keep max_value of integer, which is 2^31 - 1?

It actually doesn't. The value of f is 2147483648.
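One way to check this is to print the exact value the float holds. The following is a minimal sketch; the class name FloatRounding and the helper exactValue are made up for illustration, while BigDecimal and Math.ulp are standard library calls. Near 2^31 adjacent floats are 128 or 256 apart, so 2^31 - 1 cannot be stored and is rounded to the nearest float, which is 2^31.

```java
import java.math.BigDecimal;

// Hypothetical demo class, not part of the original question or answer.
public class FloatRounding {
    // Returns the exact decimal value of the float nearest to the given int.
    static String exactValue(int i) {
        float f = i; // int -> float conversion rounds to the nearest float
        return new BigDecimal(f).toPlainString();
    }

    public static void main(String[] args) {
        // The stored value is 2^31, one more than Integer.MAX_VALUE.
        System.out.println(exactValue(Integer.MAX_VALUE)); // 2147483648
        // Distance from 2^31 to the next larger float.
        System.out.println(Math.ulp((float) Integer.MAX_VALUE)); // 256.0
    }
}
```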

However, the narrowing primitive conversion from float to int clamps the value. The relevant part of the Java Language Specification (section 5.1.3) is:

  • Otherwise, one of the following two cases must be true:

    • The value must be too small (a negative value of large magnitude or negative infinity), and the result of the first step is the smallest representable value of type int or long.

    • The value must be too large (a positive value of large magnitude or positive infinity), and the result of the first step is the largest representable value of type int or long.
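The two cases quoted above can be observed directly. This is a small sketch with a hypothetical class NarrowingDemo and helper toInt; it also includes NaN and an in-range value for contrast, which the same section of the specification covers (NaN becomes 0, in-range values are rounded toward zero).

```java
// Hypothetical demo class, not part of the original answer.
public class NarrowingDemo {
    // Plain narrowing primitive conversion from float to int.
    static int toInt(float f) {
        return (int) f;
    }

    public static void main(String[] args) {
        System.out.println(toInt(Float.POSITIVE_INFINITY)); // 2147483647
        System.out.println(toInt(1e20f));                   // 2147483647
        System.out.println(toInt(Float.NEGATIVE_INFINITY)); // -2147483648
        System.out.println(toInt(-1e20f));                  // -2147483648
        System.out.println(toInt(Float.NaN));               // 0
        System.out.println(toInt(2.9f));                    // 2
    }
}
```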

You can see this easily by making the number even bigger:

float f = Integer.MAX_VALUE;
f = f * 1000;
System.out.println(Integer.MAX_VALUE); // 2147483647
System.out.println((int)f); // 2147483647

Or by casting to long instead, which obviously doesn't need to be clamped at the same point:

float f = Integer.MAX_VALUE;
System.out.println(Integer.MAX_VALUE); // 2147483647
System.out.println((long)f); // 2147483648
Jon Skeet answered Sep 20 '22 10:09