
Why is there a loss of value when converting from int to float in the code below?

int value1 = 123456789;
float value2 = value1;
System.out.println(value1);
System.out.println(value2);

Output:

123456789
1.23456792E8

pramod kumar asked Aug 09 '15 10:08



1 Answer

The float type uses the same number of bits as int (32 bits), but it has to cover a far larger range of values, including non-integers, so it cannot represent every int exactly.

This causes a loss of precision, since not every int can be represented exactly by a float. Of the 32 bits, 1 is the sign bit, 8 hold the exponent, and only 23 hold the fraction. Together with the implicit leading 1, that gives 24 significant bits, so any int whose binary form spans more than 24 bits from its leading 1 to its last 1 has to be rounded.
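To make the 24-bit limit concrete, here is a minimal sketch that searches for the smallest positive int that does not survive a round trip through float. The class and method names (FirstInexactInt, isExactAsFloat, firstInexact) are made up for this illustration.

```java
public class FirstInexactInt {
    // True if converting v to float and back yields the same value.
    // The comparison is done in long so that the saturating float-to-int
    // cast cannot hide a difference near Integer.MAX_VALUE.
    static boolean isExactAsFloat(int v) {
        return (long) (float) v == (long) v;
    }

    // Scans upward from 1 and returns the first int that float cannot hold exactly.
    static int firstInexact() {
        for (int i = 1; i > 0; i++) {
            if (!isExactAsFloat(i)) {
                return i;
            }
        }
        return -1; // not reached in practice
    }

    public static void main(String[] args) {
        System.out.println(firstInexact());   // 16777217, which is 2^24 + 1
        System.out.println(1 << 24);          // 16777216
    }
}
```

Every int up to 2^24 fits in the 24 significant bits; 2^24 + 1 is the first one that needs a 25th bit.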

If you assign this int value to a double instead, there is no loss of precision: double has 64 bits, 52 of which hold the fraction, which is more than enough for any 32-bit int.
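A quick way to check this is to round-trip the value through both types. This is only a sketch; the class name IntToDoubleDemo and its helper methods are invented for the example.

```java
public class IntToDoubleDemo {
    // True if the int comes back unchanged after being stored in a float.
    static boolean exactAsFloat(int v) {
        return (long) (float) v == (long) v;
    }

    // True if the int comes back unchanged after being stored in a double.
    static boolean exactAsDouble(int v) {
        return (long) (double) v == (long) v;
    }

    public static void main(String[] args) {
        int value1 = 123456789;
        double asDouble = value1;

        System.out.println(exactAsFloat(value1));   // false
        System.out.println(exactAsDouble(value1));  // true
        System.out.println(asDouble);               // 1.23456789E8
    }
}
```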

Here's a more detailed explanation:

The binary representation of 123456789 as an int is :

00000111 01011011 11001101 00010101

A single precision floating point number is constructed from its 32 bits using the following formula :

(-1)^sign * 1.b22 b21 ... b0 * 2^(e-127)

Here sign is the leftmost bit (b31), bits b30 to b23 form the exponent e, and b22 to b0 are the fraction bits. The leading 1 in front of the fraction is implicit and is not stored.
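The three fields can be pulled out of a float with Float.floatToIntBits and plugged back into the formula. The sketch below does that for the converted value; the class FloatBitsDemo and its methods are hypothetical names, and rebuild only handles normalized values (no zero, subnormals, infinity or NaN).

```java
public class FloatBitsDemo {
    // Bit b31.
    static int signBit(float f) {
        return Float.floatToIntBits(f) >>> 31;
    }

    // Bits b30..b23, the biased exponent e.
    static int exponentField(float f) {
        return (Float.floatToIntBits(f) >>> 23) & 0xFF;
    }

    // Bits b22..b0, the stored fraction.
    static int fractionField(float f) {
        return Float.floatToIntBits(f) & 0x7FFFFF;
    }

    // Applies (-1)^sign * 1.fraction * 2^(e-127) for a normalized float.
    static double rebuild(float f) {
        double significand = 1.0 + fractionField(f) / (double) (1 << 23);
        double magnitude = significand * Math.pow(2, exponentField(f) - 127);
        return signBit(f) == 0 ? magnitude : -magnitude;
    }

    public static void main(String[] args) {
        float f = 123456789; // implicit int-to-float conversion

        System.out.println(signBit(f));        // 0
        System.out.println(exponentField(f));  // 153, i.e. 2^(153-127) = 2^26
        System.out.println(fractionField(f));  // 7043491
        System.out.println(rebuild(f));        // 1.23456792E8
    }
}
```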

Therefore, when you convert the int 123456789 to float, you can keep only the following 25 bits (the sign bit plus the 24 most significant bits of the value):

00000111 01011011 11001101 00010101
-    --- -------- -------- -----

We can safely ignore any leading zeroes (except for the sign bit) and any trailing zeroes. That leaves the 3 least significant bits (101, which is 5), which do not fit and must be dropped. We can either subtract 5 to get 123456784:

00000111 01011011 11001101 00010000
-    --- -------- -------- -----

or add 3 to get 123456792:

00000111 01011011 11001101 00011000
-    --- -------- -------- -----

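Adding 3 gives the closer approximation (an error of 3 rather than 5), and that is the value the conversion produces, since int-to-float conversion rounds to the nearest representable value.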
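The two candidates and the actual result can be compared in a few lines. This is a sketch for non-negative values only; NearestFloatDemo, roundDown, roundUp and viaFloat are names made up for the example, and the count of 3 dropped bits is specific to this number.

```java
public class NearestFloatDemo {
    // Clears the lowest droppedBits bits (rounds toward zero for non-negative v).
    static int roundDown(int v, int droppedBits) {
        return v & -(1 << droppedBits);
    }

    // The next representable candidate above roundDown.
    static int roundUp(int v, int droppedBits) {
        return roundDown(v, droppedBits) + (1 << droppedBits);
    }

    // What the int-to-float conversion actually produces, read back as a long.
    static long viaFloat(int v) {
        return (long) (float) v;
    }

    public static void main(String[] args) {
        int value1 = 123456789;
        int droppedBits = 3; // 27 significant bits minus the 24 a float can hold

        System.out.println(roundDown(value1, droppedBits)); // 123456784 (error 5)
        System.out.println(roundUp(value1, droppedBits));   // 123456792 (error 3)
        System.out.println(viaFloat(value1));               // 123456792
        System.out.println(Math.ulp((float) value1));       // 8.0, the gap between neighbouring floats here
    }
}
```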

Eran answered Oct 09 '22 10:10
