I was trying to understand the floating point representation in C using this code (both float and int are 4 bytes on my machine):
int x = 3;
float y = *(float*) &x;
printf("%d %e \n", x, y);
We know that the binary representation of x will be the following
00000000000000000000000000000011
Therefore I would have expected y to be represented as follows
Sign bit (first bit from left) = 0
Exponent (bits 2-9 from left) = 0
Mantissa (bits 10-32): 1 + 2^(-22)+2^(-23)
Leading to y = (-1)^0 * 2^(0-127) * (1+2^(-22) + 2^(-23)) = 5.87747E-39
My program however prints out
3 4.203895e-45
That is, y has the value 4.203895e-45 instead of 5.87747E-39 as I expected. Why does this happen? What am I doing wrong?
P.S. I have also printed the values directly from gdb so it is not a problem with the printf command.
Floating-point representation is similar in concept to scientific notation. Logically, a floating-point number consists of: A signed (meaning positive or negative) digit string of a given length in a given base (or radix). This digit string is referred to as the significand, mantissa, or coefficient.
You can define a variable as a float and assign a value to it in a single declaration. For example: float age = 10.5; In this example, the variable named age would be defined as a float and assigned the value of 10.5.
Scalars of type float are stored using four bytes (32-bits). The format used follows the IEEE-754 standard. The mantissa represents the actual binary digits of the floating-point number. The power of two is represented by the exponent.
'%f': Print a floating-point number in normal (fixed-point) notation. See Floating-Point Conversions, for details.
IEEE floating-point numbers whose exponent field is all zeros are 'denormalized'. This means that the implicit 1 in front of the mantissa is no longer active, which allows really small numbers to be represented. See the Wikipedia article on denormal numbers for more explanation. In your example the result is 3 * 2^-149.
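An exponent field of 0 (which the usual bias would turn into -127) is reserved for denormalised numbers. Your calculation is for normalised numbers, while your float is a denormalised float.
Denormalised numbers are calculated using a similar method, but the exponent is fixed at -126 and the implicit leading 1 is dropped.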
So this means the calculation is instead:
(-1)**0*2**(-126)*(2**(-22)+2**(-23)) = 4.2038953929744512e-45
The above is Python, where ** means the same as ^.
The details are described at http://en.wikipedia.org/wiki/IEEE_754-2008. The standard assumes that you shift the mantissa left until the first significant bit is hidden (adjusting the exponent accordingly). In your case you have the expression 1+2^(-23), and then you get the correct answer 4.9..E-32