How would I go about manually changing a decimal (base 10) number into IEEE 754 single-precision floating-point format? I understand that there are three parts to it: a sign, an exponent, and a mantissa. I just don't completely understand what the last two parts actually represent.
To convert a decimal number to binary floating-point representation:

1. Convert the absolute value of the decimal number to a binary integer plus a binary fraction.
2. Normalize the number in binary scientific notation, i.e. write it as 1.xxx... × 2^e, to obtain the mantissa m and the exponent e.
3. Set s = 0 for a positive number and s = 1 for a negative number.
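Here is a minimal C sketch of these manual steps for normal numbers; the function name decompose is my own, and zero, subnormals, infinities, and NaN are deliberately left out. Note that extracting the fraction bits this way truncates rather than rounds to nearest.

#include <stdio.h>

/* Decompose a value into sign, exponent, and 23-bit mantissa by
   repeatedly scaling into [1, 2), mirroring the manual steps above.
   Normal numbers only; truncates instead of rounding to nearest. */
static void decompose(double x) {
    int s = 0;
    if (x < 0) { s = 1; x = -x; }
    int e = 0;
    while (x >= 2.0) { x /= 2.0; e++; }   /* scale the exponent up...    */
    while (x < 1.0)  { x *= 2.0; e--; }   /* ...or down, until 1 <= x < 2 */
    unsigned m = 0;
    double frac = x - 1.0;                /* drop the implicit leading 1 */
    for (int i = 0; i < 23; i++) {        /* pull out 23 fraction bits   */
        frac *= 2.0;
        m <<= 1;
        if (frac >= 1.0) { m |= 1; frac -= 1.0; }
    }
    printf("s=%d  e=%d (biased %d)  m=0x%06x\n", s, e, e + 127, m);
}

int main(void) {
    decompose(10.0);   /* expect s=0, e=3 (biased 130), m=0x200000 */
    return 0;
}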
All integers with 7 or fewer decimal digits, and any 2^n for a whole number −149 ≤ n ≤ 127, can be converted exactly into an IEEE 754 single-precision floating-point value. In the IEEE 754-2008 standard, the 32-bit base-2 format is officially referred to as binary32; it was called single in IEEE 754-1985.
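A quick check of those endpoints, assuming a C99 environment whose float is IEEE 754 binary32 (the usual case):

#include <stdio.h>
#include <math.h>

int main(void) {
    float tiny  = ldexpf(1.0f, -149);  /* 2^-149, smallest positive value      */
    float large = ldexpf(1.0f, 127);   /* 2^127, largest power of two          */
    float seven = 9999999.0f;          /* largest 7-digit integer, stored exactly */
    printf("%g  %g  %.1f\n", tiny, large, seven);
    return 0;
}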
Find the largest power of 2 which is smaller than your number; e.g. if you start with x = 10.0, then 2^3 = 8, so the exponent is 3. The exponent is biased by 127, so it will be stored as 127 + 3 = 130. The mantissa is then 10.0/8 = 1.25. The leading 1 is implicit, so we only need to represent 0.25, which is 010 0000 0000 0000 0000 0000 when expressed as a 23-bit unsigned fractional quantity. The sign bit is 0 for positive. So we have:
s |  exp [130] | mantissa [(1).25]
0 | 100 0001 0 | 010 0000 0000 0000 0000 0000

Putting the bits together gives 0100 0001 0010 0000 0000 0000 0000 0000, i.e. 0x41200000.
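To double-check the arithmetic, the three fields can be assembled into a 32-bit word directly; this is a small sketch of my own, not part of the original answer:

#include <stdio.h>

int main(void) {
    unsigned s = 0;          /* sign bit                    */
    unsigned e = 130;        /* biased exponent, 127 + 3    */
    unsigned m = 0x200000;   /* fraction bits encoding .25  */
    unsigned bits = (s << 31) | (e << 23) | m;
    printf("%#010x\n", bits);   /* prints 0x41200000 */
    return 0;
}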
You can test the representation with a simple C program, e.g.
#include <stdio.h>

/* A union lets us view the same 32 bits as either a float or an integer. */
typedef union {
    unsigned int i;   /* unsigned, to match the %x conversion below */
    float f;
} U;

int main(void) {
    U u;
    u.f = 10.0f;
    printf("%g = %#x\n", u.f, u.i);
    return 0;
}
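Compiled and run, this prints 10 = 0x41200000, matching the hand-derived encoding above. Type punning through a union like this is well-defined in C; in C++ you would copy the bytes with memcpy instead.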