
How IEEE-754 floating-point numbers work

Let's say I have this:

float i = 1.5f;

in binary, this float is represented as:

0 01111111 10000000000000000000000

I broke up the binary to show the 'sign', 'exponent' and 'fraction' chunks.

What I don't understand is how this represents 1.5.

The exponent is 0 once you subtract the bias (127 - 127), and the fraction part with the implicit leading one is 1.1.

How does 1.1, scaled by nothing, equal 1.5?

Tony Stark asked Apr 25 '10 01:04


2 Answers

Think first in terms of decimal (base 10): 643.72 is:

  • (6 * 10^2) +
  • (4 * 10^1) +
  • (3 * 10^0) +
  • (7 * 10^-1) +
  • (2 * 10^-2)

or 600 + 40 + 3 + 7/10 + 2/100.

That's because n^0 is always 1, n^-1 is the same as 1/n (for a specific case) and n^-m is identical to 1/n^m (for the more general case).

Similarly, the binary number 1.1 is:

  • (1 * 2^0) +
  • (1 * 2^-1)

with 2^0 being one and 2^-1 being one-half.

In decimal, the numbers to the left of the decimal point have multipliers 1, 10, 100 and so on heading left from the decimal point, and 1/10, 1/100, 1/1000 heading right (i.e., 10^2, 10^1, 10^0, decimal point, 10^-1, 10^-2, ...).

In base-2, the numbers to the left of the binary point have multipliers 1, 2, 4, 8, 16 and so on heading left. The numbers to the right have multipliers 1/2, 1/4, 1/8 and so on heading right.

So, for example, the binary number:

101.00101
| |   | |
| |   | +- 1/32
| |   +---  1/8
| +-------    1
+---------    4

is equivalent to:

4 + 1 + 1/8 + 1/32

or:

5 5/32 (that is, 5.15625)
paxdiablo answered Dec 17 '22 11:12



1.1 in binary is 1 + .5 = 1.5

pavpanchekha answered Dec 17 '22 11:12