
What range of numbers can be represented in 16-, 32- and 64-bit IEEE-754 systems?

I know a little bit about how floating-point numbers are represented, but not enough, I'm afraid.

The general question is:

For a given precision (for my purposes, the number of accurate decimal places in base 10), what range of numbers can be represented for 16-, 32- and 64-bit IEEE-754 systems?

Specifically, I'm only interested in the range of 16-bit and 32-bit numbers accurate to +/-0.5 (the ones place) or +/-0.0005 (the thousandths place).

Nate Parsons asked May 16 '09 14:05



1 Answer

For a given IEEE-754 floating point number X, if

2^E <= abs(X) < 2^(E+1) 

then the distance from X to the next larger representable floating point number (epsilon) is:

epsilon = 2^(E-52)    % For a 64-bit float (double precision)
epsilon = 2^(E-23)    % For a 32-bit float (single precision)
epsilon = 2^(E-10)    % For a 16-bit float (half precision)
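
As a rough illustration of the formulas above, here is a minimal Python sketch that finds E for a given value and returns the corresponding spacing. The names `spacing` and `FRACTION_BITS` are made up for this example, and it assumes X is a nonzero value in the normal range (zero, subnormals, infinities and NaN are not handled).

```python
import math
import sys

# Exponent offsets from the formulas above (fraction bits per format).
FRACTION_BITS = {"half": 10, "single": 23, "double": 52}

def spacing(x, fmt="double"):
    """Gap between adjacent representable values near x (nonzero, normal range)."""
    # frexp gives abs(x) = m * 2**e with 0.5 <= m < 1,
    # so the E satisfying 2**E <= abs(x) < 2**(E+1) is e - 1.
    _, e = math.frexp(abs(x))
    return 2.0 ** ((e - 1) - FRACTION_BITS[fmt])

# For X = 1.0 in double precision this matches the machine epsilon.
print(spacing(1.0, "double") == sys.float_info.epsilon)
print(spacing(1000.0, "single"))  # 1000 lies in [2^9, 2^10), so this is 2^-14
print(spacing(1000.0, "half"))    # same E, so this is 2^-1
```

The calculation is done in Python's own double-precision floats, so it only evaluates the formula; it does not actually store anything in a 16-bit or 32-bit float.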

The above equations allow us to compute the following:

  • For half precision...

    If you want an accuracy of +/-0.5 (or 2^-1), the maximum size that the number can be is 2^10. Any larger than this and the distance between floating point numbers is greater than 0.5.

    If you want an accuracy of +/-0.0005 (about 2^-11), the maximum size that the number can be is 1. Any larger than this and the distance between floating point numbers is greater than 0.0005.

  • For single precision...

    If you want an accuracy of +/-0.5 (or 2^-1), the maximum size that the number can be is 2^23. Any larger than this and the distance between floating point numbers is greater than 0.5.

    If you want an accuracy of +/-0.0005 (about 2^-11), the maximum size that the number can be is 2^13. Any larger than this and the distance between floating point numbers is greater than 0.0005.

  • For double precision...

    If you want an accuracy of +/-0.5 (or 2^-1), the maximum size that the number can be is 2^52. Any larger than this and the distance between floating point numbers is greater than 0.5.

    If you want an accuracy of +/-0.0005 (about 2^-11), the maximum size that the number can be is 2^42. Any larger than this and the distance between floating point numbers is greater than 0.0005.
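
The six limits above can be reproduced with a short Python sketch. This is only an illustration under the same assumptions as the formulas (normal numbers, spacing of 2^(E-p) with p = 10, 23 or 52); `limit_exponent` and `FRACTION_BITS` are names invented for this example.

```python
import math

# Exponent offsets from the epsilon formulas (fraction bits per format).
FRACTION_BITS = {"half": 10, "single": 23, "double": 52}

def limit_exponent(tolerance, fmt):
    """Return n such that the spacing stays <= tolerance while abs(X) < 2**n."""
    # frexp gives tolerance = m * 2**e with 0.5 <= m < 1,
    # so the largest k with 2**k <= tolerance is e - 1.
    _, e = math.frexp(tolerance)
    k = e - 1
    # spacing = 2**(E - p) <= 2**k  means  E <= k + p,
    # and 2**E <= abs(X) < 2**(E+1) then gives abs(X) < 2**(k + p + 1).
    return k + FRACTION_BITS[fmt] + 1

for fmt in ("half", "single", "double"):
    for tol in (0.5, 0.0005):
        print(f"{fmt:6s} tolerance {tol:<6}: abs(X) < 2^{limit_exponent(tol, fmt)}")
```

Note that 0.0005 is rounded down to the nearest power of two (2^-11) here, which is the same approximation used in the list above.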

gnovice answered Sep 17 '22 17:09