Floating point representations seem to do integer arithmetic correctly - why?

I've been playing around with floating point numbers a little bit, and based on what I've learned about them in the past, the fact that 0.1 + 0.2 ends up being something like 0.30000000000000004 doesn't surprise me.

What does surprise me, however, is that integer arithmetic always seems to work just fine and not have any of these artifacts.

I first noticed this in JavaScript (Chrome V8 in node.js):

0.1 + 0.2 == 0.3 // false, NOT surprising
123456789012 + 18 == 123456789030  // true
22334455667788 + 998877665544 == 23333333333332 // true
1048576 / 1024 == 1024  // true

C++ (gcc on Mac OS X) seems to have the same properties.

The net result seems to be that integer numbers just — for lack of a better word — work. It's only when I start using decimal numbers that things get wonky.

Is this a feature of the design, a mathematical artifact, or some optimisation done by compilers and runtime environments?

asked Dec 16 '22 by MarcWan


2 Answers

Is this a feature of the design, a mathematical artifact, or some optimisation done by compilers and runtime environments?

It's a feature of the real numbers. A theorem from modern algebra (modern algebra, not high school algebra; math majors take a class in modern algebra after their basic calculus and linear algebra classes) says that for any integer b greater than 1, any positive real number r can be expressed as r = a * b^p, where a is in [1,b) and p is some integer. For example, in base 10, 1024 = 1.024 * 10^3. It is this theorem that justifies our use of scientific notation.
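
A quick sketch of that decomposition in JavaScript (my own illustration, not part of the original answer; decompose is just a made-up helper name):

// Split a positive real r into r = a * b^p with a in [1, b).
// Note: computing p from logarithms can be off by one right at exact
// powers of b because of rounding, so this is illustrative only.
function decompose(r, b) {
  const p = Math.floor(Math.log(r) / Math.log(b)); // integer exponent
  const a = r / Math.pow(b, p);                    // mantissa in [1, b)
  return { a, p };
}
decompose(1024, 10)  // { a: 1.024, p: 3 }  i.e. 1024 = 1.024 * 10^3
decompose(500, 10)   // { a: 5, p: 2 }      i.e. 500 = 5 * 10^2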

That number a can be classified as terminal (e.g. 1.0), repeating (1/3 = 0.333...), or non-repeating (the representation of pi). There's a minor issue here with terminal numbers. Any terminal number can also be represented as a repeating number. For example, 0.999... and 1 are the same number. This ambiguity in representation can be resolved by specifying that numbers that can be represented as terminal numbers are represented as such.

What you have discovered is a consequence of the fact that all integers have a terminal representation in any base.
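
One way to see this directly from JavaScript (my own illustration, not from the answer) is to ask a few numbers for their binary expansion with toString(2):

(10).toString(2)   // "1010"  -- an integer terminates in base 2
(0.5).toString(2)  // "0.1"   -- 1/2 terminates too, since its denominator is a power of 2
(0.1).toString(2)  // a long run of repeating "0011" groups, chopped off where
                   // the 53-bit mantissa of the stored double runs out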

There is an issue here with how the reals are represented in a computer. Just as int and long long int don't represent all of the integers, float and double don't represent all of the reals. The scheme used on most computers to represent a real number r is to represent it in the form r = a * 2^p, but with the mantissa (or significand) a truncated to a certain number of bits and the exponent p limited to some finite range. What this means is that some integers cannot be represented exactly. For example, even though a googol (10^100) is an integer, its floating-point representation is not exact. The base 2 representation of a googol is a 333 bit number, but the mantissa is truncated to 52+1 bits.
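
The googol case can be checked concretely (a sketch of mine using BigInt, not part of the original answer):

const asDouble = BigInt(1e100);  // the exact integer the double 1e100 actually stores
const exactGoogol = 10n ** 100n; // a true googol, computed with arbitrary precision
asDouble == exactGoogol          // false -- the double holds only a nearby integer
asDouble - exactGoogol           // a huge nonzero integer: the rounding error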

One consequence of this is that double-precision arithmetic is no longer exact, even for integers, once the integers in question are greater than 2^53. Try your experiment using the type unsigned long long int on values between 2^53 and 2^64. You'll find that double-precision arithmetic is no longer exact for these large integers.
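
The same experiment can be run directly in JavaScript (my sketch; the answer itself suggests C++ and unsigned long long int), since JavaScript numbers are doubles:

2 ** 53 - 1 == 2 ** 53 - 2  // false -- below 2^53 every integer is still distinct
2 ** 53 + 1 == 2 ** 53      // true  -- 2^53 + 1 has no exact double and rounds back to 2^53
Number.MAX_SAFE_INTEGER     // 9007199254740991, i.e. 2^53 - 1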

answered Dec 18 '22 by David Hammen


I'm writing this under the assumption that JavaScript uses double-precision floating-point representation for all numbers.

Some numbers have an exact representation in the floating-point format, in particular all integers x with |x| < 2^53. Some numbers don't, in particular fractions such as 0.1 or 0.2, which become infinite repeating fractions in binary representation.

If all operands and the result of an operation have an exact representation, then it would be safe to compare the result using ==.
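
For instance (my own examples, in the spirit of those in the question):

123456789012 + 18 == 123456789030  // true  -- every value involved is an exact integer below 2^53
0.25 + 0.5 == 0.75                 // true  -- 1/4, 1/2 and 3/4 are all exact binary fractions
0.1 + 0.2 == 0.3                   // false -- none of the three has an exact binary representation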

Related questions:

What number in binary can only be represented as an approximation?

Why can't decimal numbers be represented exactly in binary?

answered Dec 18 '22 by Andrey