Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Unexpected results when working with very big integers on interpreted languages

Python works:

>>> sum(x for x in xrange(1000000000 + 1))
500000000500000000

Or:

>>> sum(xrange(1000000000+1))
500000000500000000

Python's int auto promotes to a Python long which supports arbitrary precision. It will produce the correct answer on 32 or 64 bit platforms.

This can be seen by raising 2 to a power far greater than the bit width of the platform:

>>> 2**99
633825300114114700748351602688L

You can demonstrate (with Python) that the erroneous values you are getting in PHP is because PHP is promoting to a float when the values are greater than 2**32-1:

>>> int(sum(float(x) for x in xrange(1000000000+1)))
500000000067108992

Your Go code uses integer arithmetic with enough bits to give an exact answer. Never touched PHP or Node.js, but from the results I suspect the math is done using floating point numbers and should be thus expected not to be exact for numbers of this magnitude.


The reason is that the value of your integer variable sum exceeds the maximum value. And the sum you get is result of float-point arithmetic which involves rounding off. Since other answers did not mention the exact limits, I decided to post it.

The max integer value for PHP for:

  • 32-bit version is 2147483647
  • 64-bit version is 9223372036854775807

So it means either you are using 32 bit CPU or 32 bit OS or 32 bit compiled version of PHP. It can be found using PHP_INT_MAX. The sum would be calculated correctly if you do it on a 64 bit machine.

The max integer value in JavaScript is 9007199254740992. The largest exact integral value you can work with is 253 (taken from this question). The sum exceeds this limit.

If the integer value does not exceed these limits, then you are good. Otherwise you will have to look for arbitrary precision integer libraries.


Here is the answer in C, for completeness:

#include <stdio.h>

int main(void)
{
    unsigned long long sum = 0, i;

    for (i = 0; i <= 1000000000; i++)    //one billion
        sum += i;

    printf("%llu\n", sum);  //500000000500000000

    return 0;
}

The key in this case is using C99's long long data type. It provides the biggest primitive storage C can manage and it runs really, really fast. The long long type will also work on most any 32 or 64-bit machine.

There is one caveat: compilers provided by Microsoft explicitly do not support the 14 year-old C99 standard, so getting this to run in Visual Studio is a crapshot.


My guess is that when the sum exceeds the capacity of a native int (231-1 = 2,147,483,647), Node.js and PHP switch to a floating point representation and you start getting round-off errors. A language like Go will probably try to stick with an integer form (e.g., 64-bit integers) as long as possible (if, indeed, it didn't start with that). Since the answer fits in a 64-bit integer, the computation is exact.