 

Vectorization - negative result

I'm working with NumPy arrays in this small example:

import numpy as np
import time

start = time.time()
print('sum is: ', np.sum(np.arange(1500000)))
end = time.time()
print("duration: ", end - start)

#sum is:  -282181552
#duration: 0.0041615962982177734

As you can see, I always get a negative result when summing the integers from 0 to 1,499,999 (that is what np.arange(1500000) produces). However, when I use a for loop, I get the correct result:

start = time.time()
total = 0
for item in range(0, 1500000):
    total = total + item
print('sum is: ' + str(total))
end = time.time()
print("duration: ", end - start)

#sum is: 1124999250000
#duration:  0.09384274482727051

Any idea why?

Mimo asked Jun 04 '26 22:06


1 Answer

You get the wrong result due to an integer overflow.

By default, np.arange() creates arrays of dtype int32 on some platforms, notably 32-bit systems and Windows with NumPy versions before 2.0. When you add up numbers of dtype int32, you're bound by the limits of signed 32-bit integer arithmetic. Once the sum exceeds 2^31 - 1 (2147483647), it wraps around and can come out negative.
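Here is a minimal sketch that reproduces the wraparound on any platform. It assumes you force both the array and the accumulator to int32, because on platforms where the default integer is 64-bit, np.sum would otherwise accumulate int32 input in int64. The names exact, wrapped and by_hand are only for illustration.

```python
import numpy as np

n = 1_500_000

# Exact sum of 0..n-1 using Python's arbitrary-precision integers.
exact = n * (n - 1) // 2

# Largest value a signed 32-bit integer can hold: 2**31 - 1.
int32_max = np.iinfo(np.int32).max

# Force a 32-bit array AND a 32-bit accumulator so the overflow
# happens regardless of the platform's default integer type.
wrapped = int(np.arange(n, dtype=np.int32).sum(dtype=np.int32))

# The same wraparound computed by hand: reduce the exact value
# into the signed 32-bit range [-2**31, 2**31 - 1].
by_hand = (exact + 2**31) % 2**32 - 2**31

print("exact sum:  ", exact)
print("int32 max:  ", int32_max)
print("int32 sum:  ", wrapped)
print("by hand:    ", by_hand)
```

The int32 sum and the hand-computed value agree, and both match the negative number from the question, which shows that the result is the true sum reduced modulo 2^32.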

The pure Python for loop doesn't have this problem because Python integers are of arbitrary precision (they don't overflow).

To fix the issue with the numpy version, you can specify a different dtype for the array:

print('sum is: ', np.sum(np.arange(1500000, dtype=np.int64)))

This uses 64-bit integers, whose maximum value (2^63 - 1) is far larger than this sum.
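If you'd rather not change the array itself, another option is to leave the array as it is and ask the sum to accumulate in 64 bits through its dtype argument. This is only a sketch; the variable names arr and total are illustrative, and the dtype printed for arr depends on your platform and NumPy version.

```python
import numpy as np

arr = np.arange(1_500_000)

# Platform dependent: check which integer type you actually got.
print("array dtype:", arr.dtype)

# Accumulate in 64-bit integers even if arr itself is int32.
total = int(arr.sum(dtype=np.int64))
print("sum is:", total)

# Sanity check against the closed form n * (n - 1) / 2.
n = arr.size
print("matches closed form:", total == n * (n - 1) // 2)
```

Printing arr.dtype first is a quick way to tell whether your setup is affected at all.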

milos answered Jun 07 '26 12:06


