NumPy's sum function returns the expected result, but Python's built-in sum does not, at least for the uint8 dtype, which makes it even more confusing:
In [1]: import numpy as np
In [2]: x = np.random.randint(2, size = (1000,100))
In [3]: x
Out[3]:
array([[1, 1, 0, ..., 0, 1, 1],
       [1, 1, 1, ..., 0, 0, 0],
       [1, 1, 0, ..., 1, 0, 1],
       ...,
       [1, 0, 0, ..., 1, 0, 1],
       [0, 0, 1, ..., 0, 1, 1],
       [1, 1, 0, ..., 1, 1, 1]])
In [4]: np.sum(x)
Out[4]: 50318
In [5]: sum(sum(x))
Out[5]: 50318
In [6]: x = x.astype('uint8')
In [7]: np.sum(x)
Out[7]: 50318
In [8]: sum(sum(x))
Out[8]: 16014
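By specifying uint8, you are telling NumPy to use 8 bits per element. The largest value that fits in 8 unsigned bits is 255. Python's built-in sum adds the rows with +, so every intermediate result stays uint8 and wraps around once it exceeds 255. np.sum returns the expected total because, by default, it accumulates integer dtypes narrower than the platform integer in a wider integer type.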
Practical example:
>>> arr = np.array([[255],[1]],dtype=np.uint8)
>>> arr
array([[255],
       [  1]], dtype=uint8)
>>> sum(arr)
array([0], dtype=uint8)
>>> arr[0] + arr[1]
array([0], dtype=uint8)
Note that sum(arr) corresponds to arr[0] + arr[1] in this case. As stated in the docs:
Arithmetic is modular when using integer types, and no error is raised on overflow.
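A minimal sketch of the wraparound and two ways to avoid it. It uses a deterministic all-ones array rather than the question's random data (an assumption made here so the totals are predictable), and the variable names are only illustrative:

```python
import numpy as np

# Assumption: an all-ones stand-in for the question's random 0/1 array,
# so every column total is exactly 1000.
x = np.ones((1000, 100), dtype=np.uint8)

# Built-in sum adds the rows with "+", so the running total stays uint8
# and each column total wraps modulo 256: 1000 % 256 == 232.
col_wrapped = sum(x)
print(col_wrapped.dtype, col_wrapped[0])   # uint8 232

# np.sum accumulates small integer dtypes in a wider integer by default.
total = np.sum(x)
print(total)                               # 100000

# Option 1: ask for a wider accumulator explicitly.
col_ok = x.sum(axis=0, dtype=np.int64)
print(col_ok[0])                           # 1000

# Option 2: widen the array before using the built-in sum.
col_ok2 = sum(x.astype(np.int64))
print(col_ok2[0])                          # 1000
```

In practice, preferring np.sum (or the array's own sum method) over the built-in sum avoids the problem and is also faster, since the loop runs inside NumPy instead of in Python.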