
Python's sum not returning same result as NumPy's numpy.sum

Tags:

python

numpy

NumPy's np.sum returns the expected result, but Python's built-in sum does not. It only goes wrong for the uint8 dtype, which makes it even more confusing:

In [1]: import numpy as np                                                      

In [2]: x = np.random.randint(2, size = (1000,100))                             

In [3]: x                                                                       
Out[3]: 
array([[1, 1, 0, ..., 0, 1, 1],
       [1, 1, 1, ..., 0, 0, 0],
       [1, 1, 0, ..., 1, 0, 1],
       ...,
       [1, 0, 0, ..., 1, 0, 1],
       [0, 0, 1, ..., 0, 1, 1],
       [1, 1, 0, ..., 1, 1, 1]])

In [4]: np.sum(x)                                                               
Out[4]: 50318

In [5]: sum(sum(x))                                                             
Out[5]: 50318

In [6]: x = x.astype('uint8')                                                   

In [7]: np.sum(x)                                                               
Out[7]: 50318

In [8]: sum(sum(x))                                                             
Out[8]: 16014
WalksB asked Oct 16 '22 06:10


1 Answer

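By specifying uint8 you are telling NumPy to use 8 bits per element. The largest value an unsigned 8-bit integer can hold is 255, so when the built-in sum adds the rows together element by element, the uint8 totals overflow and wrap around. np.sum does not show this because, for integer types narrower than the platform integer, it accumulates in the platform integer by default.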
Practical example:

>>> import numpy as np
>>> arr = np.array([[255], [1]], dtype=np.uint8)
>>> arr
array([[255],
       [  1]], dtype=uint8)
>>> sum(arr)
array([0], dtype=uint8)
>>> arr[0] + arr[1]
array([0], dtype=uint8)

Note that sum(arr) corresponds to arr[0] + arr[1] in this case. As stated in the docs:

Arithmetic is modular when using integer types, and no error is raised on overflow.
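
To see where the wrap-around happens in a 2-D case like the one in the question, here is a minimal sketch. It assumes an all-ones array as a deterministic stand-in for the question's random data; the names column_totals and wide_totals are illustrative only.

```python
import numpy as np

# Stand-in for the question's data: 1000 rows x 100 columns, all ones.
x = np.ones((1000, 100), dtype=np.uint8)

# The built-in sum adds the rows one at a time. Each row + row result
# stays uint8, so every column total wraps modulo 256 (1000 % 256 == 232).
column_totals = sum(x)
print(column_totals.dtype, column_totals[0])  # uint8 232

# np.sum accumulates narrow integer types in a wider integer by default.
print(np.sum(x))  # 100000

# Widening the array first also keeps the built-in sum from wrapping here.
wide_totals = sum(x.astype(np.int64))
print(int(wide_totals.sum()))  # 100000
```

In practice, calling np.sum(x) or x.sum() is likely the simplest fix. Casting with astype is only needed if the built-in sum has to be used.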

abc answered Nov 15 '22 07:11