
Long (>20million element) array summation in python numpy

Tags:

python

numpy

I am new to Python and NumPy, so please excuse me if this problem is rudimentary. I have a sorted array of negative values:

>>> neg
[ -1.53507843e+02  -1.53200012e+02  -1.43161987e+02 ...,  -6.37326136e-1 -3.97518490e-10  -3.73480691e-10]
>>> neg.shape
(12922508,)

I need to combine this array with a mirrored copy of itself (with positive values) to find the standard deviation of the resulting distribution, whose mean should be zero. So I do the following:

>>> pos = -1 * neg
>>> pos = pos[::-1]  # Just to make it look symmetric for the display below!
>>> total = np.hstack((neg, pos))
>>> total
[-153.50784302 -153.20001221 -143.1619873  ...,  143.1619873   153.20001221  153.50784302]
>>> total.shape
(25845016,)

So far everything is very good, but the strange thing is that the sum of this new array is not zero:

>>> np.sum(total)
11610.6

The standard deviation is also not at all near what I was expecting but I guess the root of that problem is the same as this: Why doesn't the sum result in zero?

When I apply this method to a small array, for example [-5, -3, -2], the sum is zero. So I guess the problem lies in the length of the array (over 20 million elements). Is there any way to deal with this problem?

If anyone could help me with this, I would be most grateful.

makhlaghi asked Dec 22 '11 04:12


1 Answer

As noted in the comments, you get floating-point round-off errors when summing many millions of numbers of the same sign: the running total grows large, so each small addend loses its low-order digits. One possible way around this is to interleave positive and negative numbers in the combined array, so that the intermediate results stay roughly within the same order of magnitude while summing:

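import numpy

neg = -100 * numpy.random.rand(20000000)  # rand() needs an integer size, not the float 20e6
pos = -neg
# Interleave negatives and positives so the partial sums stay small
combined = numpy.zeros(len(neg) + len(pos))
combined[::2] = neg
combined[1::2] = pos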

Now combined.sum() should be pretty close to zero.

Maybe this approach will also help to improve the precision in the computation of the standard deviation.
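Another option, not part of the original answer, is to leave the array order alone and widen the accumulator instead. The sketch below assumes the data is stored as float32 (the question does not state the dtype) and uses a smaller random array as a hypothetical stand-in for the real one. It compares a plain sum with one accumulated in float64 via the `dtype` argument of `sum`, and with `math.fsum`, which returns a correctly rounded result but is much slower on large arrays.

```python
import math
import numpy as np

# Hypothetical stand-in for the question's data: sorted negative float32 values.
rng = np.random.default_rng(0)
neg = np.sort(-100 * rng.random(1_000_000).astype(np.float32))
total = np.hstack((neg, -neg[::-1]))  # mirrored copy, as in the question

naive = total.sum()                  # accumulates in the array's own dtype (float32)
wide = total.sum(dtype=np.float64)   # same data, float64 accumulator
exact = math.fsum(total)             # correctly rounded sum, slow for big arrays

# With the mean known to be zero, the standard deviation reduces to the
# root mean square, computed here in float64.
std = np.sqrt(np.mean(total.astype(np.float64) ** 2))

print(naive, wide, exact, std)
```

How far the plain float32 sum drifts from zero depends on the NumPy version and the data, so it is worth printing all three values on the real array before deciding which approach is needed.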

silvado answered Sep 29 '22 13:09