numpy.mean precision for large arrays

Question

I do not understand why casting a float32 array to a float64 array changes the mean of the array significantly.

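import numpy as np

# 10 million samples uniformly distributed in [1000, 1100), stored as float32
a = (100. * np.random.random_sample(10000000) + 1000.).astype(np.float32)
b = a.astype(np.float64)  # same values, converted to float64
print(np.mean(a), a.dtype, a.shape)
print(np.mean(b), b.dtype, b.shape)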

Result (the mean should be approximately 1050, so the float64 value is correct):

1028.346368   float32 (10000000,)
1049.98284473 float64 (10000000,)

Jaime · Accepted Answer

@bogatron has explained what causes the loss in precision. To get around this kind of problem, np.mean has an optional dtype argument that lets you specify which type to use for the internal computation. So you can do:

>>> np.mean(a)
1028.3446272000001
>>> np.mean(a.astype(np.float64))
1049.9776601123901
>>> np.mean(a, dtype=np.float64)
1049.9776601123901
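
To see where the error comes from, here is a minimal sketch that forces a strictly left-to-right float32 accumulation and compares it with the dtype=np.float64 result. It uses np.cumsum only as a stand-in for naive summation; that choice, the fixed seed, and the variable names are my own assumptions, not something np.mean is documented to do internally. As far as I know, newer NumPy releases use pairwise summation in many reductions, so a plain np.mean(a) may no longer show an error this large.

```python
import numpy as np

rng = np.random.RandomState(0)  # fixed seed so the run is repeatable
a = (100.0 * rng.random_sample(10000000) + 1000.0).astype(np.float32)

# np.cumsum accumulates strictly left to right; dtype=np.float32 keeps the
# running total in single precision, so the last element is a naive float32 sum.
naive_sum32 = np.cumsum(a, dtype=np.float32)[-1]
naive_mean32 = float(naive_sum32) / a.size

# Same data, but accumulated in double precision.
mean64 = np.mean(a, dtype=np.float64)

# Gap between adjacent float32 values near the final sum (about 1e10).
spacing = np.spacing(np.float32(1.0e10))

print(naive_mean32, mean64, spacing)
```

Once the running total is large enough that the gap between adjacent float32 values is comparable to the elements themselves (around 1000 here), each addition is rounded to the nearest representable total, and the mean drifts away from 1050.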

The third case is significantly faster than the second, presumably because it avoids building a temporary float64 copy of the whole array, although it is slower than the first:

In [3]: %timeit np.mean(a)
100 loops, best of 3: 10.9 ms per loop

In [4]: %timeit np.mean(a.astype(np.float64))
10 loops, best of 3: 51 ms per loop

In [5]: %timeit np.mean(a, dtype=np.float64)
100 loops, best of 3: 19.2 ms per loop
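
To repeat this comparison outside IPython, a rough sketch with the standard timeit module could look like the following. The labels, the repeat counts, and the dictionary layout are arbitrary choices for illustration, and the absolute numbers will depend on the machine and the NumPy version.

```python
import timeit

import numpy as np

a = (100.0 * np.random.random_sample(10000000) + 1000.0).astype(np.float32)

# The three variants compared above, wrapped so timeit can call them.
candidates = {
    "np.mean(a)": lambda: np.mean(a),
    "np.mean(a.astype(np.float64))": lambda: np.mean(a.astype(np.float64)),
    "np.mean(a, dtype=np.float64)": lambda: np.mean(a, dtype=np.float64),
}

runs_per_repeat = 3
timings = {}
for label, func in candidates.items():
    # Take the best of three repeats, then convert to seconds per call.
    best = min(timeit.repeat(func, number=runs_per_repeat, repeat=3))
    timings[label] = best / runs_per_repeat
    print("%-32s %.1f ms per call" % (label, timings[label] * 1e3))
```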

Tags: precision, numpy

user1514974
