
Why does "numpy.mean" return 'inf'?

Tags: python, numpy

I need to calculate the column means of an array with more than 1000 rows.

np.mean(some_array) gives me inf as output,

but I am pretty sure the values are fine. I am loading a CSV from here into my Data variable, and the 'Cement' column looks "healthy" to me.

In [254]: np.mean(Data[:230]['Cement'])
Out[254]: 275.75

But if I increase the number of rows, the problem starts:

In [259]: np.mean(Data[:237]['Cement'])
Out[259]: inf

But when I look at the data in that range, the values seem normal:

In [261]: Data[230:237]['Cement']
Out[261]:
 array([[ 425. ],
        [ 333.  ],
        [ 250.25],
        [ 491.  ],
        [ 160.  ],
        [ 229.75],
        [ 338.  ]], dtype=float16)

I cannot find a reason for this behaviour.

P.S. This happens in Python 3.x using Wakari (cloud-based IPython).

NumPy version: '1.8.1'

I am loading the data with:

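import numpy as np

No_Col = 9
# Values use a decimal comma, so convert "1,5" -> 1.5 before parsing.
conv = lambda valstr: float(valstr.replace(',', '.'))

# Apply the same converter to every column.
c = {i: conv for i in range(No_Col)}

# get_data is the input file/handle (defined elsewhere).
Data = np.genfromtxt(get_data, dtype=np.float16, delimiter='\t', skip_header=0, names=True, converters=c)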
asked Jun 19 '14 18:06 by www.pieronigro.de




1 Answer

My guess is that the problem is the limited precision and range of float16 (as others have also commented). Quoting directly from the documentation for mean():

Notes

The arithmetic mean is the sum of the elements along the axis divided by the number of elements.

Note that for floating-point input, the mean is computed using the same precision the input has. Depending on the input data, this can cause the results to be inaccurate, especially for float32 (see example below). Specifying a higher-precision accumulator using the dtype keyword can alleviate this issue.

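Since your array is of type float16, you have very limited precision and range: the largest finite float16 value is 65504. Your first 230 values average about 275.75, so their sum is already around 63,400, and the next seven rows add another 2227, pushing the running sum to roughly 65,600 and past that limit. The sum overflows to inf, and so does the mean. Passing dtype=np.float64 to mean() will probably resolve the overflow. Also see the examples in the mean() documentation.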
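A minimal sketch of the overflow and the suggested fix follows. The array here is a hypothetical stand-in for the 'Cement' column (1000 values of 300.0), not the asker's data. It uses sum() with an explicit float16 accumulator to show the overflow, because newer NumPy releases may accumulate float16 means in higher precision internally, in which case np.mean alone might not reproduce the inf seen in 1.8.1.

```python
import numpy as np

# Hypothetical stand-in for the 'Cement' column: 1000 values of 300.0 as float16.
values = np.full(1000, 300.0, dtype=np.float16)

# Largest finite value a float16 can hold.
f16_max = float(np.finfo(np.float16).max)

# The true sum (300000) exceeds f16_max, so a float16 result overflows to inf.
# errstate only silences the overflow warning; it does not change the result.
with np.errstate(over="ignore"):
    total_f16 = values.sum(dtype=np.float16)

# Asking for a float64 accumulator avoids the overflow.
mean_f64 = np.mean(values, dtype=np.float64)

print("float16 max:", f16_max)
print("sum accumulated as float16:", total_f16)
print("mean with dtype=np.float64:", mean_f64)
```

An alternative, if memory allows, is to load the file as float64 in the first place (for example dtype=np.float64 in genfromtxt), so that every later reduction works with the wider type.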

answered Oct 12 '22 12:10 by Craig J Copi