
Ignore NaN in numpy bincount in python

Tags: python, nan, numpy

I have a 1D array and I want to use numpy bincount to create a histogram. It works OK, but I want it to ignore NaN values.

histogram = np.bincount(distancesArray, weights=intensitiesArray) / np.bincount(distancesArray)

How can I do that?

Thanks for your help!

betelgeuse asked Jun 14 '14 12:06


2 Answers

Here's what I think your problem is:

import numpy

w = numpy.array([0.3, float("nan"), 0.2, 0.7, 1., -0.6]) # weights
x = numpy.array([0, 1, 1, 2, 2, 2])
numpy.bincount(x, weights=w)
#>>> array([ 0.3,  nan,  1.1])

The solution is to use boolean indexing to keep only the entries whose weight is not NaN:

keep = ~numpy.isnan(w)
numpy.bincount(x[keep], weights=w[keep])
#>>> array([ 0.3,  0.2,  1.1])
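
The same masking can be carried over to the per-bin average from the question (weighted sum divided by count). The following is only a minimal sketch under some assumptions: the function name nan_safe_bin_means is hypothetical, x is assumed to be a non-empty array of non-negative integers, and a bin whose weights are all NaN is reported as NaN rather than raising a division warning.

```python
import numpy as np

def nan_safe_bin_means(x, w):
    """Mean of w per integer bin in x, skipping entries where w is NaN.

    Hypothetical helper: assumes x is a non-empty array of
    non-negative integers with the same length as w.
    """
    x = np.asarray(x)
    w = np.asarray(w, dtype=float)
    keep = ~np.isnan(w)

    # Fix the number of bins from the full x so that bins which lose
    # all their samples to the mask still appear in the result.
    n_bins = int(x.max()) + 1
    sums = np.bincount(x[keep], weights=w[keep], minlength=n_bins)
    counts = np.bincount(x[keep], minlength=n_bins)

    # Divide only where a bin has at least one valid sample;
    # the remaining bins keep their NaN placeholder.
    means = np.full(n_bins, np.nan)
    np.divide(sums, counts, out=means, where=counts > 0)
    return means

x = np.array([0, 1, 1, 2, 2, 2])
w = np.array([0.3, float("nan"), 0.2, 0.7, 1.0, -0.6])
print(nan_safe_bin_means(x, w))
```

Passing minlength keeps the sums and counts arrays the same length even when the highest bin contains only NaN weights, which would otherwise shorten both arrays and silently drop that bin.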

Veedrac answered Oct 06 '22 00:10


You cannot have NaN in an integer-valued array, so an array that contains NaN must have a float dtype. If you pass such an array to np.bincount, it raises:

TypeError: Cannot cast array data from dtype('float64') to dtype('int64') according to the rule 'safe'

If you cast anyway (.astype(int)), the NaN entries turn into meaningless values such as -9223372036854775808. You can avoid this by selecting only the non-NaN values before casting:

mask = ~np.logical_or(np.isnan(distancesArray), np.isnan(intensitiesArray))
histogram = np.bincount(distancesArray[mask].astype(int), 
                        weights=intensitiesArray[mask])
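
To try this end to end, here is a small self-contained sketch. The sample values are made up for illustration; only the array names come from the question. It first shows that a float array is rejected, then applies the mask and the cast.

```python
import numpy as np

# Hypothetical sample data: distances are floats because one is NaN.
distancesArray = np.array([0.0, 1.0, np.nan, 2.0, 2.0])
intensitiesArray = np.array([0.5, np.nan, 1.0, 2.0, 4.0])

# A float array is rejected outright, whether or not it holds NaN.
try:
    np.bincount(distancesArray)
except TypeError as exc:
    print("bincount refused float input:", exc)

# Drop every position where either array is NaN, then cast what is left.
mask = ~(np.isnan(distancesArray) | np.isnan(intensitiesArray))
bins = distancesArray[mask].astype(int)
sums = np.bincount(bins, weights=intensitiesArray[mask])
print(sums)
```

With this data only positions 0, 3 and 4 survive the mask, so the weighted sums are 0.5 for bin 0, 0 for bin 1 (its only sample had a NaN intensity) and 6.0 for bin 2.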

Davidmh answered Oct 05 '22 23:10