
Python Numpy mask NaN not working

Tags: python, numpy

I'm simply trying to use a masked array to filter out some NaN entries.

import numpy as np
x = np.array([np.nan, -0.33557216, np.nan])  # sample values matching the output below
x = np.ma.masked_equal(x, np.nan)
print(repr(x))

This outputs the following:

masked_array(data = [        nan -0.33557216         nan],
         mask = False,
   fill_value = nan)

Calling np.isnan() on x returns the correct boolean array, but the mask just doesn't seem to work. Why would my mask not be working as I expect?

chris asked Jan 26 '15 23:01


2 Answers

You can use np.ma.masked_invalid:

import numpy as np

x = [np.nan, 3.14, np.nan]
mx = np.ma.masked_invalid(x)

print(repr(mx))
# masked_array(data = [-- 3.14 --],
#              mask = [ True False  True],
#        fill_value = 1e+20)

Alternatively, use np.isnan(x) as the mask= parameter to np.ma.masked_array:

print(repr(np.ma.masked_array(x, np.isnan(x))))
# masked_array(data = [-- 3.14 --],
#              mask = [ True False  True],
#        fill_value = 1e+20)
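
Once the NaNs are masked, the usual masked-array methods skip them. The following is a minimal sketch of that, assuming a small sample array; the values and variable names are illustrative and not taken from the question.

```python
import numpy as np

# Illustrative sample data (not from the question).
x = np.array([np.nan, 3.14, np.nan, 2.0])

# Build the masked array with np.isnan(x) as the mask.
mx = np.ma.masked_array(x, mask=np.isnan(x))

valid = mx.compressed()   # 1-D ndarray holding only the unmasked values
filled = mx.filled(0.0)   # plain ndarray with masked slots replaced by 0.0
total = mx.sum()          # reductions ignore masked entries
count = mx.count()        # number of unmasked entries

print(valid, filled, total, count)
```

Which of these is appropriate depends on whether the positions of the NaNs still matter: compressed() drops them, while filled() keeps the original shape.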

Why doesn't your original approach work? Because, rather counterintuitively, NaN is not equal to NaN!

print(np.nan == np.nan)
# False

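This is actually part of the IEEE-754 definition of NaN: an equality comparison involving NaN always evaluates to false, so np.ma.masked_equal(x, np.nan) never matches any element and the mask stays empty.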
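
To make the comparison behaviour concrete, here is a small sketch that puts the elementwise equality test next to np.isnan. The sample values and the names eq_mask, nan_mask and self_ne are assumptions chosen for illustration.

```python
import numpy as np

# Illustrative sample data (not from the question).
x = np.array([np.nan, -0.35, np.nan])

# Elementwise equality against NaN is False everywhere,
# so an equality-based mask has nothing to hide.
eq_mask = (x == np.nan)

# np.isnan is the test that actually flags the NaN entries.
nan_mask = np.isnan(x)

# NaN is the only float value that compares unequal to itself.
self_ne = (x != x)

print(eq_mask)
print(nan_mask)
print(self_ne)
```

The last two masks agree with each other, while the equality mask is all False, which is why the mask in the question came back as False.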

ali_m answered Oct 19 '22 05:10


Here is another alternative that does not use a masked array:

import numpy as np
x = np.array([np.nan, -0.35, np.nan])
# Boolean indexing keeps only the entries that are not NaN
xmask = x[np.logical_not(np.isnan(x))]
print(repr(xmask))

Result:

array([-0.35])
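
One difference from masking is worth keeping in mind: boolean indexing returns a flattened copy, so the positions of the removed entries are lost. The sketch below illustrates this on a hypothetical 2-D array; the array and variable names are assumptions for the example.

```python
import numpy as np

# Hypothetical 2-D input with two NaN entries.
a = np.array([[1.0, np.nan],
              [np.nan, 4.0]])

# Boolean indexing selects the non-NaN values into a 1-D array.
kept = a[~np.isnan(a)]

# Masking hides the NaNs but preserves the original (2, 2) shape.
masked = np.ma.masked_invalid(a)

print(kept.shape, masked.shape)
```

If only the valid values are needed, indexing is the simpler option; if the layout of the array matters, a masked array may be the better fit.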

chuseuiti answered Oct 19 '22 07:10