
Python numpy.nan and logical functions: wrong results

I get some surprising results when trying to evaluate logical expressions on data that might contain nan values (as defined in numpy).

I would like to understand why these results arise and how to implement the logic correctly.

What I don't understand is why these expressions evaluate to the value they do:

>>> from numpy import nan
>>> nan and True    # unexpected: I would expect this to evaluate to nan
True
>>> True and nan    # OK
nan
>>> nan and False   # OK: should be False regardless of the first operand
False
>>> False and nan   # OK
False

Similarly for or:

>>> True or nan     # OK
True
>>> nan or True     # unexpected: I would expect True
nan
>>> False or nan    # OK
nan
>>> nan or False    # OK
nan

How can I efficiently implement boolean functions that also handle nan values correctly?

lucacerone asked Jun 24 '13 10:06


1 Answer

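You can use the logical functions from the numpy namespace. They always return a boolean and treat nan as true, because nan is nonzero:

>>> import numpy as np
>>> np.logical_and(True, np.nan), np.logical_and(False, np.nan)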
(True, False)
>>> np.logical_and(np.nan, True), np.logical_and(np.nan, False)
(True, False)
>>>
>>> np.logical_or(True, np.nan), np.logical_or(False, np.nan)
(True, True)
>>> np.logical_or(np.nan, True), np.logical_or(np.nan, False)
(True, True)
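
The same functions work elementwise on arrays, which is where they are usually applied. A minimal sketch with made-up sample values (the arrays a and b are illustrative, not taken from the question); nan again counts as true:

```python
import numpy as np

# Illustrative data: a float array containing nan, and a boolean array.
a = np.array([1.0, 0.0, np.nan])
b = np.array([True, True, False])

both = np.logical_and(a, b)    # elementwise "and"; nan is treated as True
either = np.logical_or(a, b)   # elementwise "or"

print(both.tolist())    # [True, False, False]
print(either.tolist())  # [True, True, True]
```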

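EDIT: The built-in boolean operators behave differently. From the docs: x and y is equivalent to "if x is false, then x, else y". So if the first operand is falsy, and returns that operand itself, not its boolean equivalent; otherwise it returns the second operand. Since nan is truthy (bool(nan) is True), nan and True returns True, and nan or True returns nan. The same rule explains: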

>>> (None and True) is None
True
>>> [] and True
[]
>>> [] and False
[]
>>> 

etc
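
Applied to the question: nan is a nonzero float, so it is truthy, and the built-in operators simply hand back one of their operands. If what is wanted is logic in which nan means "unknown" and propagates unless the other operand already decides the result, one possible sketch follows. The helper names nan_and and nan_or are hypothetical (they are not NumPy functions), and returning floats (1.0, 0.0, nan) rather than booleans is an assumption made for illustration:

```python
import math
import numpy as np

# nan is truthy, so `and` returns its second operand and `or` its first.
assert bool(np.nan) is True
assert (np.nan and True) is True
assert math.isnan(np.nan or True)


def nan_and(x, y):
    """Hypothetical helper: 'and' where nan propagates unless an operand is False."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    is_false = (x == 0) | (y == 0)        # a definite False decides the result
    is_nan = np.isnan(x) | np.isnan(y)
    return np.where(is_false, 0.0, np.where(is_nan, np.nan, 1.0))


def nan_or(x, y):
    """Hypothetical helper: 'or' where nan propagates unless an operand is True."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # A definite True (nonzero and not nan) decides the result.
    is_true = ((x != 0) & ~np.isnan(x)) | ((y != 0) & ~np.isnan(y))
    is_nan = np.isnan(x) | np.isnan(y)
    return np.where(is_true, 1.0, np.where(is_nan, np.nan, 0.0))


print(float(nan_and(np.nan, True)))   # nan
print(float(nan_and(np.nan, False)))  # 0.0
print(float(nan_or(np.nan, True)))    # 1.0
print(float(nan_or(False, np.nan)))   # nan
```

Both helpers are built from vectorized NumPy operations, so they should also accept arrays and apply the same rules elementwise.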

ev-br answered Oct 14 '22 08:10