I want to do an element-wise OR operation on two pandas Series of boolean values. np.nans are also included.
I have tried three approaches and realized that the expression "np.nan or False" can be evaluated to True, False, and np.nan depending on the approach.
These are my example Series:
series_1 = pd.Series([True, False, np.nan])
series_2 = pd.Series([False, False, False])
Using the | operator of pandas:
In [5]: series_1 | series_2
Out[5]:
0 True
1 False
2 False
dtype: bool
Using the logical_or function from numpy:
In [6]: np.logical_or(series_1, series_2)
Out[6]:
0 True
1 False
2 NaN
dtype: object
I define a vectorized version of logical_or, which is supposed to be evaluated row-by-row over the arrays:
@np.vectorize
def vectorized_or(a, b):
    return np.logical_or(a, b)
I use vectorized_or on the two series and convert its output (which is a numpy array) into a pandas Series:
In [8]: pd.Series(vectorized_or(series_1, series_2))
Out[8]:
0 True
1 False
2 True
dtype: bool
I am wondering about the reasons for these results.
This answer explains np.logical_or and says np.logical_or(np.nan, False) is True, but why does this only work when vectorized and not in Approach #2? And how can the results of Approach #1 be explained?
To check for NaN values in a NumPy array you can use the np.isnan() function. It returns a boolean mask of the same shape as the original array: True at the indices that hold NaN in the original array and False everywhere else.
The math.isnan() function checks whether a single value is NaN (Not a Number). It returns True if the value is NaN and False otherwise.
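As a small illustration of the two checks just described, here is a minimal sketch. The array `values` is made up for the example and is not part of the original question.

```python
import math
import numpy as np

# Hypothetical float array with one missing value.
values = np.array([1.0, np.nan, 3.0])

# np.isnan works element-wise and returns a boolean mask of the same shape.
mask = np.isnan(values)
print(mask.tolist())   # [False, True, False]

# math.isnan checks a single scalar value.
scalar_is_nan = math.isnan(float("nan"))
print(scalar_is_nan)   # True
```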
First difference: | is np.bitwise_or. This explains the difference between #1 and #2.
Second difference: since series_1.dtype is object (non-homogeneous data), operations are done row by row in the first two cases.
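The row-by-row behaviour on object data can be sketched with a plain NumPy object array standing in for series_1. The arrays `a` and `b` below are illustrative stand-ins, not the original Series. On object dtype, np.logical_or appears to follow Python's `or`, which returns its first operand when that operand is truthy, so NaN comes back unchanged, as in Approach #2:

```python
import math
import numpy as np

# Object array standing in for series_1 (mixed bool and float NaN).
a = np.array([True, False, np.nan], dtype=object)
b = np.array([False, False, False])

# Plain Python: NaN is truthy, so `or` returns it unchanged.
python_or = np.nan or False
print(python_or)       # nan

# On object dtype the ufunc works element by element on Python objects
# and keeps an object result instead of casting to bool.
result = np.logical_or(a, b)
print(result.dtype)    # object
print(result)
```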
When using vectorize (#3):
The data type of the output of vectorized is determined by calling the function with the first element of the input. This can be avoided by specifying the otypes argument.
For vectorized operations, you leave object mode. The data are first converted according to the first element (bool here; bool(nan) is True) and the operations are done afterwards.
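To make the bool(nan) point concrete, here is a minimal sketch that reuses the vectorized_or function from the question. The input arrays `a` and `b` are stand-ins for the two Series, assumed only for illustration.

```python
import numpy as np

# NaN is a non-zero float, so it is truthy.
nan_as_bool = bool(np.nan)
print(nan_as_bool)      # True

# On scalars, logical_or treats NaN as True.
scalar_result = np.logical_or(np.nan, False)
print(scalar_result)    # True

@np.vectorize
def vectorized_or(a, b):
    return np.logical_or(a, b)

# Stand-ins for series_1 and series_2.
a = np.array([True, False, np.nan], dtype=object)
b = np.array([False, False, False])

# The output dtype is taken from the first call, which returns a bool,
# so the NaN row ends up as True rather than NaN.
result = vectorized_or(a, b)
print(result.dtype)     # bool
print(result.tolist())  # [True, False, True]
```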