In pure Python, None or True returns True.
However, with pandas, when I apply | between two Series containing None values, the results are not what I expected:
>>> df.to_dict()
{'buybox': {0: None}, 'buybox_y': {0: True}}
>>> df
   buybox  buybox_y
0    None      True
>>> df['buybox'] = (df['buybox'] | df['buybox_y'])
>>> df
   buybox  buybox_y
0   False      True
Expected result:
>>> df
   buybox  buybox_y
0    True      True
I get the result I want by applying the OR operation twice, but I don't understand why I should have to.
I'm not looking for a workaround (I already have one: applying df['buybox'] = (df['buybox'] | df['buybox_y']) twice in a row) but for an explanation, hence the 'why' in the title.
The pandas | operator does not rely on Python's or expression, and it behaves differently.
If both operands are boolean, the result is mathematically defined and is the same in Python and pandas.
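As a minimal sketch of that difference in plain Python (no pandas involved): or works on truthiness and returns one of its operands, while | is a bitwise operator that None does not support. The variable names here are only illustrative.

```python
# `or` works on truthiness: None is falsy, so the right operand is returned.
result_or = None or True

# `|` is a bitwise operator; NoneType and bool do not define it for this
# pairing, so Python raises TypeError instead of returning a value.
try:
    result_pipe = None | True
except TypeError as exc:
    result_pipe = f"TypeError: {exc}"

print(result_or)
print(result_pipe)
```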
But in your case, the Series "buybox" has dtype object and "buybox_y" has dtype bool. In this case the pandas | operator is not commutative:
a bitwise or is attempted element by element;
None | True is an invalid operation, resulting in None, which shows up as False in the output.
Thus,
>>> df['buybox'] | df['buybox_y']
0 False
>>> df['buybox_y'] | df['buybox']
0 True
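To confirm that the two columns really have different dtypes, you can inspect them directly. This is a small sketch that rebuilds the one-row frame from the question; the explicit dtype=object is an assumption that mirrors how a column holding only None is normally stored.

```python
import pandas as pd

# Rebuild the frame from the question: one None, one True.
df = pd.DataFrame({
    "buybox": pd.Series([None], dtype=object),
    "buybox_y": [True],
})

# Map each column name to its dtype name to see the mismatch.
dtypes = df.dtypes.astype(str).to_dict()
print(dtypes)
```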
For predictable results, you can clean up the data and cast it to a boolean type with pandas astype before attempting boolean operations.
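One possible way to do that cleanup is sketched below. It assumes a missing "buybox" value should count as False, which is a choice you have to make for your own data; the names buybox_clean, combined and combined_reversed are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "buybox": pd.Series([None], dtype=object),
    "buybox_y": [True],
})

# Assumption: a missing value means False.
# Casting to the nullable "boolean" dtype turns None into <NA>,
# fillna(False) makes the replacement explicit,
# and astype(bool) yields a plain bool column.
buybox_clean = df["buybox"].astype("boolean").fillna(False).astype(bool)

# With two bool Series, | gives the same result in either order.
combined = buybox_clean | df["buybox_y"]
combined_reversed = df["buybox_y"] | buybox_clean

print(combined.tolist(), combined_reversed.tolist())
```

Once both operands are bool, the order of the operands no longer matters.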