Pandas: Sum of two boolean series

Tags:

python

pandas

In Python:

In [1]: True+True
Out[1]: 2

So after the following set-up:

import pandas as pd
ser1 = pd.Series([True,True,False,False])
ser2 = pd.Series([True,False,True,False])

What I want is to find the element-wise sum of ser1 and ser2, with the booleans treated as integers for addition as in the Python example.

But Pandas treats the addition as an element-wise "or" operator, and gives the following (undesired) output:

In [5]: ser1+ser2
*/lib/python2.7/site-packages/pandas/computation/expressions.py:184: UserWarning: evaluating in Python space because the '+' operator is not supported by numexpr for the bool dtype, use '|' instead
  unsupported[op_str]))
Out[5]: 
0     True
1     True
2     True
3    False
dtype: bool

I know I can get my desired output by calling astype(int) on either series:

In [6]: ser1.astype(int) + ser2
Out[6]: 
0    2
1    1
2    1
3    0
dtype: int64

Is there another (more "pandonic") way to get the [2,1,1,0] series? Is there a good explanation for why simple Series addition doesn't work here?

exp1orer asked Aug 13 '14 16:08


1 Answer

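If you want a logical result, use & (element-wise "and") or | (element-wise "or") instead of +. Note that both return a boolean series rather than the integer sum [2,1,1,0]; for that, the astype(int) cast from the question is still needed. For example, with &: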

import pandas as pd
ser1 = pd.Series([True,True,False,False])
ser2 = pd.Series([True,False,True,False]) 

print(ser1 & ser2) 

>> 0     True
>> 1    False
>> 2    False
>> 3    False
>> dtype: bool
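
For comparison, here is a minimal sketch that puts the logical operators next to the integer sum the question asks for. It assumes a reasonably recent pandas and NumPy; the variable names (bool_sum, both, either, total) are only illustrative. The first lines show the likely reason the plain + stays boolean: NumPy treats addition of two boolean arrays as a logical "or".

```python
import numpy as np
import pandas as pd

ser1 = pd.Series([True, True, False, False])
ser2 = pd.Series([True, False, True, False])

# NumPy defines + on two boolean arrays as logical OR, so the dtype stays bool.
bool_sum = np.array([True, True]) + np.array([True, False])
print(bool_sum.dtype)  # bool

# Logical operators: both results keep dtype bool.
both = ser1 & ser2    # element-wise AND
either = ser1 | ser2  # element-wise OR

# Integer sum: cast to int first so the addition is done on integers.
total = ser1.astype(int) + ser2.astype(int)
print(total.tolist())  # [2, 1, 1, 0]
```

Casting both series rather than just one is a matter of taste; it makes the intent explicit and does not rely on type promotion between an integer and a boolean operand.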
Charles Clayton answered Sep 22 '22 08:09