
Logical operation on two columns of a dataframe

In pandas, I'd like to create a computed column that's a boolean operation on two other columns.

In pandas, it's easy to add together two numerical columns. I'd like to do something similar with logical operator AND. Here's my first try:

    In [1]: d = pandas.DataFrame([{'foo':True, 'bar':True}, {'foo':True, 'bar':False}, {'foo':False, 'bar':False}])

    In [2]: d
    Out[2]:
         bar    foo
    0   True   True
    1  False   True
    2  False  False

    In [3]: d.bar and d.foo   ## can't
    ...
    ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

So I guess logical operators don't work quite the same way as numeric operators in pandas. I tried doing what the error message suggests and using bool():

    In [258]: d.bar.bool() and d.foo.bool()  ## spoiler: this doesn't work either
    ...
    ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I found a way that works: cast the boolean columns to int, add them together, and compare the sum against a threshold.

    In [4]: (d.bar.apply(int) + d.foo.apply(int)) > 0  ## Logical OR
    Out[4]:
    0     True
    1     True
    2    False
    dtype: bool

    In [5]: (d.bar.apply(int) + d.foo.apply(int)) > 1  ## Logical AND
    Out[5]:
    0     True
    1    False
    2    False
    dtype: bool

This is convoluted. Is there a better way?

asked Jan 27 '16 17:01 by dinosaur



2 Answers

Yes, there is a better way! Just use &, the element-wise logical AND operator:

    d.bar & d.foo

    0     True
    1    False
    2    False
    dtype: bool
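To get the computed column the question asks for, the result of & can be assigned back to the DataFrame. The sketch below rebuilds the question's data and also shows | (OR) and ~ (NOT), which behave the same element-wise way. The new column names (both, either, not_bar, foo_only) are made up for illustration.

```python
import pandas as pd

# Same data as in the question.
d = pd.DataFrame({"foo": [True, True, False], "bar": [True, False, False]})

# Each operator works element-wise and returns a boolean Series
# aligned on the index, so it can be stored as a new column.
d["both"] = d.bar & d.foo        # logical AND
d["either"] = d.bar | d.foo      # logical OR
d["not_bar"] = ~d.bar            # logical NOT
d["foo_only"] = d.foo & ~d.bar   # operators can be combined

print(d)
```

Python's and/or keywords try to reduce each Series to a single True or False, which is why they raise the ValueError shown in the question; the &, | and ~ operators avoid that by working row by row.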
answered Sep 21 '22 21:09 by Kirell


There is also another way: you can simply multiply the columns for AND or add them for OR, without the int conversion and extra comparison you used.

AND operation:

d.foo * d.bar 

OR operation:

d.foo + d.bar  
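As a rough check of this approach, the sketch below compares the arithmetic forms with & and | on the question's data. It assumes both columns have bool dtype; the variable names are illustrative. If the columns hold integers instead, + produces sums rather than truth values, so a comparison is needed again.

```python
import pandas as pd

# Same data as in the question; both columns are bool dtype.
d = pd.DataFrame({"foo": [True, True, False], "bar": [True, False, False]})

and_result = d.foo * d.bar   # multiplication acts as AND on bool columns
or_result = d.foo + d.bar    # addition acts as OR on bool columns

# On bool columns these agree with the dedicated operators.
print((and_result == (d.foo & d.bar)).all())
print((or_result == (d.foo | d.bar)).all())

# Caveat: with integer columns, + yields counts, not booleans.
ints = d.astype(int)
int_sum = ints.foo + ints.bar   # values can exceed 1
or_from_ints = int_sum > 0      # compare to get booleans back
```

The & and | operators state the intent more directly and do not depend on the column dtype in this way, so they are usually the safer choice.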
answered Sep 20 '22 21:09 by Mbuthia