
Boolean Indexing with multiple conditions [duplicate]

Tags:

python

pandas

I have a pandas DataFrame in which I need to filter out the rows that contain values == 0 for feature 'a' and feature 'b'.

In order to inspect the values, I run the following:

DF1 = DF[DF['a'] == 0] 

Which returns the right values. Similarly, by doing this:

DF2 = DF[DF['b'] == 0] 

I can see the 0 values for feature 'b'.

However, if I try to combine these two in a single line of code using the OR operator:

DF3 = DF[DF['a'] == 0 |  DF['b'] == 0] 

I get this:

TypeError: cannot compare a dtyped [float64] array with a scalar of type [bool] 

What's happening here?

user5730994 asked Dec 30 '15 14:12



1 Answer

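The error comes from operator precedence: in Python, | binds more tightly than ==, so the expression is read as DF['a'] == (0 | DF['b']) == 0, and 0 | DF['b'] is attempted on a float64 column. There is no need to convert column 'a' or 'b' to another dtype; wrapping each comparison in parentheses fixes the problem and preserves the data type of your features: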

DF3 = DF[(DF['a'] == 0) | (DF['b'] == 0)] 

A common operation is the use of boolean vectors to filter the data. The operators are: | for or, & for and, and ~ for not. These must be grouped by using parentheses.
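As a rough illustration of the three operators, here is a minimal sketch. The sample values are made up for this example; only the column names 'a' and 'b' come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names follow the question.
DF = pd.DataFrame({
    'a': [0.0, 1.5, 2.0, 0.0],
    'b': [3.0, 0.0, 4.0, 0.0],
})

# OR: rows where 'a' is 0 or 'b' is 0 (each comparison in its own parentheses).
either_zero = DF[(DF['a'] == 0) | (DF['b'] == 0)]

# AND: rows where both 'a' and 'b' are 0.
both_zero = DF[(DF['a'] == 0) & (DF['b'] == 0)]

# NOT: rows where neither 'a' nor 'b' is 0.
no_zero = DF[~((DF['a'] == 0) | (DF['b'] == 0))]

print(either_zero.index.tolist())  # [0, 1, 3]
print(both_zero.index.tolist())    # [3]
print(no_zero.index.tolist())      # [2]
```

If the goal is to drop the rows containing zeros rather than inspect them, the ~ form (no_zero above) is probably the one you want.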

Luis Miguel answered Sep 22 '22 23:09
