Remove rows that contain False in a column of pandas dataframe

I assume this is an easy fix and I'm not sure what I'm missing. I have a data frame as such:

                  index     c1     c2     c3
    2015-03-07 01:27:05  False  False   True
    2015-03-07 01:27:10  False  False   True
    2015-03-07 01:27:15  False  False  False
    2015-03-07 01:27:20  False  False   True
    2015-03-07 01:27:25  False  False  False
    2015-03-07 01:27:30  False  False   True

I want to remove any rows that contain False in c3. c3 has dtype=bool. I keep running into problems because it's a boolean rather than a string, int, etc., which I haven't handled before.

Yolo_chicken asked May 13 '16 15:05


1 Answer

Pandas deals with booleans in a really neat, straightforward manner:

df = df[df.c3] 

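This does the same thing with .loc, which makes the row selection explicit. Both forms return a new DataFrame rather than a view of the original, so neither one avoids a copy: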

df = df.loc[df.c3, :] 

When you're filtering dataframes using df[...], you usually write an expression that evaluates to a boolean Series (like df.x > 2). But in this case, since the column is already boolean, you can just put df.c3 in on its own, which will get you all the rows where c3 is True.
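As a minimal sketch, the snippet below rebuilds the sample frame from the question and applies both forms. The construction code and the names kept and kept_loc are assumptions for illustration; the question only shows the printed frame.

```python
import pandas as pd

# Rebuild the sample data from the question. This construction is an
# assumption: the question only shows the printed frame.
index = pd.to_datetime([
    "2015-03-07 01:27:05",
    "2015-03-07 01:27:10",
    "2015-03-07 01:27:15",
    "2015-03-07 01:27:20",
    "2015-03-07 01:27:25",
    "2015-03-07 01:27:30",
])
df = pd.DataFrame(
    {
        "c1": [False] * 6,
        "c2": [False] * 6,
        "c3": [True, True, False, True, False, True],
    },
    index=index,
)

# c3 is already a boolean Series, so it works directly as a row mask.
kept = df[df.c3]

# The .loc form selects the same rows.
kept_loc = df.loc[df.c3, :]

print(kept)
```

Both kept and kept_loc hold the four rows where c3 is True; df itself is unchanged until you assign the result back to it.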

If you wanted to get the opposite (as the original title to your question implied), you could use df[~df.c3] or df.loc[~df.c3, :], where the ~ inverts the booleans.
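To illustrate the inversion, here is a small sketch on a hypothetical one-column frame that reuses the c3 values from the question, with a default integer index instead of the timestamps:

```python
import pandas as pd

# Hypothetical frame with only the c3 column from the question.
df = pd.DataFrame({"c3": [True, True, False, True, False, True]})

# ~ flips each boolean in the mask, so this keeps the rows where c3 is False.
only_false = df[~df.c3]

# The .loc form selects the same rows.
only_false_loc = df.loc[~df.c3, :]

print(only_false.index.tolist())  # [2, 4]
```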

For more on boolean indexing in Pandas, see the docs. Thanks to @Mr_and_Mrs_D for the suggestion about .loc.

ASGM answered Sep 20 '22 07:09