
Pandas DataFrame: drop rows based on groupby counts

I have a DataFrame with three columns: Date, Advertiser and ID. I first grouped the data to see whether the volumes of some advertisers are too small (for example, when count() is less than 500). I then want to drop those rows from the grouped table.

df.groupby(['Date','Advertiser']).ID.count()

The result looks like this:

 Date         Advertiser
 2016-01        A             50000
                B               50
                C              4000
                D             24000
 2016-02        A              6800
                B              7800
                C               123
 2016-03        B              1111
                E              8600
                F               500

I want the result to look like this:

 Date         Advertiser
 2016-01        A             50000
                C              4000
                D             24000
 2016-02        A              6800
                B              7800
 2016-03        B              1111
                E              8600

Follow-up question:

What if I want to filter the rows of the groupby result by the total count() per date instead? For example, I want to keep only the dates whose total count() is larger than 15000. The table I want looks like this:

 Date         Advertiser
 2016-01        A             50000
                B               50
                C              4000
                D             24000
 2016-02        A              6800
                B              7800
                C               123
Zed Fang asked Mar 23 '17 03:03

1 Answer

You have a Series object after the groupby, which can be filtered based on value with a chained lambda filter:

df.groupby(['Date','Advertiser']).ID.count()[lambda x: x >= 500]

#Date     Advertiser
#2016-01  A             50000
#         C              4000
#         D             24000
#2016-02  A              6800
#         B              7800
#2016-03  B              1111
#         E              8600
#         F               500
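Note that `>= 500` keeps F (exactly 500), while the desired table in the question leaves F out; switch to `> 500` if a count of exactly 500 should be dropped too. If the goal is to drop the underlying rows from `df` rather than only from the grouped counts, one possible approach is `transform("count")`, which gives every row the size of its own group. A minimal sketch, assuming a small made-up sample (the data and the `MIN_COUNT` name are hypothetical, with 3 standing in for 500):

```python
import pandas as pd

# Hypothetical sample: small counts stand in for the real volumes.
counts = {
    ("2016-01", "A"): 5,
    ("2016-01", "B"): 1,
    ("2016-02", "A"): 3,
    ("2016-02", "C"): 2,
}
rows = [(date, adv, i) for (date, adv), n in counts.items() for i in range(n)]
df = pd.DataFrame(rows, columns=["Date", "Advertiser", "ID"])

MIN_COUNT = 3  # hypothetical threshold, stands in for 500

# For every row, the number of IDs in its (Date, Advertiser) group.
group_size = df.groupby(["Date", "Advertiser"])["ID"].transform("count")

# Keep only the rows whose group meets the threshold.
filtered_df = df[group_size >= MIN_COUNT]

# The grouped counts, filtered the same way as in the answer.
filtered_counts = (
    df.groupby(["Date", "Advertiser"])["ID"].count().loc[lambda s: s >= MIN_COUNT]
)
print(filtered_counts)
```

`filtered_df` still has one row per ID, so it can be grouped again or passed on to later steps.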
Psidom answered Sep 21 '22 20:09
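
The answer above does not cover the follow-up question (filtering by the total count per date). One way to approach it, sketched here under assumptions, is to sum the grouped counts per Date and broadcast that total back with `transform("sum")`. The sample data and the `MIN_DATE_TOTAL` name are hypothetical, with 5 standing in for 15000. Also note that with the numbers shown in the question, 2016-02 sums to 14723, so a strict `> 15000` test would keep only 2016-01.

```python
import pandas as pd

# Hypothetical sample: small counts stand in for the real volumes.
counts = {
    ("2016-01", "A"): 6,
    ("2016-01", "B"): 1,
    ("2016-02", "A"): 2,
    ("2016-02", "C"): 1,
    ("2016-03", "B"): 6,
}
rows = [(date, adv, i) for (date, adv), n in counts.items() for i in range(n)]
df = pd.DataFrame(rows, columns=["Date", "Advertiser", "ID"])

MIN_DATE_TOTAL = 5  # hypothetical threshold, stands in for 15000

# Count per (Date, Advertiser), as in the question.
grouped = df.groupby(["Date", "Advertiser"])["ID"].count()

# Total per Date, repeated for every (Date, Advertiser) entry of that date.
date_total = grouped.groupby(level="Date").transform("sum")

# Keep all advertisers of a date when the date's total is large enough.
result = grouped[date_total > MIN_DATE_TOTAL]
print(result)

# Same idea applied to the raw rows instead of the grouped counts.
rows_kept = df[df.groupby("Date")["ID"].transform("count") > MIN_DATE_TOTAL]
```

Small advertisers (B in 2016-01 here) stay in the result, because the test is on the date total and not on the individual group.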