
Remove group if contains record with status 300

Tags: python, pandas

I would like to group the records in df1 by ID and drop an entire group if any of its records has STATUS = 300.

import pandas as pd


df1 = pd.DataFrame(
    {
        "ID": ["A0", "A0", "A0", "A1", "A1", "A1", "A2", "A2", "A2"],
        "STATUS": [100, 100, 300, 100, 100, 100, 300, 100, 100],
    },
    index=[0, 1, 2, 3, 4, 5, 6, 7, 8],
)

output:

   ID  STATUS
0  A0     100
1  A0     100
2  A0     300
3  A1     100
4  A1     100
5  A1     100
6  A2     300
7  A2     100
8  A2     100

I would like to get:

   ID  STATUS
0  A1     100
1  A1     100
2  A1     100

I tried: dfnew = df1.groupby('ID').filter(lambda x: x['STATUS'] != 300)

But I got the error: TypeError: filter function returned a Series, but expected a scalar bool

asked Dec 10 '22 23:12 by datasciencebegginer


2 Answers

df1.groupby('ID').filter(lambda x: 300 not in x['STATUS'].to_list())
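The attempt in the question fails because `x['STATUS'] != 300` yields one boolean per row, while `filter` expects a single True/False per group. As a minimal sketch of an equivalent variant (not the answerer's exact code), the row-wise comparison can be reduced with `.all()`. The `reset_index(drop=True)` step is an assumption about what the asker wants, added only because the desired output shows the index renumbered from 0:

```python
import pandas as pd

df1 = pd.DataFrame(
    {
        "ID": ["A0", "A0", "A0", "A1", "A1", "A1", "A2", "A2", "A2"],
        "STATUS": [100, 100, 300, 100, 100, 100, 300, 100, 100],
    }
)

# ne(300) gives one boolean per row; all() collapses it to the single
# True/False per group that filter() requires.
dfnew = df1.groupby("ID").filter(lambda g: g["STATUS"].ne(300).all())

# Assumption: renumber the surviving rows 0..n-1 to match the desired output.
dfnew = dfnew.reset_index(drop=True)
print(dfnew)
```

Without the `reset_index` call the surviving rows keep their original labels 3, 4 and 5.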
answered Dec 21 '22 22:12 by RJ Adriaansen


An efficient method to match any value from a list (see OP's comment) is to use isin coupled with groupby+transform:

df1[~df1['STATUS'].isin([300, 500]).groupby(df1['ID']).transform('any')]

output:

   ID  STATUS
3  A1     100
4  A1     100
5  A1     100
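To make the one-liner easier to follow, here is the same idea split into steps. This is only a sketch: the names `bad_statuses`, `row_is_bad` and `group_is_bad` are made up for illustration, and the list `[300, 500]` is simply the one used in the answer.

```python
import pandas as pd

df1 = pd.DataFrame(
    {
        "ID": ["A0", "A0", "A0", "A1", "A1", "A1", "A2", "A2", "A2"],
        "STATUS": [100, 100, 300, 100, 100, 100, 300, 100, 100],
    }
)

# Hypothetical name; the values are the ones used in the answer.
bad_statuses = [300, 500]

# Step 1: flag each row whose STATUS is in the list.
row_is_bad = df1["STATUS"].isin(bad_statuses)

# Step 2: spread the flag over the whole group. Every row of an ID gets
# True if at least one row of that ID was flagged.
group_is_bad = row_is_bad.groupby(df1["ID"]).transform("any")

# Step 3: keep only the rows that belong to unflagged groups.
result = df1[~group_is_bad]
print(result)
```

Because `transform` returns a Series aligned with the original rows, the result is a plain boolean mask and no Python-level lambda runs per group, which is likely why this tends to scale better than `filter` on frames with many groups.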
answered Dec 22 '22 00:12 by mozway