
How to drop rows from pandas based on several numerical value conditions within subsets of each dataframe?

Tags:

python

pandas

I have a df that looks like this:

Account 1   Pre       9
Account 1   Pre       9
Account 1   During    5
Account 1   Post      5
Account 1   Post      5
Account 2   Pre       11
Account 2   During    9
Account 2   Post      7
Account 2   Post      7
Account 2   Post      7
Account 2   Post      7
Account 3   Pre       1
Account 3   During    2
Account 3   During    2
Account 3   Post      3

I am trying to drop all rows for each account if Pre, During, and Post are all less than 10. So in the example above we would lose all of the Account 1 rows and all of the Account 3 rows, but keep all Account 2 rows because there is a single row that has 11.

I'm relatively new to pandas and python but I'm thinking something following the logic below might work:

for each Account in Account:
    if 'Pre' > 10 AND 'During' > 10 AND 'Post' > 10
    return (df_updated)

I believe this df_updated should be composed of only the Account 2 rows. I don't think I can just take the results of this for loop and return a new df directly, though, so I am not quite sure how to do this.

Thank you for any help you can provide!

wharfchillin asked Dec 05 '25 18:12


2 Answers

Data

print(df)

     Account  Status  Count
0   Account1     Pre      9
1   Account1     Pre      9
2   Account1  During      5
3   Account1    Post      5
4   Account1    Post      5
5   Account2     Pre     11
6   Account2  During      9
7   Account2    Post      7
8   Account2    Post      7
9   Account2    Post      7
10  Account2    Post      7
11  Account3     Pre      1
12  Account3  During      2
13  Account3  During      2
14  Account3    Post      3



df[df.groupby('Account')['Count'].transform(lambda x: x.gt(10).any())]



     Account  Status  Count
5   Account2     Pre     11
6   Account2  During      9
7   Account2    Post      7
8   Account2    Post      7
9   Account2    Post      7
10  Account2    Post      7
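
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame and applies the same group-wise test. The frame construction is my own reconstruction of the data shown above. `transform("max")` and `groupby(...).filter(...)` are swapped-in variants, not the exact expression from this answer, though they should select the same rows here. If "less than 10" in the question means an account with exactly 10 should be kept, `ge(10)` would be the comparison to use instead of `gt(10)`.

```python
import pandas as pd

# Sample data rebuilt from the question; column names follow this answer.
df = pd.DataFrame({
    "Account": ["Account1"] * 5 + ["Account2"] * 6 + ["Account3"] * 4,
    "Status": ["Pre", "Pre", "During", "Post", "Post",
               "Pre", "During", "Post", "Post", "Post", "Post",
               "Pre", "During", "During", "Post"],
    "Count": [9, 9, 5, 5, 5, 11, 9, 7, 7, 7, 7, 1, 2, 2, 3],
})

# Broadcast each account's maximum Count back onto its rows, then compare.
# The mask is True for every row of an account with at least one Count > 10.
keep = df.groupby("Account")["Count"].transform("max").gt(10)
result = df[keep]

# Alternative: groupby.filter keeps or drops whole groups at once.
alt = df.groupby("Account").filter(lambda g: g["Count"].gt(10).any())

print(result)
```

Both `result` and `alt` keep the original row index, so they can be compared directly with `result.equals(alt)`.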
wwnde answered Dec 08 '25 08:12


Let's say your df has 3 columns:

accountname type      value
Account 1   Pre       9
Account 1   Pre       9
Account 1   During    5
Account 1   Post      5
Account 1   Post      5
Account 2   Pre       11
Account 2   During    9
Account 2   Post      7
Account 2   Post      7
Account 2   Post      7
Account 2   Post      7
Account 3   Pre       1
Account 3   During    2
Account 3   During    2
Account 3   Post      3   

You don't need such a complicated script; you can filter it with:

df = df[df['accountname'].isin(df.loc[df['value'] > 10, 'accountname'])]

output:

Account 2   Pre       11
Account 2   During    9
Account 2   Post      7
Account 2   Post      7
Account 2   Post      7
Account 2   Post      7
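
The same `isin` idea can be wrapped in a small helper so the column names and threshold are not hard-coded. This is only a sketch: the function name `keep_groups_with_value_above` and its parameters are hypothetical, and the four-row frame is a shortened stand-in for the data above.

```python
import pandas as pd

def keep_groups_with_value_above(df, group_col, value_col, threshold):
    """Keep all rows of every group that has at least one value > threshold."""
    # Group labels that own at least one qualifying row.
    hits = df.loc[df[value_col] > threshold, group_col].unique()
    # Keep every row whose group label is among those hits.
    return df[df[group_col].isin(hits)]

# Shortened stand-in for the sample data (assumed column names).
df = pd.DataFrame({
    "accountname": ["Account 1", "Account 2", "Account 2", "Account 3"],
    "type": ["Pre", "Pre", "Post", "During"],
    "value": [9, 11, 7, 2],
})

filtered = keep_groups_with_value_above(df, "accountname", "value", 10)
print(filtered)
```

Only the two "Account 2" rows survive, including the row with value 7, because the test is applied per account rather than per row.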
Mehdi Golzadeh answered Dec 08 '25 07:12
