I have a pandas DataFrame with several groups, and I would like to exclude the groups in which a condition on a specific column is not met. For example, group B should be deleted here because it has a non-number value (inf) in column "crit1".
I could delete specific columns based on a condition, e.g. df.loc[:, (df > 0).any(axis=0)],
but that doesn't delete the whole group.
Somehow I can't make the next step and apply this to the whole group.
name crit1 crit2
A 0.3 4
A 0.7 6
B inf 4
B 0.4 3
So the result after this filtering (allow only floats) should be:
A 0.3 4
A 0.7 6
You can use groupby and filter; for the example you give, you can check if np.inf exists in a group and filter on the condition:
import pandas as pd
import numpy as np
df.groupby('name').filter(lambda g: (g != np.inf).all().all())
# name crit1 crit2
# 0 A 0.3 4
# 1 A 0.7 6
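For anyone who wants to run this directly, here is a minimal self-contained sketch. It assumes the frame from the question is rebuilt by hand, and it compares only the two numeric columns (an adjustment made here, not part of the snippet above) so that the string column "name" is never compared with np.inf. The variable name kept is just illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question (values copied from the table).
df = pd.DataFrame({
    "name": ["A", "A", "B", "B"],
    "crit1": [0.3, 0.7, np.inf, 0.4],
    "crit2": [4, 6, 4, 3],
})

# Keep a group only if none of its numeric values equals np.inf.
# The first .all() reduces each column, the second reduces across columns.
kept = df.groupby("name").filter(
    lambda g: (g[["crit1", "crit2"]] != np.inf).all().all()
)
print(kept)
```

filter calls the lambda once per group and expects a single True or False back, which is why the double .all() is needed when more than one column is checked.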
If the predicate only applies to one column, you can access that column on the group g (here g.crit1), for example:
df.groupby('name').filter(lambda g: (g.crit1 != np.inf).all())
# name crit1 crit2
# 0 A 0.3 4
# 1 A 0.7 6
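If "allow only floats" should also rule out -inf and NaN, one possible variation is to test with np.isfinite instead of comparing against np.inf. The sketch below is an assumption about what is wanted, not part of the answer above: it adds a hypothetical group C containing a NaN, and it builds a boolean mask with transform("all") rather than calling filter with a lambda. The names is_finite, group_ok and result are illustrative.

```python
import numpy as np
import pandas as pd

# Example data from the question, plus a hypothetical group C with a NaN.
df = pd.DataFrame({
    "name": ["A", "A", "B", "B", "C", "C"],
    "crit1": [0.3, 0.7, np.inf, 0.4, np.nan, 1.2],
    "crit2": [4, 6, 4, 3, 5, 7],
})

# True where crit1 is an ordinary number (not inf, -inf or NaN).
is_finite = np.isfinite(df["crit1"])

# Broadcast "every value in the group is finite" back to each row.
group_ok = is_finite.groupby(df["name"]).transform("all")

# Boolean indexing keeps only rows whose whole group passed the check.
result = df[group_ok]
print(result)
```

Because the mask is computed with built-in groupby operations instead of a Python lambda per group, this form may also be faster on frames with many groups, though that depends on the data.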