Below is a subset of a pandas dataframe I have, and I am trying to remove multiple rows based on some conditions.
  code1 code2 grp1 grp2  dist_km
0  M001  M002  AAA  AAA      112
1  M001  M003  AAA  IHH      275
2  M002  M005  AAA  XXY      150
3  M002  M004  AAA  AAA       65
4  M003  M443  IHH  GRR       50
5  M003  M667  IHH  IHH      647
6  M003  M664  IHH  FFG      336
So I would only like to keep the rows where grp1 is the same as grp2 for each code1, but only where dist_km is the smallest value for that specific code1.
For the example above, only these rows will remain:
  code1 code2 grp1 grp2  dist_km
0  M001  M002  AAA  AAA      112
3  M002  M004  AAA  AAA       65
What would be the easiest way to do this?
There is no need for groupby here. Use sort_values with drop_duplicates to keep the smallest dist_km row for each code1, then filter with query:
df.sort_values('dist_km').drop_duplicates('code1').query('grp1==grp2')
  code1 code2 grp1 grp2  dist_km
3  M002  M004  AAA  AAA       65
0  M001  M002  AAA  AAA      112
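For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies the same chain. The variable name result and the trailing sort_index() call are additions for illustration, not part of the one-liner above; sort_index() just puts the surviving rows back in their original order.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "code1": ["M001", "M001", "M002", "M002", "M003", "M003", "M003"],
    "code2": ["M002", "M003", "M005", "M004", "M443", "M667", "M664"],
    "grp1": ["AAA", "AAA", "AAA", "AAA", "IHH", "IHH", "IHH"],
    "grp2": ["AAA", "IHH", "XXY", "AAA", "GRR", "IHH", "FFG"],
    "dist_km": [112, 275, 150, 65, 50, 647, 336],
})

result = (
    df.sort_values("dist_km")      # smallest distances first
      .drop_duplicates("code1")    # keep the first (smallest) row per code1
      .query("grp1 == grp2")       # drop rows whose groups differ
      .sort_index()                # optional: restore the original row order
)

print(result)
```

Note that M003 drops out entirely: its closest row (index 4) has different groups, so it is removed by the query step even though row 5 has matching groups. Also, if two rows for the same code1 tie on dist_km, only one of them is kept.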