I'm trying to drop the row at a certain index in every group of a GroupBy object.
The best I have been able to manage is:
import pandas as pd
x_train = x_train.groupby('ID')
x_train.apply(lambda x: x.drop([0], axis=0))
However, this doesn't work. I have spent a whole day on this without finding a solution, so I have turned to Stack Overflow.
Edit: A solution for any index value is needed as well.
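You can do it with cumcount, which numbers the rows of each group starting from 0. Compare against a different number to drop a different position:
idx = x_train.groupby('ID').cumcount()
x_train = x_train[idx != 0]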
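Since the edit asks for any index value, here is a minimal sketch of the same cumcount idea with the position as a parameter. The sample frame and the helper name drop_nth_in_group are made up for illustration, because the real x_train is not shown in the question.

```python
import pandas as pd

# Hypothetical sample data standing in for x_train.
x_train = pd.DataFrame({
    "ID": ["a", "a", "a", "b", "b", "c"],
    "value": [10, 11, 12, 20, 21, 30],
})

def drop_nth_in_group(df, key, n):
    """Return df without the row at 0-based position n of each group."""
    # cumcount numbers the rows of every group 0, 1, 2, ... in original order.
    pos = df.groupby(key).cumcount()
    # Keep every row whose within-group position is not n.
    return df[pos != n]

# Drop the second row (position 1) of each group.
result = drop_nth_in_group(x_train, "ID", 1)
print(result)
```

Groups with fewer than n + 1 rows are left untouched, because none of their rows has position n.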
The problem with using drop inside the groupby is that the index labels are still the same as before the groupby. So drop([0]) only matches the row whose original index label is 0. As long as the index is unique, no other group contains a row labelled 0, and drop raises a KeyError for those groups unless errors='ignore' is passed.
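To see this, you can print the index labels that each group carries. This is only an illustrative sketch on a made-up frame (the names df and labels are hypothetical):

```python
import pandas as pd

# Hypothetical sample data with the default 0..n-1 index.
df = pd.DataFrame({
    "ID": ["a", "a", "a", "b", "b", "c"],
    "value": [10, 11, 12, 20, 21, 30],
})

# Collect the original index labels that end up in each group.
labels = {key: group.index.tolist() for key, group in df.groupby("ID")}

for key, idx in labels.items():
    print(key, idx)
```

Only group "a" contains the label 0, so drop([0]) has nothing to match in the other groups.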
If you want to use drop, you can first call reset_index inside each group:
x_train.groupby('ID').apply(lambda x: x.reset_index().drop([0]))
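A variant of the same idea, sketched on made-up data: it iterates over the groups and concatenates the pieces instead of calling apply, because how apply treats the grouping column may differ between pandas versions. Passing drop=True to reset_index is my addition, so the old index is not kept as an extra column.

```python
import pandas as pd

# Hypothetical sample data standing in for x_train.
x_train = pd.DataFrame({
    "ID": ["a", "a", "a", "b", "b"],
    "value": [10, 11, 12, 20, 21],
})

pieces = []
for _, group in x_train.groupby("ID"):
    # Renumber the group 0, 1, 2, ... so that label 0 is its first row.
    renumbered = group.reset_index(drop=True)
    pieces.append(renumbered.drop([0]))

# Stitch the groups back together with a fresh index.
result = pd.concat(pieces, ignore_index=True)
print(result)
```

Note that this discards the original index labels, whereas the cumcount approach keeps them.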