I have a pandas DataFrame. For each row, I want to check whether it has the same value as the previous row in a particular column (let's call it product_type), and if it does, delete it. In other words, out of a group of consecutive rows with the same value in that column, I want to keep only one.
For example, if column A is the one in which we don't want consecutive duplicates:
input =
A   B    C
0   1    1
0   2    2
2   1   10
2   2   20
0  11  100
5   2  200
output =
A   B    C
0   1    1
2   1   10
0  11  100
5   2  200
It's a little tricky, but you could do something like
>>> df.groupby((df["A"] != df["A"].shift()).cumsum().values).first()
   A   B    C
1  0   1    1
2  2   1   10
3  0  11  100
4  5   2  200
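To see why this works, it may help to break the one-liner into steps. `df["A"] != df["A"].shift()` is True on every row where A differs from the row above it, and `cumsum()` turns those flags into a run number, so each block of consecutive equal values gets its own group label. Here is a minimal, self-contained sketch using the example data; the variable names (`starts_new_run`, `run_id`, `by_group`, `by_mask`) are just for illustration, and the boolean-mask variant at the end is an alternative I'm adding, not part of the one-liner above.

```python
import pandas as pd

# Example data from the question (columns A, B, C)
df = pd.DataFrame({
    "A": [0, 0, 2, 2, 0, 5],
    "B": [1, 2, 1, 2, 11, 2],
    "C": [1, 2, 10, 20, 100, 200],
})

# True where A differs from the previous row's A.
# The first row compares against NaN, so it is always True.
starts_new_run = df["A"] != df["A"].shift()

# Running count of run starts: one label per block of consecutive equal A values
run_id = starts_new_run.cumsum()

# Same idea as the one-liner: first entry of each run, indexed by run number
by_group = df.groupby(run_id.values).first()

# Alternative: keep only the rows that start a run; original index is preserved
by_mask = df[starts_new_run]

print(by_group)
print(by_mask)
```

The two results hold the same rows for this data and differ only in the index: `by_group` is indexed by run number, while `by_mask` keeps the original row labels. One thing to be aware of is that `first()` picks the first non-null value per column within each group, so if your data has missing values the mask version may be the safer choice, since it keeps whole rows untouched.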