
python pandas: remove duplicate rows in each separate section

I have a dataframe that looks like this:

A B
a T
b T
c F
d F
e F
f T
g T

I want to keep only the last row of each section, where a section is a run of consecutive rows with the same value in B.

Should turn into this:

A B
b T
e F    
g T
rambo asked Mar 05 '23 01:03


1 Answer

Use:

df[df.B.ne(df.B.shift(-1))]

   A  B
1  b  T
4  e  F
6  g  T
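
To reproduce this, here is a minimal self-contained sketch. It assumes the frame from the question is built by hand with a default integer index; the names mask and result are only illustrative and do not come from the question.

```python
import pandas as pd

# Rebuild the example frame from the question (default 0..6 index assumed).
df = pd.DataFrame({
    "A": list("abcdefg"),
    "B": ["T", "T", "F", "F", "F", "T", "T"],
})

# True where a row's B differs from the B of the row after it,
# i.e. on the last row of each run of equal values.
mask = df.B.ne(df.B.shift(-1))

# Boolean indexing keeps only the rows where the mask is True.
result = df[mask]
print(result)
```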

Details: calling shift() with periods=-1 moves the column up by one row, so each row lines up with the value from the row that follows it. For example:

print(df.B.shift(-1)) 

0      T
1      F
2      F
3      F
4      T
5      T
6    NaN

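Comparing that shifted column with the current row using ne (not equal) marks every row whose value differs from the next one, i.e. the last row of each section. The final row is compared with NaN, which is never equal to anything, so it is always marked True: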

df.B.ne(df.B.shift(-1))
0    False
1     True
2    False
3    False
4     True
5    False
6     True

Now we have a boolean mask. Indexing the dataframe with it, as in df[mask], selects exactly the rows where the mask is True.
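
As a cross-check, the same rows can also be picked by labelling each run explicitly and taking the last row per label. This is a different technique from the one-liner above, sketched under the assumption that the frame matches the one in the question; run_id and last_rows are illustrative names.

```python
import pandas as pd

# Same example frame as in the question.
df = pd.DataFrame({
    "A": list("abcdefg"),
    "B": ["T", "T", "F", "F", "F", "T", "T"],
})

# A new run starts wherever B differs from the previous row;
# the cumulative sum of those start flags numbers the runs 1, 2, 3, ...
run_id = df["B"].ne(df["B"].shift()).cumsum().rename("run_id")

# tail(1) keeps the last row of each run, in the original row order.
last_rows = df.groupby(run_id).tail(1)
print(last_rows)
```

This form may be handy if you later need the first row of each section (head(1)) or a per-section aggregate; for just the last row, the shift comparison is shorter.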

anky answered Mar 15 '23 02:03