I have the following data set:
data_input:
      A     B
1  C13D  C07H
2  C07H  C13D
3  B42C  B65H
4  B65H  B42C
5  A45B  A47C
Rows 1 and 2 in data_input hold the same pair of values in swapped order (as do rows 3 and 4). I want to keep only one row per pair, so row 2 should be dropped.
The desired output is:
data_output:
      A     B
1  C13D  C07H
2  B42C  B65H
3  A45B  A47C
You can create a third column 'C' from 'A' and 'B' that is the same regardless of the order of the two values, and use it to find duplicates:
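# Concatenate the two values, then sort the characters so 'A'+'B' and 'B'+'A' give the same key.
df['C'] = df['A'] + df['B']
df['C'] = df['C'].apply(lambda x: ''.join(sorted(x)))
# Caveat: different pairs made of the same characters (e.g. 'AB','CD' and 'AC','BD') also collide.
# Keep the first row per key and drop the helper column.
df = df.drop_duplicates(subset='C')[['A', 'B']]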
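A variant of the same idea, as a minimal sketch rather than the answer's own code: instead of sorting the characters of the concatenated string, it sorts the two values themselves and uses the resulting tuple as the key, so only rows with the same pair of values are treated as duplicates. The names `key` and `result` are illustrative, the sample frame is rebuilt from the question, and the final renumbering step is an assumption made to match the index shown in data_output.

```python
import pandas as pd

# Sample data copied from the question; the index starts at 1 as shown there.
df = pd.DataFrame(
    {
        "A": ["C13D", "C07H", "B42C", "B65H", "A45B"],
        "B": ["C07H", "C13D", "B65H", "B42C", "A47C"],
    },
    index=range(1, 6),
)

# Order-independent key per row: the two values sorted and stored as a tuple.
key = pd.Series(
    [tuple(sorted(pair)) for pair in zip(df["A"], df["B"])],
    index=df.index,
)

# duplicated() flags every repeat after the first occurrence of a key.
result = df[~key.duplicated()].copy()

# Optional (assumption): renumber rows 1..n to match the desired output.
result.index = range(1, len(result) + 1)

print(result)
```

Because the key is built outside the frame, no helper column has to be added and removed afterwards.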