How to remove pair duplication in pandas? [duplicate]

Question

I have a dataset that contains duplicated pairs. Here's my data:

Id    antecedent           descendant
1     one                  two
2     two                  one
3     two                  three
4     one                  three
5     three                two

Here's what I need: because (one, two) is equal to (two, one), I want to remove the duplicate pair.

Id    antecedent           descendant
1     one                  two
3     two                  three
4     one                  three

jezrael · Accepted Answer

Use numpy.sort to sort the values within each row, then use duplicated to build a boolean mask:

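import numpy as np
import pandas as pd

# Sort the two columns within each row so (two, one) becomes (one, two);
# reusing df.index keeps the mask aligned with df
df1 = pd.DataFrame(np.sort(df[['antecedent','descendant']], axis=1), index=df.index)

Or:

# slower solution
#df1 = df[['antecedent','descendant']].apply(frozenset, axis=1)

# Keep only the first occurrence of each unordered pair
df = df[~df1.duplicated()]
print(df)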
   Id antecedent descendant
0   1        one        two
2   3        two      three
3   4        one      three
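
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the numpy.sort approach. It assumes the frame is built exactly from the sample data in the question; the names cols, sorted_pairs, mask and result are only illustrative and are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Id": [1, 2, 3, 4, 5],
    "antecedent": ["one", "two", "two", "one", "three"],
    "descendant": ["two", "one", "three", "three", "two"],
})

cols = ["antecedent", "descendant"]

# Sort the two values within each row, so (two, one) and (one, two)
# end up as the same ordered pair
sorted_pairs = pd.DataFrame(
    np.sort(df[cols].to_numpy(), axis=1),
    index=df.index,  # keep the original index so the mask aligns with df
)

# duplicated() flags every repeat of a pair after its first occurrence
mask = sorted_pairs.duplicated()

# Keep the rows that are not flagged
result = df[~mask]
print(result)
```

Passing index=df.index matters only when df does not have the default 0..n-1 index; without it, the boolean mask may not align with df. If the last occurrence of each pair should be kept instead, duplicated(keep='last') can be used.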

Tags: python, pandas, dataframe, duplicates

Nabih Bawazir
