
Find rows having the same values in multiple columns (not all columns) in a Pandas DataFrame

Below is my Dataframe:

X1  X2  X3  X4  X5
A   B   C   10  BAM
A   A   A   12  BAM
B   B   B   10  BAM
A   B   B   60  BAM

I want the rows that have the same value in columns X1, X2 and X3. Here, the 2nd and 3rd rows have the same value in all three columns. My desired output is:

X1  X2  X3  X4  X5
A   A   A   12  BAM
B   B   B   10  BAM

I tried like below:

yourdf1=df[df.nunique(0)==0]
print(yourdf1)

But this gives me an error. Could anyone please help?

ssp asked Dec 23 '22 23:12


1 Answer

Select the columns to compare with a list, count the unique values in each row with DataFrame.nunique and axis=1, then keep the rows where that count is 1 using boolean indexing:

yourdf1 = df[df[['X1','X2','X3']].nunique(axis=1) == 1]
print(yourdf1)
  X1 X2 X3  X4   X5
1  A  A  A  12  BAM
2  B  B  B  10  BAM
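
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the table in the question; the names cols, unique_per_row and same_rows are only illustrative and do not appear in the original post.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "X1": ["A", "A", "B", "A"],
    "X2": ["B", "A", "B", "B"],
    "X3": ["C", "A", "B", "B"],
    "X4": [10, 12, 10, 60],
    "X5": ["BAM"] * 4,
})

cols = ["X1", "X2", "X3"]  # illustrative name: the columns to compare

# Count distinct values per row (axis=1) across the chosen columns only.
unique_per_row = df[cols].nunique(axis=1)

# A count of 1 means every chosen column holds the same value in that row.
same_rows = df[unique_per_row == 1]
print(same_rows)
```

The attempt in the question, df.nunique(0), counts unique values per column instead, so the result is indexed by column names and cannot serve as a row mask. Switching to axis=1 and comparing against 1 rather than 0 gives one boolean per row.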

Another solution is to compare every selected column against the first one with DataFrame.eq, then use DataFrame.all to keep the rows where every comparison is True:

df1 = df[['X1','X2','X3']]
yourdf1 = df[df1.eq(df1.iloc[:, 0], axis=0).all(axis=1)]
print(yourdf1)

  X1 X2 X3  X4   X5
1  A  A  A  12  BAM
2  B  B  B  10  BAM
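
The sketch below spells out the intermediate boolean mask of the eq approach and shows one place where the two solutions can disagree. The data is rebuilt from the question; the small with_gap frame containing a missing value is a made-up example for illustration, not part of the original post.

```python
import numpy as np
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "X1": ["A", "A", "B", "A"],
    "X2": ["B", "A", "B", "B"],
    "X3": ["C", "A", "B", "B"],
    "X4": [10, 12, 10, 60],
    "X5": ["BAM"] * 4,
})

df1 = df[["X1", "X2", "X3"]]

# Compare each column with the first column, row by row (axis=0 aligns on the index),
# then require every comparison in a row to be True.
mask = df1.eq(df1.iloc[:, 0], axis=0).all(axis=1)
same_rows = df[mask]
print(same_rows)

# Hypothetical row with a missing value, to contrast the two approaches.
with_gap = pd.DataFrame({"X1": ["A"], "X2": [np.nan], "X3": ["A"]})

# nunique skips NaN by default (dropna=True), so this row counts as "all the same".
by_nunique = with_gap.nunique(axis=1) == 1

# eq treats NaN as unequal to "A", so this row is rejected.
by_eq = with_gap.eq(with_gap.iloc[:, 0], axis=0).all(axis=1)
print(by_nunique.tolist(), by_eq.tolist())
```

If the compared columns can contain missing values, it is worth deciding which behavior is wanted; passing dropna=False to nunique should make the first approach count NaN as a distinct value as well.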

jezrael answered May 13 '23 14:05