
How to compare four columns of pandas dataframe at a time?

I have the following dataframe:

  Symbol1  BB Symbol2  CC
0     ABC   1     ABC   1
1     PQR   1     PQR   1
2     CPC   2     CPC   0
3     CPC   2     CPC   1
4     CPC   2     CPC   2

I want to compare Symbol1 with Symbol2 and BB with CC. I want to keep only the rows where both pairs match; all other rows should be removed from the dataframe.

Expected result:

  Symbol1  BB Symbol2  CC
0     ABC   1     ABC   1
1     PQR   1     PQR   1
2     CPC   2     CPC   2

When I compare a single column against a value, I use:

df = df[df['BB'] == '2'].copy()

That works fine. But when I try to compare two pairs of columns like this:

df = df[df['BB'] == df['offset'] and df['Symbol1'] == df['Symbol2']].copy()

This raises an error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

How can I compare the columns and get the expected result?

Asked by ketan on Dec 25 '22 at 01:12


1 Answer

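You can use boolean indexing and combine the comparisons with & instead of and. Python's and tries to reduce each Series to a single True/False, which is what raises the ValueError, while & compares element by element. Wrap each comparison in parentheses, because & binds more tightly than ==: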

print((df.Symbol1 == df.Symbol2) & (df.BB == df.CC))
0     True
1     True
2    False
3    False
4     True
dtype: bool
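
To see the difference between and and & side by side, here is a minimal self-contained sketch. The frame is rebuilt by hand from the question's sample data, assuming BB and CC hold integers; the names and_error and mask are illustrative only.

```python
import pandas as pd

# Sample data rebuilt from the question (assumption: BB and CC are integers).
df = pd.DataFrame({
    "Symbol1": ["ABC", "PQR", "CPC", "CPC", "CPC"],
    "BB": [1, 1, 2, 2, 2],
    "Symbol2": ["ABC", "PQR", "CPC", "CPC", "CPC"],
    "CC": [1, 1, 0, 1, 2],
})

# `and` asks for a single True/False from a whole Series, which pandas refuses.
try:
    (df["Symbol1"] == df["Symbol2"]) and (df["BB"] == df["CC"])
    and_error = None
except ValueError as exc:
    and_error = exc

# `&` combines the two boolean Series element by element.
# The parentheses are required because & binds more tightly than ==.
mask = (df["Symbol1"] == df["Symbol2"]) & (df["BB"] == df["CC"])

print(type(and_error).__name__)
print(mask.tolist())
```

The same pattern extends to more column pairs by chaining further parenthesised comparisons with &.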

print(df[(df.Symbol1 == df.Symbol2) & (df.BB == df.CC)])
  Symbol1  BB Symbol2  CC
0     ABC   1     ABC   1
1     PQR   1     PQR   1
4     CPC   2     CPC   2
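
Note that the filtered frame keeps the original row labels (0, 1, 4), while the expected result in the question shows 0, 1, 2. If the labels should be renumbered, reset_index(drop=True) is one way to do it. A minimal sketch, with the sample data rebuilt by hand and assuming integer BB and CC columns; the name result is illustrative:

```python
import pandas as pd

# Sample data rebuilt from the question (assumption: BB and CC are integers).
df = pd.DataFrame({
    "Symbol1": ["ABC", "PQR", "CPC", "CPC", "CPC"],
    "BB": [1, 1, 2, 2, 2],
    "Symbol2": ["ABC", "PQR", "CPC", "CPC", "CPC"],
    "CC": [1, 1, 0, 1, 2],
})

# Keep only rows where both column pairs match.
mask = (df["Symbol1"] == df["Symbol2"]) & (df["BB"] == df["CC"])

# drop=True discards the old labels instead of adding them as a new column.
result = df[mask].reset_index(drop=True)
print(result)
```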

Answered by jezrael on Dec 26 '22 at 14:12