 

Compare the values of two columns in a DataFrame

I have a CSV file loaded as a DataFrame like the one below. I'd like to compare the values in two columns and generate a third column that is True where the values are the same and False where they differ. How can I do this with pandas in Python?

one two
1   a
2   b
3   a
4   b
5   5
6   6
7   7
8   8
9   9
10  10
Kun OuYang Avatar asked Jun 20 '26 10:06



1 Answer

If the values in column two are mixed (strings and ints), a direct comparison is enough:

df['three'] = df.one == df.two
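
As a minimal sketch of the mixed case, assuming column two really holds Python ints alongside strings (the frame `df_mixed` below is built by hand for illustration and is not the asker's file):

```python
import pandas as pd

# Hypothetical frame: column "two" has object dtype and holds both str and int values.
df_mixed = pd.DataFrame({
    "one": [1, 2, 3, 4, 5, 6],
    "two": ["a", "b", "a", "b", 5, 6],
})

# Element-wise comparison; an int compared with a str is simply not equal.
df_mixed["three"] = df_mixed["one"] == df_mixed["two"]
print(df_mixed)
```

Here the last two rows should compare as equal because both sides are genuine integers; no conversion step is involved.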

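But if the values are not mixed, you need to_numeric. This is the case when the dtype of the first column is int and the second is object, which here means every value in it is a string. to_numeric with the parameter errors='coerce' returns NaN for non-numeric values; column one contains no NaN values, and NaN does not compare equal to anything, so those rows come out as False: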

print(pd.to_numeric(df.two, errors='coerce'))
0     NaN
1     NaN
2     NaN
3     NaN
4     5.0
5     6.0
6     7.0
7     8.0
8     9.0
9    10.0
Name: two, dtype: float64

df['three'] = df.one == pd.to_numeric(df.two, errors='coerce')
print(df)
   one two  three
0    1   a  False
1    2   b  False
2    3   a  False
3    4   b  False
4    5   5   True
5    6   6   True
6    7   7   True
7    8   8   True
8    9   9   True
9   10  10   True
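
For a reproducible version of the steps above, here is a sketch that rebuilds the question's data from inline text. The whitespace separator and the explicit `dtype={"two": object}` are assumptions made so that the dtypes match the description (int for one, object for two); the asker's real file may differ.

```python
import io

import pandas as pd

# Sample data mirroring the question (assumed to be whitespace-separated).
csv_text = """one two
1 a
2 b
3 a
4 b
5 5
6 6
7 7
8 8
9 9
10 10
"""

# Keep column "two" as object so every value stays a Python string.
df = pd.read_csv(io.StringIO(csv_text), sep=r"\s+", dtype={"two": object})

# Comparing ints with strings directly: no row matches, not even 5 vs "5".
naive = df["one"] == df["two"]

# Convert the strings first; non-numeric entries become NaN and compare as False.
df["three"] = df["one"] == pd.to_numeric(df["two"], errors="coerce")
print(df)
```

The `naive` series is included only to show why the conversion is needed when the second column was read as text.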

An alternative with Series.eq, which may be faster:

df['three'] = df.one.eq(pd.to_numeric(df.two, errors='coerce'))
print(df)
   one two  three
0    1   a  False
1    2   b  False
2    3   a  False
3    4   b  False
4    5   5   True
5    6   6   True
6    7   7   True
7    8   8   True
8    9   9   True
9   10  10   True
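
`Series.eq` is the method form of `==`, so both spellings should give the same result. The following sketch checks that on hand-built, hypothetical data, and also probes the NaN edge case by deliberately placing a NaN in the first series:

```python
import numpy as np
import pandas as pd

# Hypothetical data: "one" deliberately contains a NaN to probe the edge case.
one = pd.Series([1.0, 2.0, np.nan, 4.0])
two = pd.Series(["1", "x", "y", "4"], dtype=object)

# "x" and "y" are not numeric, so they are coerced to NaN.
converted = pd.to_numeric(two, errors="coerce")

via_operator = one == converted
via_method = one.eq(converted)

# Both spellings agree element by element.
print(via_operator.equals(via_method))

# NaN never equals NaN in an element-wise comparison, so that row is False.
print(via_method.tolist())
```

Because NaN compares unequal even to another NaN, a coerced non-numeric value cannot produce a spurious True.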
jezrael Avatar answered Jun 23 '26 00:06
