Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to compare two columns of the same dataframe?

I have a dataframe like this:

 match_id inn1  bat  bowl  runs1 inn2   runs2   is_score_chased
    1     1     KKR  RCB    222  2      82          1
    2     1     CSK  KXIP   240  2      207         1
    8     1     CSK  MI     208  2      202         1
    9     1     DC   RR     214  2      217         1
   33     1     KKR  DC     204  2      181         1

Now i want to change the values in is_score_chased column by comparing the values in runs1 and runs2 . If runs1>runs2, then the corresponding value in the row should be 'yes' else it should be no. I tried the following code:

for i in (high_scores1):
  if(high_scores1['runs1']>=high_scores1['runs2']):
      high_scores1['is_score_chased']='yes'
  else:
      high_scores1['is_score_chased']='no' 

But it didn't work. How do i change the values in the column?

like image 902
user517696 Avatar asked Feb 23 '17 01:02

user517696


People also ask

How do you know if two columns match pandas?

Method. To find the positions of two matching columns, we first initialize a pandas dataframe with two columns of city names. Then we use where() of numpy to compare the values of two columns. This returns an array that represents the indices where the two columns have the same value.

How do you compare values in two data frames?

The compare method in pandas shows the differences between two DataFrames. It compares two data frames, row-wise and column-wise, and presents the differences side by side. The compare method can only compare DataFrames of the same shape, with exact dimensions and identical row and column labels.

How do you correlate two columns in pandas?

Initialize two variables, col1 and col2, and assign them the columns that you want to find the correlation of. Find the correlation between col1 and col2 by using df[col1]. corr(df[col2]) and save the correlation value in a variable, corr. Print the correlation value, corr.


1 Answers

You can more easily use np.where.

high_scores1['is_score_chased'] = np.where(high_scores1['runs1']>=high_scores1['runs2'], 
                                           'yes', 'no')

Typically, if you find yourself trying to iterate explicitly as you were to set a column, there is an abstraction like apply or where which will be both faster and more concise.

like image 84
miradulo Avatar answered Oct 13 '22 17:10

miradulo