
Find the row which has the maximum difference between two columns

Tags: python, pandas

I have a DataFrame with columns Gold and Gold.1. I want to find the row where the difference of these two columns is the maximum.

For the following DataFrame, the result should be row 6.

df
Out: 
   Gold  Gold.1
0     2       1
1     1       4
2     6       9
3     4       4
4     4       8
5     5       5
6     5       2 ---> The difference is maximum (3)
7     5       9
8     5       3
9     5       6

I tried using the following:

df.where(max(df['Gold']-df['Gold.1']))

However, that raised a ValueError:

df.where(max(df['Gold']-df['Gold.1']))
Traceback (most recent call last):

  File "", line 1, in 
    df.where(max(df['Gold']-df['Gold.1']))

  File "../python3.5/site-packages/pandas/core/generic.py", line 5195, in where
    raise_on_error)

  File "../python3.5/site-packages/pandas/core/generic.py", line 4936, in _where
    raise ValueError('Array conditional must be same shape as '

ValueError: Array conditional must be same shape as self

How can I find the row that satisfies this condition?


1 Answer

Instead of .where, you can use .idxmax:

(df['Gold'] - df['Gold.1']).idxmax()
Out: 6

This returns the index label of the row where the difference is largest. If several rows tie for the maximum, it returns the first of them.
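
If you need the row's values rather than just its label, you can pass the result to .loc. Here is a minimal sketch that rebuilds the question's DataFrame by hand; the names diff, best_label and best_row are only illustrative and are not part of the original post:

```python
import pandas as pd

# The DataFrame from the question, rebuilt by hand.
df = pd.DataFrame({
    "Gold":   [2, 1, 6, 4, 4, 5, 5, 5, 5, 5],
    "Gold.1": [1, 4, 9, 4, 8, 5, 2, 9, 3, 6],
})

# Row-wise difference between the two columns (a Series).
diff = df["Gold"] - df["Gold.1"]

# Index label of the first row holding the largest difference.
best_label = diff.idxmax()

# Look the full row up by that label.
best_row = df.loc[best_label]
print(best_label)
print(best_row)
```

.loc is used here because idxmax returns an index label, not a position, so this should also work when the index is not the default 0..n-1 range.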

If you want to find the row with the maximum absolute difference, then you can call .abs() first.

(df['Gold'] - df['Gold.1']).abs().idxmax()
Out: 4
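
Note that in this data rows 4 and 7 both have an absolute difference of 4, and idxmax reports only the first. If you want every tied row, one option is a boolean mask against the maximum. This is a sketch under the same assumptions as the question's data; abs_diff and tied_rows are illustrative names:

```python
import pandas as pd

# The DataFrame from the question, rebuilt by hand.
df = pd.DataFrame({
    "Gold":   [2, 1, 6, 4, 4, 5, 5, 5, 5, 5],
    "Gold.1": [1, 4, 9, 4, 8, 5, 2, 9, 3, 6],
})

# Absolute row-wise difference between the two columns.
abs_diff = (df["Gold"] - df["Gold.1"]).abs()

# Keep every row whose absolute difference equals the maximum,
# instead of only the first one that idxmax would report.
tied_rows = df[abs_diff == abs_diff.max()]
print(tied_rows)
```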
