
How do I obtain the second highest value in a row?

Tags: python, pandas

I want to obtain the second highest value within a certain slice of columns for each row of a DataFrame. How do I do this?

I have tried the following code but it doesn't work:

df.iloc[:, 5:-3].nlargest(2)(axis=1, level=2)

Is there any other way to obtain this?

Gautham Kanthasamy asked Feb 06 '18 07:02


2 Answers

Using apply with axis=1, you can find the second largest value in each row by taking the two largest values and then selecting the last of them:

df.iloc[:, 5:-3].apply(lambda row: row.nlargest(2).values[-1],axis=1)

Example

The code below finds the second largest value in each row of df.

In [1]: import pandas as pd

In [2]: import numpy as np

In [3]: df = pd.DataFrame({'Col{}'.format(i):np.random.randint(0,100,5) for i in range(5)})

In [4]: df
Out[4]: 
   Col0  Col1  Col2  Col3  Col4
0    82    32    14    62    90
1    62    32    74    62    72
2    31    79    22    17     3
3    42    54    66    93    50
4    13    88     6    46    69

In [5]: df.apply(lambda row: row.nlargest(2).values[-1],axis=1)
Out[5]: 
0    82
1    72
2    31
3    66
4    69
dtype: int64
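
One thing to watch for with this approach is ties. As far as I know, nlargest keeps duplicates, so a row whose maximum appears twice reports that maximum as its "second largest" value. The sketch below is not part of the original answer; it uses made-up data and assumes you might want the second highest distinct value instead, which drop_duplicates can provide:

```python
import pandas as pd

# Hypothetical data: row 0 contains its maximum (90) twice.
df = pd.DataFrame({"Col0": [90, 62], "Col1": [90, 32], "Col2": [14, 74]})

# nlargest(2) keeps duplicates, so a tied maximum comes back as the second value.
second_with_ties = df.apply(lambda row: row.nlargest(2).values[-1], axis=1)

# Dropping duplicates first gives the second highest *distinct* value.
second_distinct = df.apply(
    lambda row: row.drop_duplicates().nlargest(2).values[-1], axis=1
)

print(second_with_ties.tolist())  # [90, 62]
print(second_distinct.tolist())   # [14, 62]
```

Which of the two results is "right" depends on what the second highest value should mean for your data.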
sgDysregulation answered Sep 22 '22 06:09


I think you need to sort each row and then select the second-to-last value:

a = np.sort(df.iloc[:, 5:-3], axis=1)[:, -2]

Sample:

import numpy as np
import pandas as pd

np.random.seed(100)
df = pd.DataFrame(np.random.randint(10, size=(10,10)))
print (df)
   0  1  2  3  4  5  6  7  8  9
0  8  8  3  7  7  0  4  2  5  2
1  2  2  1  0  8  4  0  9  6  2
2  4  1  5  3  4  4  3  7  1  1
3  7  7  0  2  9  9  3  2  5  8
4  1  0  7  6  2  0  8  2  5  1
5  8  1  5  4  2  8  3  5  0  9
6  3  6  3  4  7  6  3  9  0  4
7  4  5  7  6  6  2  4  2  7  1
8  6  6  0  7  2  3  5  4  2  4
9  3  7  9  0  0  5  9  6  6  5

print (df.iloc[:, 5:-3])
   5  6
0  0  4
1  4  0
2  4  3
3  9  3
4  0  8
5  8  3
6  6  3
7  2  4
8  3  5
9  5  9

a = np.sort(df.iloc[:, 5:-3], axis=1)[:, -2]
print (a)
[0 0 3 3 0 3 3 2 3 5]
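
If the slice is wide, a full sort does more work than needed. As a possible alternative (a sketch, not something from the original answer), np.partition only guarantees that the element at the requested sorted position is in place. The column range 2:7 below is a made-up, wider slice used purely for illustration:

```python
import numpy as np
import pandas as pd

np.random.seed(100)
df = pd.DataFrame(np.random.randint(10, size=(10, 10)))

# Hypothetical wider slice (columns 2..6), chosen only for this illustration.
values = df.iloc[:, 2:7].to_numpy()

# np.partition places the element that belongs at sorted position -2
# without fully sorting the rest of each row.
second_partition = np.partition(values, -2, axis=1)[:, -2]

# Full sort for comparison; both approaches should agree.
second_sorted = np.sort(values, axis=1)[:, -2]

print(np.array_equal(second_partition, second_sorted))  # True
```

For a slice with only a handful of columns the difference is unlikely to matter, so np.sort is probably the more readable choice there.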

If you need both values, sorted in ascending order within each row:

a = df.iloc[:, 5:-3].values
b = pd.DataFrame(a[np.arange(len(a))[:, None], np.argsort(a, axis=1)])
print (b)
   0  1
0  0  4
1  0  4
2  3  4
3  3  9
4  0  8
5  3  8
6  3  6
7  2  4
8  3  5
9  5  9
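
The same argsort idea can also tell you which column holds the second highest value. This is a minimal sketch with a made-up frame (the column names a, b, c are hypothetical), assuming the values within each row are distinct:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with distinct values in every row.
df = pd.DataFrame({"a": [1, 9, 4], "b": [7, 2, 8], "c": [5, 6, 3]})
values = df.to_numpy()

# Positions that would sort each row ascending; index -2 is the second highest.
order = np.argsort(values, axis=1)
second_pos = order[:, -2]

# Pick the value and the column label at that position for every row.
second_val = values[np.arange(len(values)), second_pos]
second_col = df.columns[second_pos]

print(second_val.tolist())  # [5, 6, 4]
print(second_col.tolist())  # ['c', 'c', 'a']
```

With tied values the position picked among the ties depends on the sort order, so treat the column label as arbitrary in that case.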
jezrael answered Sep 23 '22 06:09