I want to get the second-highest value within a certain range of columns for each row of a DataFrame. How do I do this?
I have tried the following code but it doesn't work:
df.iloc[:, 5:-3].nlargest(2)(axis=1, level=2)
Is there any other way to obtain this?
In the first example, find the second-largest value in the column “Income”; in the second, find the second-largest value in “Cost”. First select the maximum of the column, then search for the maximum again while excluding the value already found; the result is the second-largest value.
The following formulas find the second-highest or second-smallest value in a range. Select a blank cell (F1, for instance), type the formula =LARGE(A1:D8,2), and press Enter to get the second-largest value of the range.
Using apply with axis=1, you can find the second-largest value in each row by taking the two largest values and keeping the last of them:
df.iloc[:, 5:-3].apply(lambda row: row.nlargest(2).values[-1],axis=1)
Example
The code below finds the second-largest value in each row of df.
In [1]: import pandas as pd
In [2]: import numpy as np
In [3]: df = pd.DataFrame({'Col{}'.format(i):np.random.randint(0,100,5) for i in range(5)})
In [4]: df
Out[4]:
   Col0  Col1  Col2  Col3  Col4
0    82    32    14    62    90
1    62    32    74    62    72
2    31    79    22    17     3
3    42    54    66    93    50
4    13    88     6    46    69
In [5]: df.apply(lambda row: row.nlargest(2).values[-1],axis=1)
Out[5]:
0 82
1 72
2 31
3 66
4 69
dtype: int64
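One caveat with the approach above: nlargest(2) keeps duplicates, so a row whose maximum occurs twice returns that maximum again. If you want the second-largest distinct value instead, something like the following may work. This is only a sketch; the helper name second_largest_distinct and the frame df_ties are made up for illustration, and returning NaN for rows with fewer than two distinct values is an assumption about what you want.

```python
import numpy as np
import pandas as pd

def second_largest_distinct(row):
    """Return the second-largest distinct value in a row, or NaN if there is none."""
    # Hypothetical helper: drop repeated values first, then take the top two.
    distinct = row.drop_duplicates().nlargest(2)
    return distinct.iloc[-1] if len(distinct) == 2 else np.nan

# Small illustrative frame (not from the question) with ties in the first two rows.
df_ties = pd.DataFrame({"a": [8, 5, 3], "b": [8, 5, 1], "c": [3, 5, 2]})

# Plain version from the answer: ties count as separate values.
plain = df_ties.apply(lambda row: row.nlargest(2).values[-1], axis=1)

# Distinct version: ties are collapsed before ranking.
distinct = df_ties.apply(second_largest_distinct, axis=1)

print(plain)
print(distinct)
```

Which behaviour is right depends on whether a repeated maximum should count as its own "second highest".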
I think you need to sort each row and then select the second-to-last value:
a = np.sort(df.iloc[:, 5:-3], axis=1)[:, -2]
Sample:
np.random.seed(100)
df = pd.DataFrame(np.random.randint(10, size=(10,10)))
print (df)
   0  1  2  3  4  5  6  7  8  9
0  8  8  3  7  7  0  4  2  5  2
1  2  2  1  0  8  4  0  9  6  2
2  4  1  5  3  4  4  3  7  1  1
3  7  7  0  2  9  9  3  2  5  8
4  1  0  7  6  2  0  8  2  5  1
5  8  1  5  4  2  8  3  5  0  9
6  3  6  3  4  7  6  3  9  0  4
7  4  5  7  6  6  2  4  2  7  1
8  6  6  0  7  2  3  5  4  2  4
9  3  7  9  0  0  5  9  6  6  5
print (df.iloc[:, 5:-3])
5 6
0 0 4
1 4 0
2 4 3
3 9 3
4 0 8
5 8 3
6 6 3
7 2 4
8 3 5
9 5 9
a = np.sort(df.iloc[:, 5:-3], axis=1)[:, -2]
print (a)
[0 0 3 3 0 3 3 2 3 5]
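If the slice is wide, a full sort does more work than needed. A possible alternative, sketched here rather than taken from the answer, is np.partition, which only guarantees that the element at the requested position is where it would be after sorting. The names by_sort and by_partition are illustrative.

```python
import numpy as np
import pandas as pd

np.random.seed(100)
df = pd.DataFrame(np.random.randint(10, size=(10, 10)))

# Same column slice as in the sample above.
values = df.iloc[:, 5:-3].to_numpy()

# Full sort per row, then take the second-to-last column.
by_sort = np.sort(values, axis=1)[:, -2]

# Partial sort: only position -2 is guaranteed to hold its sorted value.
by_partition = np.partition(values, -2, axis=1)[:, -2]

print(by_sort)
print(by_partition)
```

Both arrays should be identical; any speed difference would only matter for slices with many columns.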
If you need both values:
a = df.iloc[:, 5:-3].values
b = pd.DataFrame(a[np.arange(len(a))[:, None], np.argsort(a, axis=1)])
print (b)
0 1
0 0 4
1 0 4
2 3 4
3 3 9
4 0 8
5 3 8
6 3 6
7 2 4
8 3 5
9 5 9
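In the sample the slice has only two columns, so the sorted rows are already the two largest values. For a wider slice you would probably want to keep just the last two sorted columns and carry the original index along. A rough sketch, where df_wide and the column names second_largest and largest are invented for the example:

```python
import numpy as np
import pandas as pd

# Illustrative frame with four columns and a non-default index.
df_wide = pd.DataFrame(
    [[4, 9, 1, 7], [3, 3, 8, 2], [6, 0, 5, 5]],
    index=["x", "y", "z"],
)

a = df_wide.to_numpy()

# Sort each row ascending and keep only the last two columns.
top2 = pd.DataFrame(
    np.sort(a, axis=1)[:, -2:],
    index=df_wide.index,
    columns=["second_largest", "largest"],
)

print(top2)
```

Passing index=df_wide.index keeps the result aligned with the original rows, so it can be joined back onto the source frame if needed.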