Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Second and third largest values within multiple columns in Pandas

Tags:

python

pandas

How do I find second and third largest values from multiple columns? It's clear how to find max value, min and median, however I cannot extract the second and third largest values as new columns?

import pandas as pd

df = pd.read_csv(...)

df['max'] = df[["A1", "B1", "C1", "D1", "E1", "F1"]].max(axis=1)

df['min'] = df[["A1", "B1", "C1", "D1", "E1", "F1"]].min(axis=1)

# ? 
df['2nd_largest'] = df[["A1", "B1", "C1", "D1", "E1", "F1"]]
like image 549
g123456k Avatar asked Feb 26 '26 19:02

g123456k


2 Answers

To find the second largest values of each row, you can use nlargest; apply a function to each row:

df['2nd_largest'] = df[["A1", "B1", "C1", "D1", "E1", "F1"]].apply(lambda row: row.nlargest(2).iat[-1], axis=1)

If you values are in a list you can sort the values like:

df.sort()
 

and you can find the 2nd highest value like:

df[-2]

And alternative is a function I found on Get the second largest number in a list in linear time

def second_largest(numbers):
    count = 0
    m1 = m2 = float('-inf')
    for x in numbers:
        count += 1
        if x > m2:
            if x >= m1:
                m1, m2 = x, m1            
            else:
                m2 = x
    return m2 if count >= 2 else None
like image 24
Kylian Avatar answered Feb 28 '26 11:02

Kylian