How do I find the second and third largest values across multiple columns? It's clear how to find the max, min and median, but I cannot work out how to extract the second and third largest values as new columns.
import pandas as pd
df = pd.read_csv(...)
df['max'] = df[["A1", "B1", "C1", "D1", "E1", "F1"]].max(axis=1)
df['min'] = df[["A1", "B1", "C1", "D1", "E1", "F1"]].min(axis=1)
# ?
df['2nd_largest'] = df[["A1", "B1", "C1", "D1", "E1", "F1"]]
To find the second largest value in each row, you can apply a function that calls nlargest on each row:
df['2nd_largest'] = df[["A1", "B1", "C1", "D1", "E1", "F1"]].apply(lambda row: row.nlargest(2).iat[-1], axis=1)
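The same pattern covers the third largest value with row.nlargest(3).iat[-1]. If apply is too slow on a large frame, one possible alternative is to sort each row with NumPy and pick positions from the end. This is only a sketch: the sample data is made up, the `cols` and `ranked` names are mine, and it assumes the columns are numeric with no NaN values (np.sort places NaN last, so a NaN would be treated as the largest value).

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "A1": [1, 9], "B1": [5, 2], "C1": [3, 8],
    "D1": [4, 7], "E1": [6, 1], "F1": [2, 3],
})
cols = ["A1", "B1", "C1", "D1", "E1", "F1"]

# Sort each row in ascending order, then index from the right:
# column -1 is the max, -2 the second largest, -3 the third largest.
ranked = np.sort(df[cols].to_numpy(), axis=1)
df["2nd_largest"] = ranked[:, -2]
df["3rd_largest"] = ranked[:, -3]

print(df[["2nd_largest", "3rd_largest"]])
```

Like nlargest, this counts duplicates separately, so a row whose two highest values are equal reports that value as the second largest too.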
If your values are in a plain Python list (called values here, so it is not confused with the DataFrame), you can sort the list in place:
values.sort()
and then read the second highest value from the end:
values[-2]
An alternative is a function I found in "Get the second largest number in a list in linear time":
def second_largest(numbers):
    count = 0
    m1 = m2 = float('-inf')
    for x in numbers:
        count += 1
        if x > m2:
            if x >= m1:
                m1, m2 = x, m1
            else:
                m2 = x
    return m2 if count >= 2 else None
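Since the question also asks for the third largest value, here is a possible generalisation to the k-th largest using heapq.nlargest from the standard library. The `kth_largest` name and its None-on-too-few-items behaviour are my own choices, picked to mirror the function above. It is not the linked answer's code.

```python
import heapq

def kth_largest(numbers, k):
    """Return the k-th largest item, counting duplicates separately.

    Returns None when there are fewer than k items.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # nlargest returns up to k items in descending order.
    top = heapq.nlargest(k, numbers)
    return top[-1] if len(top) == k else None

print(kth_largest([4, 9, 1, 7], 2))  # second largest
print(kth_largest([4, 9, 1, 7], 3))  # third largest
```

It should also work row by row on the DataFrame from the question, for example df[["A1", "B1", "C1", "D1", "E1", "F1"]].apply(lambda row: kth_largest(row, 3), axis=1), although that is no faster than the nlargest approach.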