Pandas dataframe select entire rows with highest values from a specified column

Question

I have a dataframe and want to return the full rows that contain the largest values in a specified column. So let's say I create a dataframe like this:

df = pd.DataFrame(np.random.randint(0,100,size=(25, 4)), columns=list('ABCD'))

Then I'd have a table like this (sorry I can't get a proper table to form, so I just made a short one up):

A    B    C    D
14   67   35   22
75   21   34   64

And let's say it goes on for 25 rows like that. I want to take the top 5 largest values of column C and return those full rows.

If I do:

df['C'].nlargest()

it returns those 5 largest values, but I want it to return the full row.

I thought the below would work, but it gives me an error of "IndexError: indices are out-of-bounds":

df[df['C'].nlargest()]

I know this will be an easy solution for many people here, but it's stumped me. Thanks for your help.

MaxU - stop WAR against UA · Accepted Answer

You want to call nlargest on the DataFrame itself and use its columns parameter:

In [53]: df.nlargest(5, columns=['C'])
Out[53]:
     A   B   C   D
17  43  91  95  32
18  13  36  81  56
7   61  90  76  85
16  68  21  73  68
14   3  64  71  59
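As a rough illustration of why the attempt in the question fails: as far as I can tell, df[...] with a non-boolean Series tries to select columns labelled by those values (95, 81, ...) rather than rows. The sketch below shows the DataFrame-level call next to an equivalent lookup through the index of the Series result. The seed and the variable names are assumptions made only for this example.

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is reproducible (the seed is arbitrary).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 100, size=(25, 4)), columns=list("ABCD"))

# DataFrame.nlargest returns whole rows, ordered by column C descending.
top_rows = df.nlargest(5, columns="C")

# Series.nlargest keeps the original row labels in its index, so those
# labels can be handed to .loc to fetch the matching full rows.
top_via_index = df.loc[df["C"].nlargest(5).index]

print(top_rows)
print(top_via_index)
```

Both results carry the original row labels, so they can be used to look the rows up again in df later.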

BENY · Answer

Without nlargest, you can use sort_values:

df.sort_values('C', ascending=False).iloc[:5]

Or use head:

df.sort_values('C', ascending=False).head(5)

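Or use quantile (this keeps the rows in their original order, and it can return fewer than 5 rows when values are tied at the cutoff):

df[df.C > df.C.quantile(1 - 5 / len(df))]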
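To see how these approaches differ when values are tied, here is a small sketch on made-up data. The frame, the value of n, and the variable names are all assumptions for illustration; keep='all' is an existing option of nlargest that keeps every row tied with the last selected value.

```python
import pandas as pd

# Made-up data with a tie at 30, right at the cutoff for the top 3.
df = pd.DataFrame({"C": [10, 40, 30, 40, 20, 30], "D": list("abcdef")})
n = 3

# nlargest: exactly n rows, largest first (ties resolved by first occurrence).
by_nlargest = df.nlargest(n, "C")

# sort_values + head: exactly n rows, largest first.
by_sort = df.sort_values("C", ascending=False).head(n)

# quantile filter: strictly greater than the cutoff, original row order.
cutoff = df["C"].quantile(1 - n / len(df))
by_quantile = df[df["C"] > cutoff]

# keep='all': also includes every row tied with the n-th largest value.
with_ties = df.nlargest(n, "C", keep="all")

for name, result in [("nlargest", by_nlargest), ("sort_values", by_sort),
                     ("quantile", by_quantile), ("keep='all'", with_ties)]:
    print(name, len(result))
    print(result)
```

On this data the quantile filter drops both rows with C equal to 30 because the cutoff lands exactly on 30, while keep='all' returns both of them. Which behaviour is right depends on whether you need exactly n rows or every row that qualifies.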

Tags: python, pandas

Emac
