
Finding highest values in each row in a data frame for python

For each row of a pandas DataFrame, I'd like to find the highest values and return the column headers of those values. For example, I'd like to find the top two in each row:

df =  
       A    B    C    D  
       5    9    8    2  
       4    1    2    3  

I'd like my output to look like this:

df =        
       B    C  
       A    D
Milhouse asked Dec 29 '15 20:12


People also ask

How do you find the maximum value of each row in a DataFrame?

To find the maximum value of each row, call the max() method on the DataFrame with the argument axis=1.

How do you get max rows in Pandas?

Finding the max element of each DataFrame row also relies on the max() method, but with the axis argument set to 1. The default value of axis is 0, in which case max() finds the max element of each column.
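
A minimal sketch of the axis behaviour described above, reusing the numbers from the question (the variable names are only illustrative). It also shows idxmax(axis=1), which returns the column label of each row's maximum rather than the value:

```python
import pandas as pd

# Sample frame built from the values in the question.
df = pd.DataFrame({"A": [5, 4], "B": [9, 1], "C": [8, 2], "D": [2, 3]})

row_max = df.max(axis=1)           # largest value in each row
col_max = df.max()                 # axis=0 is the default: largest value in each column
row_max_label = df.idxmax(axis=1)  # column label of each row's largest value

print(row_max.tolist())        # [9, 4]
print(col_max.tolist())        # [5, 9, 8, 3]
print(row_max_label.tolist())  # ['B', 'A']
```

Note that idxmax only gives the single largest entry per row, so it does not cover the "top two" case from the question by itself.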

How will you find the top 5 records of a DataFrame in Python?

pandas.DataFrame.head(): in pandas, the DataFrame class provides a head() method that returns the first n rows of a DataFrame. If n is not provided, the default value is 5.
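
A short sketch of head() on a made-up single-column frame (the "score" column and its values are assumptions for illustration). Since head() returns rows by position rather than by value, the sketch contrasts it with DataFrame.nlargest(), which is probably what "top records" means when ranking by a column:

```python
import pandas as pd

# Hypothetical data, not taken from the question.
df = pd.DataFrame({"score": [3, 9, 1, 7, 5, 8, 2]})

first_five = df.head()            # first 5 rows by position (default n=5)
first_two = df.head(2)            # first 2 rows by position
top_two = df.nlargest(2, "score") # 2 rows with the largest "score" values

print(first_five["score"].tolist())  # [3, 9, 1, 7, 5]
print(first_two["score"].tolist())   # [3, 9]
print(top_two["score"].tolist())     # [9, 8]
```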


1 Answer

You can use a dictionary comprehension to collect the column labels of the top_n largest values in each row of the dataframe. Series.nlargest works on one Series at a time, so I transposed the dataframe (each original row becomes a column) and applied nlargest to each of those columns. I used .index.tolist() to extract the labels of the top_n columns. Finally, I transposed the result to get one row per original row.

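>>> import pandas as pd
>>> df = pd.DataFrame({'A': [5, 4], 'B': [9, 1], 'C': [8, 2], 'D': [2, 3]})
>>> top_n = 2
>>> pd.DataFrame({n: df.T[col].nlargest(top_n).index.tolist()
...               for n, col in enumerate(df.T)}).T
   0  1
0  B  C
1  A  D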
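
As a possible alternative that is not part of the original answer, the same result can be sketched without transposing by applying nlargest row by row with apply(axis=1). The helper name top_n_columns is hypothetical, and the sketch assumes every column is numeric:

```python
import pandas as pd

def top_n_columns(df, top_n):
    """Return, for each row, the labels of the top_n largest values (largest first)."""
    # Each row arrives as a Series indexed by column label; nlargest keeps
    # the labels, and wrapping them in a Series lets apply expand the
    # result into columns 0..top_n-1.
    return df.apply(lambda row: pd.Series(row.nlargest(top_n).index), axis=1)

df = pd.DataFrame({"A": [5, 4], "B": [9, 1], "C": [8, 2], "D": [2, 3]})
result = top_n_columns(df, 2)
print(result.values.tolist())  # [['B', 'C'], ['A', 'D']]
```

When values tie, nlargest keeps the first occurrence by default, so the earlier column wins.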
Alexander answered Sep 21 '22 20:09