Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Get Max value comparing multiple columns and return specific values

I have a Dataframe like:

Sequence    Duration1   Value1  Duration2   Value2  Duration3   Value3
1001        145         10      125         53      458         33
1002        475         20      175         54      652         45
1003        685         57      687         87      254         88
1004        125         54      175         96      786         96
1005        475         21      467         32      526         32
1006        325         68      301         54      529         41
1007        125         97      325         85      872         78
1008        129         15      429         41      981         82
1009        547         47      577         52      543         83
1010        666         65      722         63      257         87

I want to find the maximum value of Duration in (Duration1,Duration2,Duration3) and return the corresponding Value & Sequence.

My Desired Output:

Sequence,Duration3,Value3
1008,    981,      82
like image 221
Rohit Lamba K Avatar asked Sep 21 '19 09:09

Rohit Lamba K


People also ask

How can use MAX function in SQL with multiple columns?

If you want to understand VALUE try this query which creates a virtual 1 column table: SELECT * FROM (VALUES (1), (5), (1)) as listOfValues(columnName) And this query which creates a virtual 2 column table: SELECT * FROM (VALUES (1,2), (5,3), (1,4)) as tableOfValues(columnName1, ColumnName2) Now you can understand why ...

How do you find the max value across multiple columns in R?

Maximum value of a column in R can be calculated by using max() function. Max() Function takes column name as argument and calculates the maximum value of that column. Maximum of single column in R, Maximum of multiple columns in R using dplyr.


2 Answers

Try the following, quite short code, based mainly on Numpy:

vv = df.iloc[:, 1::2].values
iRow, iCol = np.unravel_index(vv.argmax(), vv.shape)
iCol = iCol * 2 + 1
result = df.iloc[iRow, [0, iCol, iCol + 1]]

The result is a Series:

Sequence     1008
Duration3     981
Value3         82
Name: 7, dtype: int64

If you want to "rehape" it (first index values, then actual values), you can get something like this executing:

pd.DataFrame([result.values], columns=result.index)
like image 93
Valdi_Bo Avatar answered Oct 23 '22 23:10

Valdi_Bo


With wide data it can be easier to first reshape with wide_to_long. This creates 2 columns ['Duration', 'Value'], and the MultiIndex tells us which number it was. There is no reliance on any specific column ordering.

import pandas as pd

df = pd.wide_to_long(df, i='Sequence', j='num', stubnames=['Duration', 'Value'])
df.loc[[df.Duration.idxmax()]]

              Duration  Value
Sequence num                 
1008     3         981     82
like image 34
ALollz Avatar answered Oct 24 '22 01:10

ALollz