I have the data frame below:
Name1 Scr1 Name2 Scr2 Name3 Scr3
NY 21 CA 45 SF 37
AZ 31 BK 46 AK 23
I am trying to get the maximum value of each row and corresponding name for each row:
df.idxmax(axis=1)
But how do I get the corresponding name as well?
Expected Output:
Name Hi_Scr
CA 45
BK 46
To create the new column 'Max', use df['Max'] = df.idxmax(axis=1). To find the row index at which the maximum value occurs in each column, use df.idxmax() (or equivalently df.
To find the maximum value of each row, call the max() method on the DataFrame object with the argument axis=1.
To find the maximum value of each column, call the max() method on the DataFrame object without any argument. It returns a Series of maximum values whose index holds the column names and whose values are the maxima of each column.
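As a rough illustration of those calls on the frame from the question (a minimal sketch; the variable names are my own, and the score columns are selected first because the frame mixes strings and numbers):

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({
    "Name1": ["NY", "AZ"], "Scr1": [21, 31],
    "Name2": ["CA", "BK"], "Scr2": [45, 46],
    "Name3": ["SF", "AK"], "Scr3": [37, 23],
})

# Restrict to the numeric score columns before reducing across each row.
scores = df.filter(like="Scr")

row_max = scores.max(axis=1)        # largest score in each row
row_argmax = scores.idxmax(axis=1)  # column label holding that score
col_max = scores.max()              # largest score in each column

print(row_max.tolist())     # [45, 46]
print(row_argmax.tolist())  # ['Scr2', 'Scr2']
print(col_max.to_dict())    # {'Scr1': 31, 'Scr2': 46, 'Scr3': 37}
```

Note that idxmax only gives the label of the score column ('Scr2'), not the matching name, which is the gap the answers below fill.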
I would do it with pd.wide_to_long, like this:
df['id'] = df.index
ndf = pd.wide_to_long(df, ["Name", "Scr"], i="id", j="number").reset_index(0).set_index('Name')
# id Scr
# Name
# NY 0 21
# AZ 1 31
# CA 0 45
# BK 1 46
# SF 0 37
# AK 1 23
# Thank you @jezrael for the improvement
ndf.groupby('id')['Scr'].agg(['max','idxmax']).rename(columns= {'max':'Hi_Scr','idxmax':'Name'})
Hi_Scr Name
id
0 45 CA
1 46 BK
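For anyone who wants to run the reshaping idea end to end, here is a self-contained sketch. It is a variation, not the exact code above: it keeps Name as an ordinary column and picks the best row per id with idxmax on a plain integer index. The names wide, long_df and result are my own.

```python
import pandas as pd

df = pd.DataFrame({
    "Name1": ["NY", "AZ"], "Scr1": [21, 31],
    "Name2": ["CA", "BK"], "Scr2": [45, 46],
    "Name3": ["SF", "AK"], "Scr3": [37, 23],
})

# wide_to_long needs a unique identifier per original row.
wide = df.assign(id=df.index)

# One row per (id, number) pair, with plain Name and Scr columns.
long_df = pd.wide_to_long(
    wide, stubnames=["Name", "Scr"], i="id", j="number"
).reset_index()

# Index label of the highest score within each original row.
best_rows = long_df.groupby("id")["Scr"].idxmax()

result = (
    long_df.loc[best_rows, ["Name", "Scr"]]
    .rename(columns={"Scr": "Hi_Scr"})
    .reset_index(drop=True)
)
print(result)
```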
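Use filter to select the Scr columns and convert them to a NumPy array with values, find the position of each row's maximum with argmax, pick the matching entries from the Name columns by indexing, and build the result with the DataFrame constructor:
import numpy as np
import pandas as pd
a = df.filter(like='Scr').values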
b = a.argmax(axis=1)
c = df.filter(like='Name').values[np.arange(len(df.index)), b]
d = a.max(axis=1)
df = pd.DataFrame({'Name':c, 'Hi_Scr':d}, columns=['Name','Hi_Scr'])
print (df)
Name Hi_Scr
0 CA 45
1 BK 46
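The same steps can be wrapped in a small function. This is only a sketch: top_name_and_score is a hypothetical helper name, and it assumes every NameN column has a ScrN partner in the same left-to-right order.

```python
import numpy as np
import pandas as pd


def top_name_and_score(frame, name_key="Name", score_key="Scr"):
    """Hypothetical helper: return the best-scoring name per row."""
    names = frame.filter(like=name_key)
    scores = frame.filter(like=score_key)
    # Assumption: name and score columns are paired position by position.
    if names.shape[1] != scores.shape[1]:
        raise ValueError("expected one score column per name column")
    score_arr = scores.to_numpy()
    rows = np.arange(len(frame))
    best = score_arr.argmax(axis=1)  # column position of each row's maximum
    return pd.DataFrame(
        {"Name": names.to_numpy()[rows, best], "Hi_Scr": score_arr[rows, best]},
        index=frame.index,
    )


sample = pd.DataFrame({
    "Name1": ["NY", "AZ"], "Scr1": [21, 31],
    "Name2": ["CA", "BK"], "Scr2": [45, 46],
    "Name3": ["SF", "AK"], "Scr3": [37, 23],
})
result = top_name_and_score(sample)
print(result)
```

Unlike the snippet above, this version leaves the input frame untouched and keeps its index on the result.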
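A pandas solution is very similar: create a MultiIndex in the columns with str.extract, select each group with xs, and fetch the matching values with lookup (DataFrame.lookup was deprecated in pandas 1.2 and removed in 2.0, so this version needs an older pandas):
a = df.columns.to_series().str.extract(r'(\D+)(\d+)', expand=False)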
df.columns = pd.MultiIndex.from_tuples(a.values.tolist())
a = df.xs('Scr', axis=1)
b = a.idxmax(axis=1)
c = df.xs('Name', axis=1).lookup(df.index, b)
d = a.max(axis=1)
df = pd.DataFrame({'Name':c, 'Hi_Scr':d}, columns=['Name','Hi_Scr'])
print (df)
Name Hi_Scr
0 CA 45
1 BK 46
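On a pandas version without DataFrame.lookup, one possible substitute is to translate the idxmax labels into column positions with Index.get_indexer and index the underlying array. A sketch under that assumption (the level names "field" and "number" and the other variable names are my own):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Name1": ["NY", "AZ"], "Scr1": [21, 31],
    "Name2": ["CA", "BK"], "Scr2": [45, 46],
    "Name3": ["SF", "AK"], "Scr3": [37, 23],
})

# Split 'Name1' into ('Name', '1') and so on, then use the parts as two column levels.
parts = df.columns.to_series().str.extract(r"(\D+)(\d+)", expand=True)
wide = df.copy()
wide.columns = pd.MultiIndex.from_arrays(
    [parts[0], parts[1]], names=["field", "number"]
)

scores = wide.xs("Scr", axis=1)   # columns '1', '2', '3'
names = wide.xs("Name", axis=1)   # columns '1', '2', '3'

best = scores.idxmax(axis=1)                # number label of each row's maximum
col_pos = names.columns.get_indexer(best)   # labels -> integer column positions
row_pos = np.arange(len(wide))

result = pd.DataFrame({
    "Name": names.to_numpy()[row_pos, col_pos],
    "Hi_Scr": scores.max(axis=1),
})
print(result)
```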
Timings:
df = pd.concat([df]*10000).reset_index(drop=True)
def jez2(df):
    a = df.columns.to_series().str.extract(r'(\D+)(\d+)', expand=False)
    df.columns = pd.MultiIndex.from_tuples(a.values.tolist())
    a = df.xs('Scr', axis=1)
    b = a.idxmax(axis=1)
    c = df.xs('Name', axis=1).lookup(df.index, b)
    d = a.max(axis=1)
    return pd.DataFrame({'Name':c, 'Hi_Scr':d}, columns=['Name','Hi_Scr'])
def jez1(df):
    a = df.filter(like='Scr').values
    b = a.argmax(axis=1)
    c = df.filter(like='Name').values[np.arange(len(df.index)), b]
    d = a.max(axis=1)
    return pd.DataFrame({'Name':c, 'Hi_Scr':d}, columns=['Name','Hi_Scr'])
def dark(df):
    df['id'] = df.index
    ndf = pd.wide_to_long(df, ["Name", "Scr"], i="id", j="number").reset_index(0).set_index('Name')
    return ndf.groupby('id')['Scr'].agg(['max','idxmax']).rename(columns= {'max':'Hi_Scr','idxmax':'Name'})
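import time
# Restart the clock before each call; subtracting a previous duration
# (as in time.time() - t1) would yield a Unix timestamp, not an elapsed time.
t0 = time.time()
print (jez1(df).head())
t1 = time.time() - t0
print (t1)
t0 = time.time()
print (dark(df).head())
t2 = time.time() - t0
print (t2)
t0 = time.time()
print (jez2(df).head())
t3 = time.time() - t0
print (t3)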
Name Hi_Scr
0 CA 45
1 BK 46
2 CA 45
3 BK 46
4 CA 45
#jez1 solution
0.015599966049194336
Hi_Scr Name
id
0 45 CA
1 46 BK
2 45 CA
3 46 BK
4 45 CA
#dark solution
1515070100.961423  # not an elapsed time: time.time() minus the previous duration t1
Name Hi_Scr
0 CA 45
1 BK 46
2 CA 45
3 BK 46
4 CA 45
#jez2 solution
0.04679989814758301
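If you want to repeat the comparison, a timeit-based harness may give steadier numbers than single time.time() readings. This is only a sketch for one of the candidates: numpy_argmax is my own name for the array-based approach, and the other functions could be timed the same way. No figures are shown because they depend on the machine.

```python
import timeit

import numpy as np
import pandas as pd

base = pd.DataFrame({
    "Name1": ["NY", "AZ"], "Scr1": [21, 31],
    "Name2": ["CA", "BK"], "Scr2": [45, 46],
    "Name3": ["SF", "AK"], "Scr3": [37, 23],
})
big = pd.concat([base] * 10000, ignore_index=True)


def numpy_argmax(frame):
    """Array-based approach: argmax over the score columns."""
    scores = frame.filter(like="Scr").to_numpy()
    names = frame.filter(like="Name").to_numpy()
    rows = np.arange(len(frame))
    best = scores.argmax(axis=1)
    return pd.DataFrame({"Name": names[rows, best], "Hi_Scr": scores[rows, best]})


# Each run works on its own copy, so a function that mutates its input
# cannot affect the next run; the copy cost is included in every measurement.
runs = timeit.repeat(lambda: numpy_argmax(big.copy()), number=1, repeat=5)
print(f"numpy_argmax: best of {len(runs)} runs = {min(runs):.4f} s")
```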
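Something like:
df1 = df.select_dtypes(include=[object])  # the Name columns
df2 = df.select_dtypes(exclude=[object])  # the Scr columns
# Mask the cells equal to their row maximum, then pull the names at those positions.
pd.DataFrame({'Name': df1.values[np.where(df2.eq(df2.max(axis=1), axis=0))], 'Scr': df2.max(axis=1)})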
Out[342]:
Name Scr
0 CA 45
1 BK 46
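One caveat with the mask approach: np.where returns one entry per matching cell, so a row with two equal maxima produces more names than rows and no longer lines up with the per-row maxima. A possible tie-safe sketch keeps only the first match per row (the tied data and the variable names here are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical data with a tie in the first row (NY and CA both score 45).
tied = pd.DataFrame({
    "Name1": ["NY", "AZ"], "Scr1": [45, 31],
    "Name2": ["CA", "BK"], "Scr2": [45, 46],
})

names = tied.select_dtypes(exclude="number")
scores = tied.select_dtypes(include="number")

# True wherever a cell equals its row maximum; the first row has two True cells.
is_max = scores.eq(scores.max(axis=1), axis=0).to_numpy()

# argmax on a boolean array returns the position of the first True in each row.
first = is_max.argmax(axis=1)
rows = np.arange(len(tied))

result = pd.DataFrame({
    "Name": names.to_numpy()[rows, first],
    "Scr": scores.to_numpy()[rows, first],
})
print(result)
```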