
Python Pandas max value in a group as a new column


I am trying to calculate a new column which contains maximum values for each of several groups. I'm coming from a Stata background so I know the Stata code would be something like this:

by group, sort: egen max = max(odds)  

For example:

data = {'group': ['A', 'A', 'B', 'B'],
        'odds': [85, 75, 60, 65]}

Then I would like it to look like:

    group    odds    max
    A        85      85
    A        75      85
    B        60      65
    B        65      65

Eventually I am trying to form a column that takes 1/(max-min) * odds where max and min are for each group.

Vicki asked Feb 25 '16 23:02




1 Answer

Use groupby + transform:

df['max'] = df.groupby('group')['odds'].transform('max') 

This is equivalent to the verbose:

maxima = df.groupby('group')['odds'].max()
df['max'] = df['group'].map(maxima)

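The transform method returns a result that has the same index as the original DataFrame, with each group's value repeated for every row in that group, so it can be assigned directly as a new column and no explicit mapping is required.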
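As a minimal sketch, the two approaches can be checked side by side on the sample data from the question, and the same pattern extended to the 1/(max-min) * odds column the question is ultimately after. The names group_max, group_min, mapped_max and the 'scaled' column are illustrative only, and the sketch assumes max and min differ within every group.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({'group': ['A', 'A', 'B', 'B'],
                   'odds': [85, 75, 60, 65]})

grouped = df.groupby('group')['odds']

# transform returns a Series indexed like df: one value per original row.
group_max = grouped.transform('max')
group_min = grouped.transform('min')

# The verbose alternative: aggregate per group, then map back onto the rows.
mapped_max = df['group'].map(grouped.max())

df['max'] = group_max

# 1/(max - min) * odds, using per-group max and min.
# Assumption: max != min in every group; otherwise this yields inf or NaN.
df['scaled'] = df['odds'] / (group_max - group_min)

print(df)
```

If a group can contain a single row or all-equal values, the denominator is zero, so that case would need separate handling before dividing.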

jpp answered Sep 30 '22 14:09