I am trying to calculate a new column which contains maximum values for each of several groups. I'm coming from a Stata background so I know the Stata code would be something like this:
by group, sort: egen max = max(odds)
For example:
data = {'group' : ['A', 'A', 'B','B'], 'odds' : [85, 75, 60, 65]}
Then I would like it to look like:
group  odds  max
A      85    85
A      75    85
B      60    65
B      65    65
Eventually I am trying to form a column that takes 1/(max-min) * odds, where max and min are computed for each group.
To find the maximum value of a column and return its corresponding row values in pandas, we can use df.loc[df[col].idxmax()].
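As a minimal sketch of that lookup, using the question's sample data (the variable names top_label, top_row and top_per_group are only illustrative), idxmax() gives the index label of the maximum and .loc retrieves the matching row. The last line shows one way to extend the same idea to one row per group:

```python
import pandas as pd

df = pd.DataFrame({'group': ['A', 'A', 'B', 'B'], 'odds': [85, 75, 60, 65]})

# idxmax() returns the index label of the first maximum in the column
top_label = df['odds'].idxmax()

# .loc with that label returns the whole row as a Series
top_row = df.loc[top_label]

# Per-group variant: one index label per group, then select those rows
top_per_group = df.loc[df.groupby('group')['odds'].idxmax()]
print(top_per_group)
```

Note that this returns only the rows holding each maximum; it does not broadcast the maximum back onto every row of the group.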
The pandas DataFrame.max() method finds the maximum of the values in the object and returns it. If the input is a Series, the method returns a scalar: the maximum of the values in that Series. If the input is a DataFrame, it returns a Series holding the maximum of each column.
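A short illustration of the difference between calling max() on a Series and on a DataFrame. The 'stake' column is a hypothetical addition, included only so the frame has more than one numeric column:

```python
import pandas as pd

# 'stake' is a made-up column for illustration
df = pd.DataFrame({'odds': [85, 75, 60, 65], 'stake': [1, 4, 2, 3]})

# On a Series, max() reduces to a single scalar
series_max = df['odds'].max()

# On a DataFrame, max() reduces each column and returns a Series
# indexed by column name
frame_max = df.max()
print(series_max)
print(frame_max)
```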
How do you group by multiple columns in a pandas DataFrame and compute multiple aggregations? groupby() accepts a list of column names to group by several columns at once, and the aggregate functions can then apply one or several aggregations in the same call.
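A rough sketch of grouping on two keys and computing several aggregations at once. The 'venue' column and its values are hypothetical, added only to provide a second grouping key:

```python
import pandas as pd

df = pd.DataFrame({
    'group': ['A', 'A', 'B', 'B'],
    'venue': ['x', 'x', 'y', 'z'],  # hypothetical second key
    'odds': [85, 75, 60, 65],
})

# Pass a list of columns to groupby, then a list of aggregation names to agg
summary = (
    df.groupby(['group', 'venue'])['odds']
      .agg(['min', 'max', 'mean'])
      .reset_index()  # turn the group keys back into ordinary columns
)
print(summary)
```

The result has one row per distinct (group, venue) pair, which is why it cannot be assigned directly as a new column on the original frame.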
Use groupby + transform:
df['max'] = df.groupby('group')['odds'].transform('max')
This is equivalent to the verbose:
maxima = df.groupby('group')['odds'].max()
df['max'] = df['group'].map(maxima)
The transform method aligns the groupby result to the groupby indexer, so no explicit mapping is required.
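For the eventual goal stated in the question, 1/(max-min) * odds with max and min taken per group, the same transform approach could be applied twice. This is a sketch under the assumption that every group has at least two distinct values; the column name 'scaled' is just a placeholder:

```python
import pandas as pd

df = pd.DataFrame({'group': ['A', 'A', 'B', 'B'], 'odds': [85, 75, 60, 65]})

grouped = df.groupby('group')['odds']

# transform returns a result aligned to the original rows,
# so both Series line up with df['odds'] directly
group_max = grouped.transform('max')
group_min = grouped.transform('min')

# Caveat: a group whose values are all equal has max - min == 0,
# so the division would produce inf for that group
df['scaled'] = df['odds'] / (group_max - group_min)
print(df)
```

With the sample data, group A has a range of 10 and group B a range of 5, so each odds value is divided by its own group's range.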