import pandas as pd
import numpy as np
from datetime import datetime

df = pd.DataFrame(
    {
        'Date': np.random.choice(pd.date_range(datetime(2020, 1, 1), periods=5), 20),
        'Product': np.random.choice(['Milk', 'Brandy', 'Beer'], 20),
        'Quantity': np.random.randint(10, 99, 20),
    }
)
df.groupby(['Date', 'Product']).sum()
This gives the summed Quantity for each Date and Product.
I would like to get the maximum of these sums within each Product group. What is the best way to do it?
The expected result for my random sample should contain one row per product.
How can I achieve this result?
You can chain another groupby, this time on the second level of the index (level=1, i.e. Product), and take the max:
df.groupby(['Date','Product']).sum().groupby(level=1).max()
         Quantity
Product
Beer          160
Brandy         97
Milk          245
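Because the sample above is random, the numbers change on every run. As a minimal sketch, here is the same chained groupby on a small fixed frame (the values are made up for illustration and are not from the original question), so the result can be checked by hand:

```python
import pandas as pd

# Hypothetical fixed data, chosen only so the totals are easy to verify.
df = pd.DataFrame(
    {
        "Date": pd.to_datetime(
            ["2020-01-01", "2020-01-01", "2020-01-02", "2020-01-02", "2020-01-02"]
        ),
        "Product": ["Milk", "Milk", "Milk", "Beer", "Beer"],
        "Quantity": [10, 20, 50, 5, 7],
    }
)

# Step 1: total quantity per (Date, Product) pair.
daily_totals = df.groupby(["Date", "Product"])["Quantity"].sum()

# Step 2: within each product, keep the largest of those totals.
max_per_product = daily_totals.groupby(level="Product").max()

print(max_per_product)
```

Milk totals 30 on the first day and 50 on the second, so its maximum is 50; Beer only appears on the second day with a total of 12. Passing level="Product" instead of level=1 selects the same index level by name.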
To get the date as well, use sort_values with tail:
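(
    df.groupby(['Date', 'Product']).sum()
    .sort_values('Quantity')   # ascending, so each product's largest total comes last
    .groupby(level=1)
    .tail(1)                   # keep the last (largest) row per product
    .reset_index()             # turn Date and Product back into columns
)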
        Date Product  Quantity
0 2020-01-04    Beer        81
1 2020-01-03    Milk       186
2 2020-01-03  Brandy       212
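An alternative that avoids sorting is to look up the row label of each product's largest total with idxmax. This is a swapped-in technique, not the approach from the answer above, and the fixed data below is hypothetical, chosen only to make the result easy to verify:

```python
import pandas as pd

# Hypothetical fixed data, not the random sample from the question.
df = pd.DataFrame(
    {
        "Date": pd.to_datetime(
            ["2020-01-01", "2020-01-01", "2020-01-02", "2020-01-02", "2020-01-02"]
        ),
        "Product": ["Milk", "Milk", "Milk", "Beer", "Beer"],
        "Quantity": [10, 20, 50, 5, 7],
    }
)

# Total per (Date, Product), kept as ordinary columns rather than an index.
totals = df.groupby(["Date", "Product"], as_index=False)["Quantity"].sum()

# idxmax returns, for each product, the row label of its largest total;
# .loc then pulls those full rows, so the date comes along.
best_rows = totals.loc[totals.groupby("Product")["Quantity"].idxmax()]

print(best_rows)
```

If a product reaches the same maximum on several dates, idxmax picks only the first such row.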