I'm having this data frame:
Name Date Quantity Apple 07/11/17 20 orange 07/14/17 20 Apple 07/14/17 70 Orange 07/25/17 40 Apple 07/20/17 30
I want to aggregate this by Name
and Date
to get sum of quantities Details:
Date: Group, the result should be at the beginning of the week (or just on Monday)
Quantity: Sum, if two or more records have same Name and Date (if falls on same interval)
The desired output is given below:
Name Date Quantity Apple 07/10/17 90 orange 07/10/17 20 Apple 07/17/17 30 orange 07/24/17 40
Thanks in advance
How to perform groupby index in pandas? Pass index name of the DataFrame as a parameter to groupby() function to group rows on an index. DataFrame. groupby() function takes string or list as a param to specify the group columns or index.
Grouper. A Grouper allows the user to specify a groupby instruction for an object. This specification will select a column via the key parameter, or if the level and/or axis parameters are given, a level of the index of the target object.
Pandas Time Series Data Structures As mentioned before, it is essentially a replacement for Python's native datetime , but is based on the more efficient numpy. datetime64 data type. The associated Index structure is DatetimeIndex . For time Periods, Pandas provides the Period type.
First convert column date
to_datetime
and substract one week, as we want to sum for the week ahead of the date, not the week before that date.
Then use groupby
with Grouper
by W-MON and aggregate sum
:
df['Date'] = pd.to_datetime(df['Date']) - pd.to_timedelta(7, unit='d') df = df.groupby(['Name', pd.Grouper(key='Date', freq='W-MON')])['Quantity'] .sum() .reset_index() .sort_values('Date') print (df) Name Date Quantity 0 Apple 2017-07-10 90 3 orange 2017-07-10 20 1 Apple 2017-07-17 30 2 Orange 2017-07-24 40
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With