Given a dataframe with two datetime columns A
and B
and a numeric column C
, how to group by month
of both A
and B
and sum(C)
i.e.
In [1]: df
Out[1]:
A B C
0 2013-01-01 2013-01-01 0.282863
1 2013-01-02 2013-01-01 0.173215
2 2013-02-03 2013-02-04 2.104569
3 2013-02-09 2013-04-15 0.706771
4 2013-03-05 2013-08-01 0.567020
5 2013-03-06 2013-04-01 0.113648
By using groupby
df.groupby([df.A.dt.month,df.B.dt.month]).C.sum()
Out[954]:
A B
1 1 0.456078
2 2 2.104569
4 0.706771
3 4 0.113648
8 0.567020
Name: C, dtype: float64
Note: By using this , make sure A and B are datetime format If not , do following code before groupby
df.A=pd.to_datetime(df.A)
df.B=pd.to_datetime(df.B)
I recently just read about a new function that makes grouping by dates super easy.
df.A=pd.to_datetime(df.A)
df.B=pd.to_datetime(df.B)
df.groupby([pd.Grouper(key='A', freq='M'), pd.Grouper(key='B', freq='M')])['C'].sum()
The number of options this opens up makes it worth looking into:
Source: http://pbpython.com/pandas-grouper-agg.html
Different Date aliases: http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With