I have monthly data. I want to convert it to "periods" of 3 months, where q1 starts in January. So in the example below, the first 3-month aggregation would fall into q2 (desired format: 1996q2). The value that results from combining the 3 monthly values should be the mean (average) of the 3 columns. Conceptually, not complicated. Does anyone know how to do it in one swoop? I could do a lot of hard work by looping and hardcoding everything, but I am new to pandas and looking for something more clever than brute force.
1996-04  1996-05  1996-06  1996-07  .....
25       19       37       40
So I am looking for:
1996q2  1996q3  1996q4  1997q1  1997q2  .....
avg     avg     avg     ...     ...
To find the quarter for each monthly period, simply use the following formula: =ROUNDUP(Month/3,0). The resulting value will be the quarter for a given month. So for instance, the quarter for month 5 will equal [=ROUNDUP(5/3,0)] or 2.
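The same rounding rule can be written in Python. This is only a minimal sketch of the spreadsheet formula above; the helper name `quarter_of` is made up for illustration and is not part of pandas or any library.

```python
import math


def quarter_of(month: int) -> int:
    """Return the calendar quarter (1-4) for a month number (1-12).

    Mirrors the spreadsheet formula ROUNDUP(Month/3, 0).
    `quarter_of` is a hypothetical helper name used only for this example.
    """
    if not 1 <= month <= 12:
        raise ValueError(f"month must be in 1..12, got {month}")
    return math.ceil(month / 3)


# Map every month number to its quarter, e.g. month 5 falls in quarter 2.
quarters = {m: quarter_of(m) for m in range(1, 13)}
```

An integer-only equivalent is `(month - 1) // 3 + 1`, which avoids floating-point division altogether.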
Pandas has a tool to help you: pd.PeriodIndex(monthcolumn, freq='Q'). You may need to convert your month column to a datetime dtype first, for example with pd.to_datetime.
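As a rough sketch of that suggestion, assuming the data is in long format with one row per month: the column names `month` and `value` and the frame name `long_df` are invented for this example, and the numbers are the four values from the question.

```python
import pandas as pd

# Hypothetical long-format data: one row per month (column names are assumptions).
long_df = pd.DataFrame({
    "month": ["1996-04", "1996-05", "1996-06", "1996-07"],
    "value": [25, 19, 37, 40],
})

# Convert the month strings to datetimes, then to quarterly periods.
months = pd.to_datetime(long_df["month"], format="%Y-%m")
quarters = pd.PeriodIndex(months, freq="Q")

# Average all monthly values that fall into the same quarter.
quarterly_mean = long_df.groupby(quarters)["value"].mean()
```

Here April to June 1996 land in 1996Q2 and July 1996 lands in 1996Q3, so the result has one row per quarter.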
Unless you are willing to make assumptions, there is no way to convert yearly data into monthly or quarterly data. If you are willing to make the assumption that whatever it is you have data on happens at a uniform rate throughout the year then quarterly data would just be yearly data divided by 4.
You can use pd.PeriodIndex(..., freq='Q') in conjunction with groupby(..., axis=1):
In [63]: df
Out[63]:
   1996-04  1996-05  2000-07  2000-08  2010-10  2010-11  2010-12
0        1        2        3        4        1        1        1
1       25       19       37       40        1        2        3
2       10       20       30       40        4        4        5

In [64]: df.groupby(pd.PeriodIndex(df.columns, freq='Q'), axis=1).mean()
Out[64]:
   1996Q2  2000Q3    2010Q4
0     1.5     3.5  1.000000
1    22.0    38.5  2.000000
2    15.0    35.0  4.333333
UPDATE: to get the columns of the resulting DF as strings instead of period dtype:
In [66]: res = (df.groupby(pd.PeriodIndex(df.columns, freq='Q'), axis=1)
                  .mean()
                  .rename(columns=lambda c: str(c).lower()))

In [67]: res
Out[67]:
   1996q2  2000q3    2010q4
0     1.5     3.5  1.000000
1    22.0    38.5  2.000000
2    15.0    35.0  4.333333

In [68]: res.columns.dtype
Out[68]: dtype('O')
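The groupby(..., axis=1) calls in this answer rely on an argument that, as far as I know, is deprecated in recent pandas releases (2.1 and later). A possible workaround, offered as a sketch rather than as the answer's own method, is to transpose, group the rows, and transpose back. The frame below rebuilds the example data from the session in this answer.

```python
import pandas as pd

# Rebuild the example frame: monthly columns, three rows of values.
df = pd.DataFrame(
    [[1, 2, 3, 4, 1, 1, 1],
     [25, 19, 37, 40, 1, 2, 3],
     [10, 20, 30, 40, 4, 4, 5]],
    columns=["1996-04", "1996-05", "2000-07", "2000-08",
             "2010-10", "2010-11", "2010-12"],
)

# One quarterly period per monthly column label.
quarters = pd.PeriodIndex(df.columns, freq="Q")

# Transpose so months become rows, average within each quarter, transpose back.
res = df.T.groupby(quarters).mean().T

# Turn the period labels into lowercase strings such as "1996q2".
res.columns = [str(c).lower() for c in res.columns]
```

The values should match the output above; only the way the columns are grouped differs.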