
TimeGrouper, pandas


I use TimeGrouper from pandas.tseries.resample to sum monthly returns into 6-month returns, as follows:

return_6m = monthly_return.groupby(TimeGrouper(freq='6M')).aggregate(numpy.sum)

where monthly_return is like:

2008-07-01    0.003626
2008-08-01    0.001373
2008-09-01    0.040192
2008-10-01    0.027794
2008-11-01    0.012590
2008-12-01    0.026394
2009-01-01    0.008564
2009-02-01    0.007714
2009-03-01   -0.019727
2009-04-01    0.008888
2009-05-01    0.039801
2009-06-01    0.010042
2009-07-01    0.020971
2009-08-01    0.011926
2009-09-01    0.024998
2009-10-01    0.005213
2009-11-01    0.016804
2009-12-01    0.020724
2010-01-01    0.006322
2010-02-01    0.008971
2010-03-01    0.003911
2010-04-01    0.013928
2010-05-01    0.004640
2010-06-01    0.000744
2010-07-01    0.004697
2010-08-01    0.002553
2010-09-01    0.002770
2010-10-01    0.002834
2010-11-01    0.002157
2010-12-01    0.001034

The resulting 6-month return looks like:

2008-07-31    0.003626
2009-01-31    0.116907
2009-07-31    0.067688
2010-01-31    0.085986
2010-07-31    0.036890
2011-01-31    0.015283

However, I want the 6-month returns to start six months after 7/2008, like the following:

2008-12-31    ...
2009-06-30    ...
2009-12-31    ...
2010-06-30    ...
2010-12-31    ...

I tried different input options (e.g. loffset) in TimeGrouper, but none of them work. Any suggestions would be really appreciated!

user2019264 asked Jan 28 '13 19:01




2 Answers

The problem can be solved by adding closed='left', which makes each 6-month bin include its left edge and exclude its right edge:

df.groupby(pd.TimeGrouper('6M', closed='left')).aggregate(numpy.sum)
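
To see why this works, here is a minimal sketch that rebuilds the series from the question and compares the default binning with closed='left'. It uses Series.resample rather than TimeGrouper (my assumption, chosen because resample applies the same time-based binning and is available in current pandas), and it passes pd.offsets.MonthEnd(6) instead of the '6M' string because the spelling of the month-end alias differs between pandas versions.

```python
import pandas as pd

# Monthly returns from the question, indexed by the first day of each month.
values = [
    0.003626, 0.001373, 0.040192, 0.027794, 0.012590, 0.026394,   # Jul-Dec 2008
    0.008564, 0.007714, -0.019727, 0.008888, 0.039801, 0.010042,  # Jan-Jun 2009
    0.020971, 0.011926, 0.024998, 0.005213, 0.016804, 0.020724,   # Jul-Dec 2009
    0.006322, 0.008971, 0.003911, 0.013928, 0.004640, 0.000744,   # Jan-Jun 2010
    0.004697, 0.002553, 0.002770, 0.002834, 0.002157, 0.001034,   # Jul-Dec 2010
]
index = pd.date_range("2008-07-01", periods=len(values), freq="MS")
monthly_return = pd.Series(values, index=index)

# Six-month, month-end anchored frequency given as an offset object
# (assumption: used here in place of the '6M' alias from the answer).
six_months = pd.offsets.MonthEnd(6)

# Default for month-end frequencies: bins are closed on the right.
default_bins = monthly_return.resample(six_months).sum()

# closed='left': each bin includes its left edge and excludes its right edge.
left_closed = monthly_return.resample(six_months, closed="left").sum()

print(default_bins)
print(left_closed)
```

With the default, the first bin ends on 2008-07-31 and therefore holds only July 2008. With closed='left', the first bin covers July through December 2008 and is labelled 2008-12-31, which is the layout the question asks for.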
Anja answered Sep 20 '22 11:09


TimeGrouper, which is suggested in the other answer, is deprecated and will be removed from pandas. It has been replaced by Grouper, so a solution to your question using Grouper is:

df.groupby(pd.Grouper(freq='6M', closed='left')).aggregate(numpy.sum) 
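
As a hedged illustration of the Grouper form, the sketch below applies it to a DataFrame whose dates live in a regular column rather than the index, using the first twelve values from the question. The column names `date` and `ret` are made up for this example, `.sum()` stands in for `.aggregate(numpy.sum)`, and pd.offsets.MonthEnd(6) stands in for the '6M' string because, as far as I know, recent pandas releases spell the month-end alias 'ME'.

```python
import pandas as pd

# First twelve monthly returns from the question (Jul 2008 to Jun 2009).
# The column names "date" and "ret" are illustrative, not from the question.
df = pd.DataFrame({
    "date": pd.date_range("2008-07-01", periods=12, freq="MS"),
    "ret": [
        0.003626, 0.001373, 0.040192, 0.027794, 0.012590, 0.026394,
        0.008564, 0.007714, -0.019727, 0.008888, 0.039801, 0.010042,
    ],
})

# key= points Grouper at a datetime column; without it, Grouper uses the index.
grouper = pd.Grouper(key="date", freq=pd.offsets.MonthEnd(6), closed="left")

# Sum the monthly returns inside each left-closed six-month bin.
six_month_return = df.groupby(grouper)["ret"].sum()
print(six_month_return)
```

If the dates are already the index, as in the question, the `key` argument can simply be dropped.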
Primoz answered Sep 18 '22 11:09