pandas dataframe group year index by decade

Tags:

python

pandas

Suppose I have a DataFrame whose index is a monthly time step. I know I can use dataframe.groupby(lambda x: x.year) to group the monthly data by year and then apply other operations. Is there a quick way to group it by, say, decade?

Thanks for any hints.

wiswit asked Jul 20 '13 17:07


3 Answers

To get the decade, you can integer-divide the year by 10 and then multiply by 10. For example, if you're starting from

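>>> import numpy as np
>>> import pandas as pd
>>> # on pandas 2.2 or later, spell this freq="ME"; the "M" alias is deprecated there
>>> dates = pd.date_range('1/1/2001', periods=500, freq="M")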
>>> df = pd.DataFrame({"A": 5*np.arange(len(dates))+2}, index=dates)
>>> df.head()
             A
2001-01-31   2
2001-02-28   7
2001-03-31  12
2001-04-30  17
2001-05-31  22

You can group by year, as usual (here we have a DatetimeIndex so it's really easy):

>>> df.groupby(df.index.year).sum().head()
         A
2001   354
2002  1074
2003  1794
2004  2514
2005  3234

or you could do the (x//10)*10 trick:

>>> df.groupby((df.index.year//10)*10).sum()
           A
2000   29106
2010  100740
2020  172740
2030  244740
2040   77424

If your index is not a DatetimeIndex (so df.index.year is unavailable) but its elements still have a .year attribute, you can pass a function instead: df.groupby(lambda x: (x.year//10)*10).
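As a minimal sketch of that fallback, assume the index holds plain datetime.date objects, so pandas keeps it as an object index with no .year accessor of its own. The helper name decade_of and the sample values are made up for illustration:

```python
import datetime as dt

import pandas as pd


def decade_of(stamp):
    """Map anything with a .year attribute to the first year of its decade."""
    return (stamp.year // 10) * 10


# Hypothetical data: an object-dtype index of datetime.date values,
# which offers no .year accessor on the index itself.
index = pd.Index(
    [dt.date(1998, 5, 1), dt.date(1999, 7, 1),
     dt.date(2003, 1, 1), dt.date(2011, 9, 1)],
    dtype=object,
)
df_obj = pd.DataFrame({"A": [1, 2, 3, 4]}, index=index)

# groupby calls decade_of on every index label and groups by the result
by_decade = df_obj.groupby(decade_of).sum()
print(by_decade)
```

The two 1990s rows end up in one group labelled 1990, and the 2003 and 2011 rows get groups of their own.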

DSM answered Nov 15 '22 19:11


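Say your DataFrame has a 'Year' column plus data columns such as 'Population', 'Salary' and 'vehicle count'.

Make Year the index: dataframe = dataframe.set_index('Year'). Note that resample needs a datetime-like index, so if Year holds plain integers, convert it with pd.to_datetime first.

Then use the code below to resample the data into 10-year bins; it also gives you the sum of every other column within each decade. On pandas 2.2 or later, the 'AS' alias is deprecated in favour of 'YS'.

dataframe = dataframe.resample('10AS').sum()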
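Here is a rough sketch of how this plays out on a small made-up table, assuming Year starts out as an integer column (hence the conversion to datetimes before resampling). The frame and its values are invented for illustration, and '10YS' is used as the year-start alias since newer pandas releases deprecate 'AS'. One thing worth checking in the output: as far as I can tell, resample anchors its 10-year bins at the first year present in the data, so they only line up with calendar decades if the data happens to start on one, whereas the floor-division grouping always does.

```python
import numpy as np
import pandas as pd

# Hypothetical yearly data with an integer "Year" column
frame = pd.DataFrame({
    "Year": np.arange(2001, 2031),
    "Population": np.arange(30),
})

# resample needs a datetime-like index, so convert the integer years first
frame.index = pd.to_datetime(frame["Year"].astype(str), format="%Y")
frame = frame.drop(columns="Year")

# "10YS" means every 10th year start; bins begin at the first year in the data
resampled = frame.resample("10YS").sum()

# floor division on the year always yields calendar decades (2000, 2010, ...)
calendar = frame.groupby((frame.index.year // 10) * 10).sum()

print(resampled)
print(calendar)
```

If you need labels like 2000, 2010, 2020 regardless of where the data starts, the groupby version is probably the safer choice.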

Shiv948 answered Nov 15 '22 18:11


Use the year attribute of index:

df.groupby(df.index.year)

waitingkuo answered Nov 15 '22 19:11