
Remove days from pandas DatetimeIndex

I'm working with a dataset that stores dates at year-month resolution only, encoded as 20110003 (meaning 2011-03). To get the 2011-03 format I did the following:

# change 20110003 -> 2011-03
indicator_ccgs_re = indicator_ccgs.loc[:, 'Time period Sortable'].astype(str)
old_pattern = '00'
new_pattern = '-'
new_dates = []
for i, v in indicator_ccgs_re.items():
    new_date = re.sub(old_pattern, new_pattern, v)
    new_dates = new_dates + [new_date]
new_index = pd.to_datetime(new_dates, format='%Y%m%')
values_period = indicator_ccgs.loc['2012-01':'2012-06', 'Value']
type(new_index)

pandas.core.indexes.datetimes.DatetimeIndex

values_period.index

DatetimeIndex(['2012-01-01', '2012-02-01', '2012-03-01', '2012-04-01',
               '2012-05-01', '2012-06-01'],
              dtype='datetime64[ns]', freq=None)

So the day remains even though I specified format='%Y%m%'.
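For context, the format string only controls how the input text is parsed; a DatetimeIndex always stores full timestamps, so a day is filled in regardless. A minimal sketch of one way to get a month-only index is to convert to a PeriodIndex with to_period('M'). The sample values below are made up for illustration, and the string slicing is an assumption about the "YYYY00MM" layout (it avoids replacing a '00' that might occur inside the year):

```python
import pandas as pd

# Hypothetical sample values in the question's "YYYY00MM" layout.
raw = pd.Series(["20120001", "20120002", "20120003"])

# Build "YYYY-MM" strings by slicing: first four characters are the year,
# last two are the month.
ym = raw.str[:4] + "-" + raw.str[-2:]

# Parsing yields full timestamps; the day defaults to the 1st.
dt_index = pd.DatetimeIndex(pd.to_datetime(ym, format="%Y-%m"))

# A PeriodIndex with monthly frequency has no day component.
period_index = dt_index.to_period("M")

print(dt_index[0])      # a full timestamp for the first of the month
print(period_index[0])  # a monthly period
```

Whether a PeriodIndex suits the rest of the analysis depends on what is done with the data afterwards; it stays time-aware, unlike plain strings.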

When plotting, the values appear monthly, but tabular output still shows the days in the index.

I tried resampling

monthly=values_period.resample('M').sum()
monthly.index

But the days remain (now the last day of each month rather than the first):

DatetimeIndex(['2012-01-31', '2012-02-29', '2012-03-31', '2012-04-30',
               '2012-05-31', '2012-06-30'],
              dtype='datetime64[ns]', freq='M')
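Resampling keeps a DatetimeIndex, so each month is still labelled with a concrete day. As a rough sketch of an alternative, the values can be grouped by monthly period instead, which labels each group with the month alone. The data here is invented for illustration; only the name values_period comes from the question:

```python
import pandas as pd

# Hypothetical data: two observations in January, one in February.
idx = pd.to_datetime(["2012-01-01", "2012-01-15", "2012-02-01"])
values_period = pd.Series([10.0, 5.0, 20.0], index=idx, name="Value")

# Group by the month each timestamp falls in and sum within each month.
monthly = values_period.groupby(values_period.index.to_period("M")).sum()

print(monthly)
```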

And trying:

dt=new_index.strptime('%Y-%m')

I got AttributeError: 'DatetimeIndex' object has no attribute 'strptime'
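The error arises because strptime parses strings into datetimes, whereas formatting datetimes as strings is done with strftime, which DatetimeIndex does provide. A small sketch with made-up dates (note that the result is an index of plain strings, no longer a DatetimeIndex):

```python
import pandas as pd

# Hypothetical stand-in for the question's new_index.
new_index = pd.to_datetime(["2012-01-01", "2012-02-01"])

# strftime formats each timestamp and returns an Index of strings.
labels = new_index.strftime("%Y-%m")

print(list(labels))
```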

Is there any other way to remove the day from the index?

Rony Armon asked Dec 24 '22 07:12


1 Answer

One straightforward method is to reset the index, format the date column as 'YYYY-MM' strings with strftime in a lambda, and then set that column as the index again. Note that the resulting index holds strings rather than datetimes, i.e.

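# name the index 'date' so reset_index() creates a column with that name
monthly = monthly.rename_axis('date').reset_index()
# format each timestamp as a 'YYYY-MM' string
monthly['date'] = monthly['date'].apply(lambda x: x.strftime('%Y-%m'))
monthly.set_index('date', inplace=True)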
chinafundnews answered Jan 16 '23 07:01