I'm working with a dataset that has datetime info only for year-month, stored as 20110003 (meaning 2011-03). To get the 2011-03 format I did the following:
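import re
import pandas as pd
# change 20110003 -> 2011-03
indicator_ccgs_re = indicator_ccgs.loc[:, 'Time period Sortable'].astype(str)
# anchor the pattern so only the '00' between year and month is replaced
# (a bare '00' would also match inside values such as 20100003)
old_pattern = r'^(\d{4})00(\d{2})$'
new_pattern = r'\1-\2'
new_dates = []
for i, v in indicator_ccgs_re.items():
    new_date = re.sub(old_pattern, new_pattern, v)
    new_dates.append(new_date)
new_index = pd.to_datetime(new_dates, format='%Y-%m')
# use the parsed dates as the index so the date-based slice below works
indicator_ccgs.index = new_index
values_period = indicator_ccgs.loc['2012-01':'2012-06', 'Value']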
type(new_index)
pandas.core.indexes.datetimes.DatetimeIndex
values_period.index
DatetimeIndex(['2012-01-01', '2012-02-01', '2012-03-01', '2012-04-01',
'2012-05-01', '2012-06-01'],
dtype='datetime64[ns]', freq=None)
So the day remains even though the format I specified contains only the year and month.
When plotting, the values are monthly, but tabular output still shows the days in the index.
I tried resampling
monthly=values_period.resample('M').sum()
monthly.index
But the days remain (now the last day of each month rather than the first):
DatetimeIndex(['2012-01-31', '2012-02-29', '2012-03-31', '2012-04-30',
'2012-05-31', '2012-06-30'],
dtype='datetime64[ns]', freq='M')
And trying:
dt=new_index.strptime('%Y-%m')
I got AttributeError: 'DatetimeIndex' object has no attribute 'strptime'
Is there any other way to remove the day from the index?
One straightforward method is to reset the index, format each date with strftime inside a lambda, and then set the formatted column as the index again (the new index holds 'YYYY-MM' strings, not datetimes), i.e.
# name the index 'date' so that reset_index() creates a 'date' column
monthly = monthly.rename_axis('date').reset_index()
monthly['date'] = monthly['date'].apply(lambda x: x.strftime('%Y-%m'))
monthly.set_index('date', inplace=True)
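The same result can probably be had without resetting the index, since a DatetimeIndex has its own strftime and to_period methods. The sketch below is only an illustration: the sample values and the names by_period and by_label are made up, not taken from the question's data.

```python
import pandas as pd

# Hypothetical sample data standing in for the monthly values in the question.
idx = pd.to_datetime(['2012-01', '2012-02', '2012-03'], format='%Y-%m')
monthly = pd.Series([10, 20, 30], index=idx, name='Value')

# Option 1: convert to a PeriodIndex with monthly frequency.
# The index stays date-aware, so label slicing with .loc still works.
by_period = monthly.copy()
by_period.index = monthly.index.to_period('M')

# Option 2: format the index as plain 'YYYY-MM' strings.
# DatetimeIndex provides strftime (not strptime); the result is ordinary text.
by_label = monthly.copy()
by_label.index = monthly.index.strftime('%Y-%m')

print(by_period.index)
print(by_label.index)
```

to_period('M') is likely the better fit if you still want to slice or resample by date afterwards, while strftime is enough if the index is only needed for display.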