
pandas format datetimeindex to quarters

With a resample job, I have my monthly values converted to quarterly values:

hs = hs.resample('QS', axis=1).mean()

Works well, my columns look like this:

hs.columns:
DatetimeIndex(['2000-01-01', '2000-04-01', '2000-07-01', '2000-10-01',
           '2001-01-01', '2001-04-01', '2001-07-01', '2001-10-01',
           '2002-01-01', '2002-04-01', '2002-07-01', '2002-10-01',

Now I want to convert them to the YYYYq[1-4] format, which I thought would be as easy as this (according to this link):

hs.columns.strftime('%Yq%q')

But that gives:

array(['2000qq', '2000qq', '2000qq', '2000qq', '2001qq', '2001qq',
   '2001qq', '2001qq', '2002qq', '2002qq', '2002qq', '2002qq',
   '2003qq', '2003qq', '2003qq', '2003qq', '2004qq', '2004qq',

Where am I going wrong, and how can I fix this?

dr jerry asked Jan 02 '23 12:01


2 Answers

The documentation describes strftime for the Period data type, not for the datetime data type; the %q directive is only understood by Period formatting. To use %q, convert the DatetimeIndex to a PeriodIndex (here with days as the unit) and then format it:

import pandas as pd

cols = pd.DatetimeIndex(['2000-01-01', '2000-04-01', '2000-07-01', '2000-10-01',
                         '2001-01-01', '2001-04-01', '2001-07-01', '2001-10-01',
                         '2002-01-01', '2002-04-01', '2002-07-01', '2002-10-01'])

cols.to_period('D').strftime('%Yq%q')
# hs.columns.to_period('D').strftime('%Yq%q')
#array([u'2000q1', u'2000q2', u'2000q3', u'2000q4', u'2001q1', u'2001q2',
#       u'2001q3', u'2001q4', u'2002q1', u'2002q2', u'2002q3', u'2002q4'],
#      dtype='<U6')

Or simply use to_period with Q (quarter) as the unit, which gives a PeriodIndex rather than formatted strings:

cols.to_period('Q')
# hs.columns.to_period('Q')
#PeriodIndex(['2000Q1', '2000Q2', '2000Q3', '2000Q4', '2001Q1', '2001Q2',
#             '2001Q3', '2001Q4', '2002Q1', '2002Q2', '2002Q3', '2002Q4'],
#            dtype='period[Q-DEC]', freq='Q-DEC')
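
Putting the two steps together, a minimal sketch of the whole workflow might look like the following. The frame below is made-up data standing in for the question's hs, and the names months and quarterly are hypothetical. The resample is done on the transposed frame instead of passing axis=1, on the assumption that this is the safer form on recent pandas versions.

```python
import pandas as pd

# Hypothetical monthly data standing in for the question's `hs` frame:
# one row, twelve monthly columns holding the values 1..12.
months = pd.date_range("2000-01-01", periods=12, freq="MS")
hs = pd.DataFrame([list(range(1, 13))], columns=months, dtype="float64")

# Resample the columns to quarter starts by working on the transposed frame.
quarterly = hs.T.resample("QS").mean().T

# Convert the quarterly DatetimeIndex to periods, then format with %q,
# which is only understood by Period formatting.
quarterly.columns = quarterly.columns.to_period("Q").strftime("%Yq%q")

print(list(quarterly.columns))
```

After this, the column labels are plain strings, so they no longer behave like dates; keep the PeriodIndex instead if further date arithmetic is needed.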
Psidom answered Jan 05 '23 15:01


One way is to use pd.Series.dt.to_period:

import pandas as pd

df = pd.DataFrame(columns=['2000-01-01', '2000-04-01', '2000-07-01', '2000-10-01',
                           '2001-01-01', '2001-04-01', '2001-07-01', '2001-10-01',
                           '2002-01-01', '2002-04-01', '2002-07-01', '2002-10-01'])

df.columns = pd.to_datetime(df.columns.to_series()).dt.to_period('Q')

print(df.columns)

# PeriodIndex(['2000Q1', '2000Q2', '2000Q3', '2000Q4', '2001Q1', '2001Q2',
#              '2001Q3', '2001Q4', '2002Q1', '2002Q2', '2002Q3', '2002Q4'],
#             dtype='period[Q-DEC]', freq='Q-DEC')
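
If only the string labels are needed, a possible alternative that is not taken from either answer is to skip the Period conversion and build the labels from the year and quarter attributes of the DatetimeIndex. This sketch assumes calendar-year quarters (the equivalent of Q-DEC), and the name labels is hypothetical.

```python
import pandas as pd

cols = pd.DatetimeIndex(["2000-01-01", "2000-04-01", "2000-07-01", "2000-10-01",
                         "2001-01-01"])

# Combine the calendar year and the quarter number (1-4) of each timestamp.
labels = [f"{year}q{quarter}" for year, quarter in zip(cols.year, cols.quarter)]

print(labels)
```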
jpp answered Jan 05 '23 15:01