
Using PeriodIndex vs DatetimeIndex in pandas?

I am working with some financial data organized as a DataFrame with a MultiIndex that contains the ticker and the date, plus a column that contains the return. I am wondering whether I should convert the date level to a PeriodIndex instead of a DatetimeIndex, since returns are really measured over a period rather than at an instant in time. Besides the philosophical argument, what practical functionality does PeriodIndex provide over DatetimeIndex that may be useful in this particular use case?

Alex asked Jun 01 '18 01:06



1 Answer

Some attributes available on DatetimeIndex (such as is_month_start and is_quarter_end) are not available on PeriodIndex. I use PeriodIndex when it is not possible to get the format I need with DatetimeIndex. For example, if I need a monthly frequency in the format yyyy-mm, I use PeriodIndex.
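To make the difference concrete, here is a minimal sketch. The four daily dates are invented for illustration and are not taken from the data in the question. It checks which attributes each index type exposes and shows the yyyy-mm rendering and the span information a monthly PeriodIndex carries.

```python
import pandas as pd

# Hypothetical daily timestamps straddling a month boundary.
dti = pd.date_range("2020-01-30", periods=4, freq="D")

# Convert each timestamp to the calendar month that contains it.
pi = dti.to_period("M")

# DatetimeIndex exposes calendar-boundary flags such as is_month_start.
print(dti.is_month_start)

# PeriodIndex does not expose that attribute.
print(hasattr(pi, "is_month_start"))

# A monthly PeriodIndex renders as yyyy-mm.
print(pi.astype(str).tolist())

# Each period also knows the span it covers.
print(pi[0].start_time, pi[0].end_time)
```

For returns, start_time and end_time are probably the most relevant part: a period records the interval it covers, while a timestamp is a single instant.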

Example: assume that df has the following index:

df.index
'2020-02-26 13:50:00', '2020-02-27 14:20:00',
'2020-02-28 11:10:00', '2020-02-29 13:50:00'],
 dtype='datetime64[ns]', name='peak_time', length=1025, freq=None)

The monthly minimum can be obtained with the following code

dfg = df.groupby([df.index.year, df.index.month]).min()

whose index is a MultiIndex

dfg.index
MultiIndex([(2017,  1),
             ...
            (2020,  1),
            (2020,  2)],
           names=['peak_time', 'peak_time'])

Now I convert it to a PeriodIndex:

dfg["date"] = pd.PeriodIndex(dfg.index.map(lambda x: "{0}{1:02d}".format(*x)), freq="M")
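
As a possible shortcut, the grouping and the conversion can be combined by grouping on the monthly period directly. This is a sketch, not the code used above: the small df below and its value column are hypothetical stand-ins, and only the index name peak_time comes from the example.

```python
import pandas as pd

# Hypothetical stand-in for df: four timestamps and one value column.
idx = pd.DatetimeIndex(
    ["2020-01-05 10:00", "2020-01-20 12:30",
     "2020-02-26 13:50", "2020-02-29 13:50"],
    name="peak_time",
)
df = pd.DataFrame({"value": [3.0, 1.0, 4.0, 2.0]}, index=idx)

# Group by the month each timestamp falls in; the result is indexed
# by a monthly PeriodIndex rather than a (year, month) MultiIndex.
dfg = df.groupby(df.index.to_period("M")).min()
print(dfg)
```

This avoids building the yyyymm strings by hand, at the cost of replacing the index instead of adding a separate date column.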
Matt Najarian answered Oct 01 '22 22:10
