Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Convert pandas multi-index to pandas timestamp

I'm trying to convert an unstacked, multi-indexed data-frame back to a single pandas datetime index.

The index of my original data-frame, i.e. before multi-indexing and unstacking, looks like this:

In [1]: df1_season.index
Out [1]: 

<class 'pandas.tseries.index.DatetimeIndex'>
[2013-05-01 02:00:00, ..., 2014-07-31 23:00:00]
Length: 1472, Freq: None, Timezone: None

then I apply the multi-indexing and unstacking so I can plot the yearly data on top of each other like this:

df_sort = df1_season.groupby(lambda x: (x.year, x.month, x.day, x.hour)).agg(lambda s: s[-1])
df_sort.index = pd.MultiIndex.from_tuples(df_sort.index, names=['Y','M','D','H'])
unstacked = df_sort.unstack('Y')

My new data-frame for the first two days of May looks like this:

In [2]: unstacked
Out [2]:

          temp        season        
Y        2013  2014    2013    2014
M D  H                             
5 1  2   24.2  22.3  Summer  Summer
     8   24.1  22.3  Summer  Summer
     14  24.3  23.2  Summer  Summer
     20  24.6  23.2  Summer  Summer
  2  2   24.2  22.5  Summer  Summer
     8   24.8  22.2  Summer  Summer
     14  24.9  22.4  Summer  Summer
     20  24.9  22.8  Summer  Summer

736 rows × 4 columns 

The index for the new data frame shown above now looks like this:

In [2]: unstacked.index.values[0:8]
Out [2]:

array([(5, 1, 2), (5, 1, 8), (5, 1, 14), (5, 1, 20), (5, 2, 2), (5, 2, 8), (5, 2, 14), 
       (5, 2, 20], dtype=object)

which doesn't produce a very nice plot with respect to the xticks (major and minor). If I can convert this multi-index back to a single pandas datetime index, using only the month, day and hour data, then the major/minor ticks will be plotted automagically the way I would like (I think). For example:

current solution:

xticks = (5, 1, 2), (5, 1, 8) … (5, 2, 20)

required solution:

xticks(major) = Day, Month (displayed as MAY 01, MAY 02 etc etc)
xticks(minor) = Hour (displayed as 02h 08h … 20h)
like image 640
roi3i3ie Avatar asked Oct 22 '14 10:10

roi3i3ie


People also ask

How do I get rid of MultiIndex in pandas?

To drop multiple levels from a multi-level column index, use the columns. droplevel() repeatedly. We have used the Multiindex. from_tuples() is used to create indexes column-wise.

How do I change multiple index to columns in pandas?

A multi-index dataframe has multi-level, or hierarchical indexing. We can easily convert the multi-level index into the column by the reset_index() method. DataFrame. reset_index() is used to reset the index to default and make the index a column of the dataframe.

What is MultiIndex pandas?

The MultiIndex object is the hierarchical analogue of the standard Index object which typically stores the axis labels in pandas objects. You can think of MultiIndex as an array of tuples where each tuple is unique. A MultiIndex can be created from a list of arrays (using MultiIndex.


1 Answers

Converting data back and forth in pandas gets messy very fast, as you seem to have experienced. My recommendation in general concerning pandas and indexing, is to never just set the index, but to copy it first. Make sure you have a column which contains the index, since pandas does not allow all operations on the index, and intense setting and resetting of the index can cause columns to dissapear.

TLDR; Don't convert the index back. Keep a copy.

like image 134
firelynx Avatar answered Oct 20 '22 15:10

firelynx