I'm trying to convert an unstacked, multi-indexed data-frame back to a single pandas datetime index.
The index of my original data-frame, i.e. before multi-indexing and unstacking, looks like this:
In [1]: df1_season.index
Out [1]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-05-01 02:00:00, ..., 2014-07-31 23:00:00]
Length: 1472, Freq: None, Timezone: None
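I then apply the multi-indexing and unstacking so that I can plot the yearly data on top of each other, like this:
import pandas as pd
# Keep the last observation in each (year, month, day, hour) group.
df_sort = df1_season.groupby(lambda x: (x.year, x.month, x.day, x.hour)).agg(lambda s: s.iloc[-1])
df_sort.index = pd.MultiIndex.from_tuples(df_sort.index, names=['Y','M','D','H'])
# Move the year level into the columns so each year can be plotted as its own line.
unstacked = df_sort.unstack('Y')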
My new data-frame for the first two days of May looks like this:
In [2]: unstacked
Out [2]:
           temp        season
Y          2013  2014    2013    2014
M D H
5 1 2      24.2  22.3  Summer  Summer
    8      24.1  22.3  Summer  Summer
    14     24.3  23.2  Summer  Summer
    20     24.6  23.2  Summer  Summer
  2 2      24.2  22.5  Summer  Summer
    8      24.8  22.2  Summer  Summer
    14     24.9  22.4  Summer  Summer
    20     24.9  22.8  Summer  Summer
736 rows × 4 columns
The index for the new data frame shown above now looks like this:
In [2]: unstacked.index.values[0:8]
Out [2]:
array([(5, 1, 2), (5, 1, 8), (5, 1, 14), (5, 1, 20), (5, 2, 2), (5, 2, 8), (5, 2, 14),
(5, 2, 20)], dtype=object)
which doesn't produce a very nice plot with respect to the xticks (major and minor). If I can convert this multi-index back to a single pandas datetime index, using only the month, day and hour data, then the major/minor ticks will be plotted automagically the way I would like (I think). For example:
current solution:
xticks = (5, 1, 2), (5, 1, 8) … (5, 2, 20)
required solution:
xticks(major) = Day, Month (displayed as MAY 01, MAY 02 etc etc)
xticks(minor) = Hour (displayed as 02h 08h … 20h)
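A minimal sketch of the conversion asked for above, not a definitive answer. It assumes the index levels are named M, D and H as in the question, and it attaches a placeholder year (DUMMY_YEAR, a name made up here) so that the month/day/hour values can be turned into real timestamps. The small frame below is a hypothetical stand-in for `unstacked`; the real one has two column levels, but only the row index is touched.

```python
import pandas as pd

# Hypothetical stand-in for `unstacked`: rows indexed by (M, D, H).
idx = pd.MultiIndex.from_tuples(
    [(5, 1, 2), (5, 1, 8), (5, 1, 14), (5, 1, 20),
     (5, 2, 2), (5, 2, 8), (5, 2, 14), (5, 2, 20)],
    names=["M", "D", "H"],
)
unstacked = pd.DataFrame(
    {"temp_2013": [24.2, 24.1, 24.3, 24.6, 24.2, 24.8, 24.9, 24.9]},
    index=idx,
)

# Assumption: any fixed year will do; a leap year keeps 29 February valid.
DUMMY_YEAR = 2000

# Turn the index levels into columns named the way pd.to_datetime expects.
parts = unstacked.index.to_frame(index=False)
parts = parts.rename(columns={"M": "month", "D": "day", "H": "hour"})
parts["year"] = DUMMY_YEAR

# Assemble timestamps from the year/month/day/hour columns.
flat = unstacked.copy()
flat.index = pd.DatetimeIndex(pd.to_datetime(parts))
```

With a real DatetimeIndex on `flat`, matplotlib's date tools (for example matplotlib.dates.DayLocator, HourLocator and DateFormatter('%b %d')) should be usable for the major and minor ticks. The placeholder year never has to be shown as long as the tick format leaves it out.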
To drop multiple levels from a multi-level column index, call columns.droplevel() repeatedly. MultiIndex.from_tuples() builds the multi-level index from a list of tuples.
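As a rough illustration of dropping column levels, here is a small sketch. The three-level column index and its level names ("group", "var", "Y") are invented for the example and are not taken from the question.

```python
import pandas as pd

# Hypothetical three-level column index.
cols = pd.MultiIndex.from_tuples(
    [("obs", "temp", 2013), ("obs", "temp", 2014)],
    names=["group", "var", "Y"],
)
df = pd.DataFrame([[24.2, 22.3]], columns=cols)

# Drop one level at a time until only the year level is left.
df.columns = df.columns.droplevel("group")
df.columns = df.columns.droplevel("var")
```

droplevel() also accepts a list of levels, so the two calls could be written as one, droplevel(["group", "var"]). At least one level has to remain.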
A multi-index data-frame has multi-level, or hierarchical, indexing. The levels of such an index can be converted into ordinary columns with the reset_index() method: DataFrame.reset_index() resets the index to the default integer index and turns the old index levels into columns of the data-frame.
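A short sketch of what reset_index() does to an (M, D, H) index like the one in the question. The two-row frame is made up for illustration.

```python
import pandas as pd

# Hypothetical frame with a three-level row index.
idx = pd.MultiIndex.from_tuples([(5, 1, 2), (5, 1, 8)], names=["M", "D", "H"])
df = pd.DataFrame({"temp": [24.2, 24.1]}, index=idx)

# The levels M, D and H become ordinary columns; the rows get a default integer index.
flat = df.reset_index()
```

The original `df` is left unchanged, because reset_index() returns a new data-frame unless inplace=True is passed.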
The MultiIndex object is the hierarchical analogue of the standard Index object which typically stores the axis labels in pandas objects. You can think of MultiIndex as an array of tuples where each tuple is unique. A MultiIndex can be created from a list of arrays (using MultiIndex.
Converting data back and forth in pandas gets messy very fast, as you seem to have experienced. My general recommendation for pandas and indexing is to never just set the index, but to copy it first. Make sure you have a column that contains the index, since pandas does not allow all operations on the index, and repeatedly setting and resetting the index can cause columns to disappear.
TL;DR: Don't convert the index back. Keep a copy.
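One way the "keep a copy" advice could look in practice is sketched below. The column name time_copy and the sample values are assumptions made for the example.

```python
import pandas as pd

# Hypothetical hourly data with a DatetimeIndex.
stamps = pd.to_datetime([
    "2013-05-01 02:00", "2013-05-01 08:00",
    "2013-05-01 14:00", "2013-05-01 20:00",
])
df = pd.DataFrame({"temp": [24.2, 24.1, 24.3, 24.6]}, index=stamps)

# Keep the original timestamps as an ordinary column before re-indexing.
df["time_copy"] = df.index

# Re-index by (month, day, hour); the timestamps survive in time_copy.
by_parts = df.copy()
by_parts.index = pd.MultiIndex.from_arrays(
    [df.index.month, df.index.day, df.index.hour],
    names=["M", "D", "H"],
)

# Getting the datetime index back is then a single set_index call.
restored = by_parts.set_index("time_copy")
```

Note that after an unstack on the year level the copied column is split per year as well, so this works most directly when the copy is taken on the frame that will actually be plotted.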