Example One:
Notice the index order of the given pandas DataFrame df:
>>> df
              A  B
first second
zzz   z       2  4
      a       1  5
aaa   z       6  3
      a       7  8
After calling the unstack and stack methods on df, the index is automatically sorted lexicographically (alphabetically), so the original order of the rows is lost.
>>> df.unstack().stack()
              A  B
first second
aaa   a       7  8
      z       6  3
zzz   a       1  5
      z       2  4
Is it possible to maintain the original ordering after the unstack/stack operations above?
According to the official documentation (reshaping-by-stacking-and-unstacking):
Notice that the stack and unstack methods implicitly sort the index levels involved. Hence a call to stack and then unstack, or vice versa, will result in a sorted copy of the original DataFrame or Series.
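The sorting described in the quote can be reproduced with a small sketch. The construction of df below is an assumption (the question only shows its printed form), and the variable names round_trip, original_order and round_trip_order are illustrative only.

```python
import pandas as pd

# Assumed construction of the example frame: a MultiIndex whose labels
# are deliberately NOT in alphabetical order.
index = pd.MultiIndex.from_tuples(
    [("zzz", "z"), ("zzz", "a"), ("aaa", "z"), ("aaa", "a")],
    names=["first", "second"],
)
df = pd.DataFrame({"A": [2, 1, 6, 7], "B": [4, 5, 3, 8]}, index=index)

# Round trip through unstack/stack: same rows, but in a different order.
round_trip = df.unstack().stack()

original_order = df.index.tolist()
round_trip_order = round_trip.index.tolist()
print(original_order)
print(round_trip_order)
```

Comparing the two printed lists shows that the labels are identical but no longer in the order they had in df.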
Example Two:
>>> dfu = df.unstack()
>>> dfu
        A     B
second  a  z  a  z
first
aaa     7  6  8  3
zzz     1  2  5  4
If the original index order were preserved, dfu would look like this:
>>> dfu
        A     B
second  a  z  a  z
first
zzz     1  2  5  4
aaa     7  6  8  3
What I'm looking for is a solution that could be used to revert the index order based on the original DataFrame after an unstack() or stack() method has been called.
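For Example Two, one possible approach is to record the order in which the 'first' labels appear in the original index and reindex the unstacked frame to that order. This is only a sketch: the construction of df is assumed from the printed output above, and first_order and dfu_restored are illustrative names.

```python
import pandas as pd

# Assumed construction of the example frame from the question.
index = pd.MultiIndex.from_tuples(
    [("zzz", "z"), ("zzz", "a"), ("aaa", "z"), ("aaa", "a")],
    names=["first", "second"],
)
df = pd.DataFrame({"A": [2, 1, 6, 7], "B": [4, 5, 3, 8]}, index=index)

dfu = df.unstack()  # rows come back sorted: aaa, zzz

# Labels of the 'first' level in order of first appearance in df.
first_order = df.index.get_level_values("first").unique()

# Put the rows of the unstacked frame back into that order.
dfu_restored = dfu.reindex(first_order)
print(dfu_restored)
```

This restores only the row order; the 'second' labels in the columns stay in the sorted order produced by unstack(), which matches the desired output shown above.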
To change the order of the index of a Series, we can use the Series.reindex() method from pandas.
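A minimal illustration of reordering with Series.reindex() follows; the Series s and its labels are made up for this example.

```python
import pandas as pd

# Hypothetical Series used only for illustration.
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

# reindex returns a new Series whose rows follow the given label order.
reordered = s.reindex(["c", "a", "b"])
print(reordered)
```

Note that reindex() conforms the data to the labels it is given, so a label that does not exist in the original index would produce a missing value rather than an error.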
Pandas provides various built-in methods for reshaping a DataFrame. Among them, stack() and unstack() are two of the most popular methods for restructuring columns and rows (the index). stack() moves the specified level(s) from the columns to the rows; unstack() moves the specified level(s) from the rows to the columns.
Using the stack() function reshapes the DataFrame into a stacked form. When the columns have multiple levels, that means converting (also called rotating or pivoting) the innermost column level into the innermost row index level.
unstack() pivots a level of the (necessarily hierarchical) index labels. It returns a DataFrame having a new level of column labels whose innermost level consists of the pivoted index labels. If the index is not a MultiIndex, the output will be a Series (the analogue of stack when the columns are not a MultiIndex).
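The descriptions above can be checked on a tiny frame. The frame wide and the names long, back and flat are hypothetical and exist only for this sketch.

```python
import pandas as pd

# Hypothetical two-by-two frame with a plain (single-level) index.
wide = pd.DataFrame({"A": [1, 2], "B": [3, 4]}, index=["x", "y"])

# stack(): column labels become the innermost row index level.
long = wide.stack()      # Series indexed by (row label, column label)

# unstack(): the innermost row index level goes back to the columns.
back = long.unstack()    # DataFrame with the original shape

# unstack() on a frame without a MultiIndex yields a Series
# indexed by (column label, row label).
flat = wide.unstack()

print(long)
print(back)
print(flat)
```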
You can keep a copy of the original index and reindex to that (thanks, Andy Hayden).
Demo:
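import pandas as pd

# Rebuild the example df from the question
df = pd.DataFrame(
    {'A': [2, 1, 6, 7], 'B': [4, 5, 3, 8]},
    index=pd.MultiIndex.from_tuples(
        [('zzz', 'z'), ('zzz', 'a'), ('aaa', 'z'), ('aaa', 'a')],
        names=['first', 'second']))
#              A  B
#first second
#zzz   z       2  4
#      a       1  5
#aaa   z       6  3
#      a       7  8
print(df.index)
# Output from an older pandas version (newer versions print a list of tuples):
#MultiIndex(levels=[[u'aaa', u'zzz'], [u'a', u'z']],
#           labels=[[1, 1, 0, 0], [1, 0, 1, 0]],
#           names=[u'first', u'second'])
# Save the original index in a variable
index = df.index
# unstack and stack (this sorts the index)
df = df.unstack().stack()
print(df)
#              A  B
#first second
#aaa   a       7  8
#      z       6  3
#zzz   a       1  5
#      z       2  4
# Reindex to the saved index to restore the original row order
df = df.reindex(index)
print(df)
#              A  B
#first second
#zzz   z       2  4
#      a       1  5
#aaa   z       6  3
#      a       7  8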