Reorder Multiindex Pandas Dataframe

I would like to reorder the columns in a dataframe, and keep the underlying values in the right columns.

For example, this is the dataframe I have:

import numpy as np
import pandas as pd

cols = [['Three', 'Two'], ['A', 'D', 'C', 'B']]
header = pd.MultiIndex.from_product(cols)
df = pd.DataFrame([[1, 4, 3, 2, 5, 8, 7, 6]] * 4, index=np.arange(1, 5), columns=header)
df.loc[:, ('One', 'E')] = 9
df.loc[:, ('One', 'F')] = 10

>>> df

And I would like to change it as follows:

# Note: the `codes` argument was called `labels` before pandas 0.24.
header2 = pd.MultiIndex(levels=[['One', 'Two', 'Three'], ['E', 'F', 'A', 'B', 'C', 'D']],
       codes=[[0, 0, 0, 0, 1, 1, 1, 1, 2, 2], [0, 1, 2, 3, 4, 5, 2, 3, 4, 5]])

df2 = pd.DataFrame([[9,10,1,2,3,4,5,6,7,8]]*4,index=np.arange(1,5), columns=header2)
>>> df2
Jelmerd asked Aug 30 '18 20:08


1 Answer

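First, define a categorical ordering for each level of the column index. Then call sort_index on the columns (axis=1) with both levels, so the columns are sorted by category order instead of alphabetically. The values move together with their column labels.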

v = pd.Categorical(df.columns.get_level_values(0), 
                   categories=['One', 'Two', 'Three'], 
                   ordered=True)
v2 = pd.Categorical(df.columns.get_level_values(1), 
                    categories=['E', 'F', 'C', 'B', 'A', 'D'],
                    ordered=True)
df.columns = pd.MultiIndex.from_arrays([v, v2]) 

df = df.sort_index(axis=1, level=[0, 1])

df
  One     Two          Three         
    E   F   C  B  A  D     C  B  A  D
1   9  10   7  6  5  8     3  2  1  4
2   9  10   7  6  5  8     3  2  1  4
3   9  10   7  6  5  8     3  2  1  4
4   9  10   7  6  5  8     3  2  1  4
cs95 answered Oct 01 '22 08:10
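
As a complement to the answer above (and not part of it), here is a minimal sketch of a different technique: build the desired column order explicitly and pass it to reindex. The names top_order, sub_order, wanted and reordered are made up for illustration, and the order itself is assumed to be the one used in the answer's output.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question.
cols = [['Three', 'Two'], ['A', 'D', 'C', 'B']]
header = pd.MultiIndex.from_product(cols)
df = pd.DataFrame([[1, 4, 3, 2, 5, 8, 7, 6]] * 4,
                  index=np.arange(1, 5), columns=header)
df[('One', 'E')] = 9
df[('One', 'F')] = 10

# Hypothetical desired order for each level (assumption: same as the answer).
top_order = ['One', 'Two', 'Three']
sub_order = ['E', 'F', 'C', 'B', 'A', 'D']

# Keep only the (top, sub) pairs that exist in df, in the requested order.
existing = set(df.columns)
wanted = [(t, s) for t in top_order for s in sub_order if (t, s) in existing]

# reindex selects columns by label, so each value stays under its own header.
reordered = df.reindex(columns=pd.MultiIndex.from_tuples(wanted))
print(reordered)
```

Because the selection is by label, the data follows the headers. Compared with the categorical approach, this leaves the column index as plain strings, which may be more convenient if the frame is later written to a file or merged with other data.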