Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pandas: Optimal way to MultiIndex columns

Tags:

python

pandas

I start with the following DataFrame:

df_1 = DataFrame({
        "Cat1" : ["a", "b"],
        "Vals1" : [1,2] ,
        "Vals2" : [3,4]
    })
df

enter image description here

I want to get it to look like this:

enter image description here

And I can do it, with this code:

df_2 = (
    pd.melt(df_1, id_vars=["Cat1"])
    .T
)
df_2.columns = (
    pd.MultiIndex
        .from_tuples(
            list(zip(df_2.loc["Cat1", :] , df_2.loc["variable", :])) ,
            names=["Cat1", None]
        )
)
df_2 = (
    df_2
    .loc[["value"], :]
    .reset_index(drop=True)
    .sortlevel(0, axis=1)
)
df_2

But there are so many steps here that I feel code smell, or at least something vaguely not pandas-idiomatic, as if I'm missing the point of something in the API. Doing the equivalent for row-based indexes is just one step, for example, via set_index(). (Note that I am aware that the columns equivalent of set_index() is still an open issue). Is there a better, more official way to do this?

like image 913
sparc_spread Avatar asked Apr 04 '17 14:04

sparc_spread


People also ask

How do you make a MultiIndex column in pandas?

pandas MultiIndex to Columns Use pandas DataFrame. reset_index() function to convert/transfer MultiIndex (multi-level index) indexes to columns. The default setting for the parameter is drop=False which will keep the index values as columns and set the new index to DataFrame starting from zero.

How do I reorder MultiIndex columns in pandas?

To rearrange levels in MultiIndex, use the MultiIndex. reorder_levels() method in Pandas. Set the order of levels using the order parameter.


2 Answers

You can use stack(), to_frame(), and T for transpose.

df_1.set_index('Cat1').stack().to_frame().T


Cat1     a           b      
     Vals1 Vals2 Vals1 Vals2
0        1     3     2     4
like image 165
Scott Boston Avatar answered Oct 27 '22 21:10

Scott Boston


Think about it as a transposed dataframe. Here you go:

df.set_index('Cat1').unstack().swaplevel().sort_index().to_frame().T
Out[46]: 
Cat1     a           b      
     Vals1 Vals2 Vals1 Vals2
0        1     3     2     4
like image 38
Zeugma Avatar answered Oct 27 '22 21:10

Zeugma