Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Split one level of an multi index into columns

Tags:

python

pandas

So I have a data frame:

df = pd.DataFrame([["foo","fizz",1],["foo","fizz",2],["foo","buzz",3],["foo","buzz",4],["bar","fizz",6],["bar","buzz",8]],columns=["a","b","c"])

       a    b     c
    0  foo  fizz  1
    1  foo  fizz  2
    2  foo  buzz  3
    3  foo  buzz  4
    4  bar  fizz  6
    5  bar  buzz  8

That I can group:

df2 = df.groupby(["a","b"]).sum()

              c
    a   b
    bar buzz  8
        fizz  6
    foo buzz  7
        fizz  3

Which is awesome! But what I really need, instead of the "c" column is two columns, "foo" and "bar":

          foo  bar
    b
    buzz  7    8
    fizz  3    6

Can someone suggest a way to do this? I tried searching, but I guess I don't have the correct terminology for this so I couldn't find anything.

like image 231
binnisb Avatar asked Feb 16 '16 08:02

binnisb


People also ask

How do I convert multiple indexes to columns?

We can easily convert the multi-level index into the column by the reset_index() method. DataFrame. reset_index() is used to reset the index to default and make the index a column of the dataframe.

How do I convert multiple index to columns in pandas?

pandas MultiIndex to ColumnsUse pandas DataFrame. reset_index() function to convert/transfer MultiIndex (multi-level index) indexes to columns. The default setting for the parameter is drop=False which will keep the index values as columns and set the new index to DataFrame starting from zero.

How do I get rid of Multi-Level index?

To drop multiple levels from a multi-level column index, use the columns. droplevel() repeatedly. We have used the Multiindex. from_tuples() is used to create indexes column-wise.


1 Answers

You could use unstack for that:

df2.unstack(level='a')

Example:

In [146]: df2.unstack(level='a')
Out[146]:
       c
a    bar foo
b
buzz   8   7
fizz   6   3

After that you'll get multiindexed columns. If you need to get flat dataframe you could use droplevel of multiindex:

df3 = df2.unstack(level='a')
df3.columns = df3.columns.droplevel()

In [177]: df3
Out[177]:
a     bar  foo
b
buzz    8    7
fizz    6    3

EDIT

droplevel drops level from MultiIndex which your columns become after unstack. By default it drops level 0 which is what you need for that dataframe.

Copy from help(pd.core.index.MultiIndex.droplevel):

Help on function droplevel in module pandas.core.index:

droplevel(self, level=0) Return Index with requested level removed. If MultiIndex has only 2 levels, the result will be of Index type not MultiIndex.

Parameters
----------
level : int/level name or list thereof

Notes
-----
Does not check if result index is unique or not

Returns
-------
index : Index or MultiIndex
like image 153
Anton Protopopov Avatar answered Sep 20 '22 03:09

Anton Protopopov