
Flatten multi-index pandas dataframe where column names become values

Tags: python, pandas

Let us say I have the following dataframe:

import pandas as pd

df = pd.DataFrame(data={'Status': ['green', 'green', 'red', 'blue', 'red', 'yellow', 'black'],
                        'Group': ['A', 'A', 'B', 'C', 'A', 'B', 'C'],
                        'City': ['Toronto', 'Montreal', 'Vancouver', 'Toronto', 'Edmonton', 'Winnipeg', 'Windsor'],
                        'Sales': [13, 6, 16, 8, 4, 3, 1]})
# Status is not needed for the pivot
df.drop('Status', axis=1, inplace=True)
# One row per City, one column per Group under a top-level 'Sales' label; missing pairs become 0
ndf = pd.pivot_table(df, values=['Sales'], index=['City'], columns=['Group'], fill_value=0, margins=False)

The result looks like this:

In [321]: ndf
Out[321]:
          Sales
Group         A   B  C
City
Edmonton      4   0  0
Montreal      6   0  0
Toronto      13   0  8
Vancouver     0  16  0
Windsor       0   0  1
Winnipeg      0   3  0

How can I flatten it so that it becomes a single-level DataFrame, with one column specifying the group?

I.e., the result should be:

City    group   sales
Edmonton    A   4
Edmonton    B   0
Edmonton    C   0
Montreal    A   6
Montreal    B   0
Montreal    C   0
Toronto     A   13
Toronto     B   0
Toronto     C   8
Vancouver   A   0
Vancouver   B   16
Vancouver   C   0
Windsor     A   0
Windsor     B   0
Windsor     C   1
Winnipeg    A   0
Winnipeg    B   3
Winnipeg    C   0
asked Sep 19 '17 17:09 by Dnaiel



1 Answer

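Use stack() to move the Group column level into the row index, then reset_index() to turn the City and Group index levels back into ordinary columns: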

In [1260]: ndf.stack().reset_index()
Out[1260]:
         City Group  Sales
0    Edmonton     A      4
1    Edmonton     B      0
2    Edmonton     C      0
3    Montreal     A      6
4    Montreal     B      0
5    Montreal     C      0
6     Toronto     A     13
7     Toronto     B      0
8     Toronto     C      8
9   Vancouver     A      0
10  Vancouver     B     16
11  Vancouver     C      0
12    Windsor     A      0
13    Windsor     B      0
14    Windsor     C      1
15   Winnipeg     A      0
16   Winnipeg     B      3
17   Winnipeg     C      0
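
For reference, here is a minimal self-contained sketch of the same approach, rebuilding the pivot table from the question (without the Status column, which the question drops anyway). The rename to lowercase group and sales and the explicit sort are additions of mine to match the layout requested in the question; they are not part of the one-liner above. The sort is there only so that the row order does not depend on how a given pandas version orders the stacked result.

```python
import pandas as pd

# Data from the question, minus the 'Status' column that it drops
df = pd.DataFrame({
    'Group': ['A', 'A', 'B', 'C', 'A', 'B', 'C'],
    'City': ['Toronto', 'Montreal', 'Vancouver', 'Toronto',
             'Edmonton', 'Winnipeg', 'Windsor'],
    'Sales': [13, 6, 16, 8, 4, 3, 1],
})

# Same pivot as in the question: columns are a two-level index ('Sales', Group)
ndf = pd.pivot_table(df, values=['Sales'], index=['City'],
                     columns=['Group'], fill_value=0)

flat = (
    ndf.stack()          # move the innermost column level (Group) into the row index
       .reset_index()    # turn the City and Group index levels into regular columns
       .rename(columns={'Group': 'group', 'Sales': 'sales'})  # assumption: lowercase names as in the question
       .sort_values(['City', 'group'], ignore_index=True)     # make the row order explicit
)

print(flat)
```

Because the pivot was built with fill_value=0, every City/Group pair is present after stacking, so the result has one row per combination (18 rows here).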
answered Oct 09 '22 09:10 by Zero