Creating DataFrame with Hierarchical Columns

Tags: python, pandas

What is the easiest way to create a DataFrame with hierarchical columns?

I am currently creating a DataFrame from a dict of names -> Series using:

df = pd.DataFrame(data=serieses)

I would like to keep the same column names but add an additional level of hierarchy on the columns. For the time being, I want the additional level to have the same value for all columns, let's say "Estimates".

I am trying the following but that does not seem to work:

pd.DataFrame(data=serieses,columns=pd.MultiIndex.from_tuples([(x, "Estimates") for x in serieses.keys()]))

All I get is a DataFrame with all NaNs.

For example, what I am looking for is roughly:

l1               Estimates    
l2  one  two  one  two  one  two  one  two
r1   1    2    3    4    5    6    7    8
r2   1.1  2    3    4    5    6    71   8.2

where l1 and l2 are the labels for the MultiIndex

Alex Rothberg asked Aug 01 '13 04:08


2 Answers

This appears to work:

import pandas as pd

data = {'a': [1,2,3,4], 'b': [10,20,30,40],'c': [100,200,300,400]}

df = pd.concat({"Estimates": pd.DataFrame(data)}, axis=1, names=["l1", "l2"])

l1  Estimates         
l2          a   b    c
0           1  10  100
1           2  20  200
2           3  30  300
3           4  40  400
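The same approach can be sketched for a dict of names -> Series, which is what the question starts from. The `serieses` dict below is a hypothetical stand-in (its names and values are assumptions for illustration, not the asker's data):

```python
import pandas as pd

# Hypothetical stand-in for the question's dict of names -> Series.
serieses = {
    "one": pd.Series([1.0, 1.1], index=["r1", "r2"]),
    "two": pd.Series([2.0, 2.0], index=["r1", "r2"]),
}

# Build the flat frame first, exactly as in the question.
flat = pd.DataFrame(data=serieses)

# Wrapping the frame in a one-entry dict makes the key the outer column level;
# `names` labels the outer and inner levels.
df = pd.concat({"Estimates": flat}, axis=1, names=["l1", "l2"])
print(df)
```

A single column can then be selected with a tuple, for example `df[("Estimates", "one")]`.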
Alex Rothberg answered Oct 21 '22 23:10


I know the question is really old, but with pandas version 0.19.1 one can initialize the DataFrame directly from a dict whose keys are tuples:

import pandas as pd

# Each tuple key becomes one (outer, inner) column label.
d = {('a', 'b'): [1, 2, 3, 4], ('a', 'c'): [5, 6, 7, 8]}
df = pd.DataFrame(d, index=['r1', 'r2', 'r3', 'r4'])
df.columns.names = ('l1', 'l2')
print(df)

l1  a   
l2  b  c
r1  1  5
r2  2  6
r3  3  7
r4  4  8
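This also suggests a likely reason the attempt in the question produced only NaNs: when `columns` is passed together with a dict, pandas selects the dict entries whose keys match those labels, and the plain string keys do not match the tuple labels. The attempt also builds `(x, "Estimates")`, which would put "Estimates" on the inner level rather than the outer one. A minimal sketch of re-keying the dict instead, using a hypothetical `serieses` (names and values are assumptions for illustration):

```python
import pandas as pd

# Hypothetical stand-in for the question's dict of names -> Series.
serieses = {
    "one": pd.Series([1.0, 1.1], index=["r1", "r2"]),
    "two": pd.Series([2.0, 2.0], index=["r1", "r2"]),
}

# Re-key the dict so every key is an (outer, inner) tuple.
tupled = {("Estimates", name): s for name, s in serieses.items()}

# Tuple keys are turned into a two-level column MultiIndex.
df = pd.DataFrame(tupled)
df.columns.names = ["l1", "l2"]
print(df)
```

Because the keys already carry both levels, no separate `columns` argument is needed.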
DimG answered Oct 21 '22 23:10