
Pandas: Multilevel column names

Tags:

python

pandas

pandas has support for multi-level column names:

>>> import pandas as pd
>>> from numpy.random import rand
>>> x = pd.DataFrame({'instance':['first','first','first'],'foo':['a','b','c'],'bar':rand(3)})
>>> x = x.set_index(['instance','foo']).transpose()
>>> x.columns
MultiIndex
[(u'first', u'a'), (u'first', u'b'), (u'first', u'c')]
>>> x
instance     first                    
foo              a         b         c
bar       0.102885  0.937838  0.907467

This feature is very useful since it allows multiple versions of the same dataframe to be appended 'horizontally', with the first level of the column names (instance in my example) distinguishing the instances.

Imagine I already have a dataframe like this:

                 a         b         c
bar       0.102885  0.937838  0.907467

Is there a nice way to add another level to the column names, similar to this for the row index:

x['instance'] = 'first'
x.set_index('instance', append=True)

LondonRob asked Jan 29 '14 22:01



2 Answers

Try this:

import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})
# One (new top level, existing column name) tuple per column
columns = [('c', 'a'), ('c', 'b')]
df.columns = pd.MultiIndex.from_tuples(columns)
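
The tuple list above is typed by hand, which gets tedious for a wider frame. A minimal sketch of the same from_tuples idea, building the tuples from the existing column names instead. The helper name add_top_level is hypothetical (it is not part of pandas), and the label 'first' and the level names 'instance' and 'foo' are borrowed from the question for illustration only.

```python
import pandas as pd


def add_top_level(df, label, names=None):
    """Return a copy of df whose columns gain `label` as a new outermost level.

    Hypothetical helper for illustration; not part of pandas.
    Assumes df currently has single-level columns.
    """
    out = df.copy()
    out.columns = pd.MultiIndex.from_tuples(
        [(label, col) for col in df.columns], names=names
    )
    return out


df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})
wide = add_top_level(df, 'first', names=['instance', 'foo'])
print(wide.columns.tolist())
```

The original frame is left untouched because the helper works on a copy. If the columns are already a MultiIndex, this sketch would not do the right thing, since each existing column label would itself be a tuple.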

user3377361 answered Sep 20 '22 03:09


You can use concat. Give it a dictionary of dataframes where the key is the new column level you want to add.

In [46]: d = {}

In [47]: d['first_level'] = pd.DataFrame(columns=['idx', 'a', 'b', 'c'],
                                         data=[[10, 0.89, 0.98, 0.31],
                                               [20, 0.34, 0.78, 0.34]]).set_index('idx')

In [48]: pd.concat(d, axis=1)
Out[48]:
    first_level
              a     b     c
idx
10         0.89  0.98  0.31
20         0.34  0.78  0.34
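
Applied to the single-row frame from the question, the same concat idea might look like the sketch below. The frame x is rebuilt here from the values shown in the question, and passing names=['instance'] to label the new level is an assumption about what the asker wants, not something stated in this answer.

```python
import pandas as pd

# Stand-in for the single-level frame from the question (values copied from it).
x = pd.DataFrame([[0.102885, 0.937838, 0.907467]],
                 index=['bar'], columns=['a', 'b', 'c'])

# The dict key becomes the new outermost column level;
# `names` labels that new level.
x2 = pd.concat({'first': x}, axis=1, names=['instance'])
print(x2.columns.tolist())
```

A one-entry dictionary is enough here: concat does not need several frames to add the level.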

You can use the same technique to place several dataframes side by side, each under its own key in the new top level.

In [49]: d['second_level'] = pd.DataFrame(columns=['idx', 'a', 'b', 'c'],
                                          data=[[10, 0.29, 0.63, 0.99],
                                                [20, 0.23, 0.26, 0.98]]).set_index('idx')

In [50]: pd.concat(d, axis=1)
Out[50]:
    first_level             second_level
              a     b     c            a     b     c
idx
10         0.89  0.98  0.31         0.29  0.63  0.99
20         0.34  0.78  0.34         0.23  0.26  0.98
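
Once the frames sit side by side, the top-level key is what tells them apart. A short sketch of reading the data back out, using a cut-down version of the frames above (two columns instead of three, and variable names chosen here for illustration):

```python
import pandas as pd

first = pd.DataFrame({'a': [0.89, 0.34], 'b': [0.98, 0.78]}, index=[10, 20])
second = pd.DataFrame({'a': [0.29, 0.23], 'b': [0.63, 0.26]}, index=[10, 20])

wide = pd.concat({'first_level': first, 'second_level': second}, axis=1)

# Indexing with a top-level key returns that frame with its original columns.
recovered = wide['second_level']

# xs on the inner level pulls column 'a' from every instance at once.
all_a = wide.xs('a', axis=1, level=1)
print(all_a.columns.tolist())
```

Selecting by the outer key gives back one instance, while xs on the inner level gives the same column across all instances.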

Carl answered Sep 20 '22 03:09