Pandas: Multilevel column names

pandas has support for multi-level column names:

    >>> import numpy as np
    >>> import pandas as pd
    >>> x = pd.DataFrame({'instance': ['first', 'first', 'first'],
    ...                   'foo': ['a', 'b', 'c'],
    ...                   'bar': np.random.rand(3)})
    >>> x = x.set_index(['instance', 'foo']).transpose()
    >>> x.columns
    MultiIndex
    [(u'first', u'a'), (u'first', u'b'), (u'first', u'c')]
    >>> x
    instance     first
    foo              a         b         c
    bar       0.102885  0.937838  0.907467

This feature is very useful because it allows multiple versions of the same dataframe to be appended 'horizontally', with the first level of the column names (instance in my example) distinguishing the versions.

Imagine I already have a dataframe like this:

                     a         b         c
    bar       0.102885  0.937838  0.907467

Is there a nice way to add another level to the column names, similar to this approach for the row index?

    x['instance'] = 'first'
    x.set_index('instance', append=True)

How does pandas handle multiple index columns?

A multi-index dataframe has multi-level, or hierarchical, indexing. The levels of a multi-level index can be converted into ordinary columns with the reset_index() method: DataFrame.reset_index() resets the index to the default integer index and makes each former index level a column of the dataframe.
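As a minimal sketch of that behaviour, the snippet below builds a small frame with a two-level row index and resets it. The data values are made up for illustration; the column names reuse those from the question.

```python
import pandas as pd

# Hypothetical example data: a two-level row index named 'instance' and 'foo'.
df = pd.DataFrame(
    {
        "instance": ["first", "first", "first"],
        "foo": ["a", "b", "c"],
        "bar": [0.1, 0.9, 0.5],
    }
).set_index(["instance", "foo"])

# reset_index() moves both index levels back into ordinary columns
# and replaces the index with the default RangeIndex.
flat = df.reset_index()

# Passing level= moves only the named level and keeps the rest as the index.
partial = df.reset_index(level="foo")
```

Here flat has the columns instance, foo and bar, while partial keeps instance as its index.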

How do you use MultiIndex in pandas?

Creating a MultiIndex (hierarchical index) object: a MultiIndex can be created from a list of arrays (using MultiIndex.from_arrays()), an array of tuples (using MultiIndex.from_tuples()), a crossed set of iterables (using MultiIndex.from_product()), or a DataFrame (using MultiIndex.
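To make the first three constructors concrete, here is a short sketch that builds the same index three ways. The labels and the level names instance and foo are borrowed from the question and are only illustrative.

```python
import pandas as pd

# One array per level, aligned position by position.
from_arrays = pd.MultiIndex.from_arrays(
    [["first", "first", "first"], ["a", "b", "c"]],
    names=["instance", "foo"],
)

# One tuple per entry, each tuple holding a label for every level.
from_tuples = pd.MultiIndex.from_tuples(
    [("first", "a"), ("first", "b"), ("first", "c")],
    names=["instance", "foo"],
)

# The Cartesian product of the iterables, one iterable per level.
from_product = pd.MultiIndex.from_product(
    [["first"], ["a", "b", "c"]],
    names=["instance", "foo"],
)
```

All three calls should produce equal indexes here; which one is most convenient depends on the shape of the labels you already have.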

Try this:

    df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})
    columns = [('c', 'a'), ('c', 'b')]
    df.columns = pd.MultiIndex.from_tuples(columns)
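If the frame has many columns, typing every tuple by hand gets tedious. One possible variation, not part of the original answer, is to derive the tuples from the existing column labels with MultiIndex.from_product(); the outer label 'c' is kept from the answer above.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Cross the single new outer label 'c' with the existing column labels,
# giving ('c', 'a') and ('c', 'b') without listing each tuple by hand.
df.columns = pd.MultiIndex.from_product([["c"], df.columns])
```

The data is untouched; only the column labels change, so df[('c', 'b')] still returns the original b column.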

You can use concat. Give it a dictionary of dataframes where the key is the new column level you want to add.

    In [46]: d = {}

    In [47]: d['first_level'] = pd.DataFrame(columns=['idx', 'a', 'b', 'c'],
                                             data=[[10, 0.89, 0.98, 0.31],
                                                   [20, 0.34, 0.78, 0.34]]).set_index('idx')

    In [48]: pd.concat(d, axis=1)
    Out[48]:
        first_level
                  a     b     c
    idx
    10         0.89  0.98  0.31
    20         0.34  0.78  0.34

You can use the same technique to add more entries under the new level.

    In [49]: d['second_level'] = pd.DataFrame(columns=['idx', 'a', 'b', 'c'],
                                              data=[[10, 0.29, 0.63, 0.99],
                                                    [20, 0.23, 0.26, 0.98]]).set_index('idx')

    In [50]: pd.concat(d, axis=1)
    Out[50]:
        first_level             second_level
                  a     b     c            a     b     c
    idx
    10         0.89  0.98  0.31         0.29  0.63  0.99
    20         0.34  0.78  0.34         0.23  0.26  0.98
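Once the frames are combined like this, the outer level can be used to pull an instance back out. The sketch below rebuilds the same two frames, passes keys= to pd.concat instead of a dictionary (an alternative spelling of the same idea), and names the column levels. The level names instance and foo are assumptions borrowed from the question, not part of this answer.

```python
import pandas as pd

idx = pd.Index([10, 20], name="idx")
first = pd.DataFrame(
    {"a": [0.89, 0.34], "b": [0.98, 0.78], "c": [0.31, 0.34]}, index=idx
)
second = pd.DataFrame(
    {"a": [0.29, 0.23], "b": [0.63, 0.26], "c": [0.99, 0.98]}, index=idx
)

# keys= plays the same role as the dictionary keys; names= labels the
# two resulting column levels.
wide = pd.concat(
    [first, second],
    axis=1,
    keys=["first_level", "second_level"],
    names=["instance", "foo"],
)

# Selecting on the outer level returns the columns of one instance.
back = wide["first_level"]

# xs() takes a cross-section on the inner level: column 'a' of every instance.
a_only = wide.xs("a", axis=1, level="foo")
```

Naming the levels is optional, but it makes level-based selections such as the xs() call easier to read.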

Tags:

python

pandas

LondonRob

2 Answers

user3377361

Carl
