pandas has support for multi-level column names:
>>> import numpy as np
>>> import pandas as pd
>>> x = pd.DataFrame({'instance':['first','first','first'],'foo':['a','b','c'],'bar':np.random.rand(3)})
>>> x = x.set_index(['instance','foo']).transpose()
>>> x.columns
MultiIndex
[(u'first', u'a'), (u'first', u'b'), (u'first', u'c')]
>>> x
instance     first
foo              a         b         c
bar       0.102885  0.937838  0.907467
This feature is very useful since it allows multiple versions of the same dataframe to be appended 'horizontally', with the first level of the column names (instance in my example) distinguishing the instances.
Imagine I already have a dataframe like this:
            a         b         c
bar  0.102885  0.937838  0.907467
Is there a nice way to add another level to the column names, similar to this for the row index:
x['instance'] = 'first'
x.set_index('instance', append=True)
A multi-index dataframe has multi-level, or hierarchical, indexing. The levels of a multi-level index can be converted into columns with the reset_index() method: DataFrame.reset_index() resets the index to the default integer index and turns the old index levels into columns of the dataframe.
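As a rough illustration of reset_index(), here is a minimal sketch. The frame and its values are made up for the example (only the names instance, foo and bar come from the question), and the level= argument is shown as one way to reset a single level rather than all of them.

```python
import pandas as pd

# Hypothetical frame with a two-level row index; the values are illustrative only.
df = pd.DataFrame(
    {
        "instance": ["first", "first", "first"],
        "foo": ["a", "b", "c"],
        "bar": [0.1, 0.2, 0.3],
    }
).set_index(["instance", "foo"])

# reset_index() moves every index level back into ordinary columns
# and replaces the index with the default integer index.
flat = df.reset_index()

# Passing level= moves only the named level into a column;
# the remaining level stays as the index.
partial = df.reset_index(level="instance")

print(flat)
print(partial)
```

Note that this works on the row index. To apply it to column levels, the frame would have to be transposed first.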
Creating a MultiIndex (hierarchical index) object
A MultiIndex can be created from a list of arrays (using MultiIndex.from_arrays()), an array of tuples (using MultiIndex.from_tuples()), a crossed set of iterables (using MultiIndex.from_product()), or a DataFrame (using MultiIndex.
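The first three constructors named above can be compared side by side. This is only a sketch: the labels and the level names instance and foo are borrowed from the question for illustration.

```python
import pandas as pd

# from_arrays: one array per level, all of the same length.
arrays = [["first", "first", "second"], ["a", "b", "a"]]
mi_arrays = pd.MultiIndex.from_arrays(arrays, names=["instance", "foo"])

# from_tuples: one tuple per entry, one element per level.
tuples = [("first", "a"), ("first", "b"), ("second", "a")]
mi_tuples = pd.MultiIndex.from_tuples(tuples, names=["instance", "foo"])

# from_product: every combination of the given iterables (Cartesian product).
mi_product = pd.MultiIndex.from_product(
    [["first", "second"], ["a", "b"]], names=["instance", "foo"]
)

print(mi_arrays.equals(mi_tuples))  # same labels built two different ways
print(list(mi_product))
```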
Try this:
df = pd.DataFrame({'a':[1,2,3],'b':[4,5,6]})
columns = [('c','a'),('c','b')]
df.columns = pd.MultiIndex.from_tuples(columns)
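If writing the tuples out by hand is inconvenient, one possible variation is to build the new columns from the existing ones with MultiIndex.from_product(). The sketch below assumes the single-row frame from the question and picks the level names instance and foo for illustration.

```python
import pandas as pd

# The frame from the question, rebuilt with its printed values.
x = pd.DataFrame(
    [[0.102885, 0.937838, 0.907467]],
    index=["bar"],
    columns=["a", "b", "c"],
)

# Pair the single new label 'first' with every existing column label.
x.columns = pd.MultiIndex.from_product(
    [["first"], x.columns], names=["instance", "foo"]
)

print(x)
print(x.loc["bar", ("first", "b")])
```

This modifies x in place by assigning to x.columns. Work on a copy if the original frame is still needed.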
You can use concat. Give it a dictionary of dataframes where the key is the new column level you want to add.
In [46]: d = {}
In [47]: d['first_level'] = pd.DataFrame(columns=['idx', 'a', 'b', 'c'], data=[[10, 0.89, 0.98, 0.31], [20, 0.34, 0.78, 0.34]]).set_index('idx')
In [48]: pd.concat(d, axis=1)
Out[48]:
    first_level
              a     b     c
idx
10         0.89  0.98  0.31
20         0.34  0.78  0.34
You can use the same technique to add more groups under the new top level.
In [49]: d['second_level'] = pd.DataFrame(columns=['idx', 'a', 'b', 'c'], data=[[10, 0.29, 0.63, 0.99], [20, 0.23, 0.26, 0.98]]).set_index('idx')
In [50]: pd.concat(d, axis=1)
Out[50]:
    first_level             second_level
              a     b     c            a     b     c
idx
10         0.89  0.98  0.31         0.29  0.63  0.99
20         0.34  0.78  0.34         0.23  0.26  0.98
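Applied to the frame from the question, the concat approach might look like the sketch below. The second instance (x * 2) is invented purely to have something to place next to the first, and names= is used to label the new level instance.

```python
import pandas as pd

# The frame from the question, rebuilt with its printed values.
x = pd.DataFrame(
    [[0.102885, 0.937838, 0.907467]],
    index=["bar"],
    columns=["a", "b", "c"],
)

# Dictionary keys become the outer column level; names= labels that level.
# The 'second' frame is a made-up stand-in for another version of x.
out = pd.concat({"first": x, "second": x * 2}, axis=1, names=["instance"])

# Selecting a key from the outer level gives back the original columns.
first_again = out["first"]

print(out)
print(first_again)
```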