
Concatenate Pandas columns under new multi-index level

Given a dictionary of data frames like:

dict = {'ABC': df1, 'XYZ' : df2}   # of any length... 

where each data frame has the same columns and similar index, for example:

data           Open     High      Low    Close   Volume
Date
2002-01-17  0.18077  0.18800  0.16993  0.18439  1720833
2002-01-18  0.18439  0.21331  0.18077  0.19523  2027866
2002-01-21  0.19523  0.20970  0.19162  0.20608   771149

What is the simplest way to combine all the data frames into one, with a multi-index like:

symbol         ABC                                       XYZ
data           Open     High      Low    Close   Volume  Open ...
Date
2002-01-17  0.18077  0.18800  0.16993  0.18439  1720833  ...
2002-01-18  0.18439  0.21331  0.18077  0.19523  2027866  ...
2002-01-21  0.19523  0.20970  0.19162  0.20608   771149  ...

I've tried a few methods, e.g. replacing each data frame's columns with a multi-index built by .from_product(['ABC', columns]) and then concatenating along axis=1, without success.
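One likely reason the from_product attempt fails: every element passed to MultiIndex.from_product must itself be list-like, so the symbol has to be wrapped in its own list ([['ABC'], columns] rather than ['ABC', columns]). A minimal sketch of that approach, using small hypothetical frames rather than the real data:

```python
import pandas as pd

# Hypothetical sample frames; the values are illustrative only.
df1 = pd.DataFrame({"Open": [1.0, 2.0], "Close": [1.5, 2.5]})
df2 = pd.DataFrame({"Open": [3.0, 4.0], "Close": [3.5, 4.5]})
frames = {"ABC": df1, "XYZ": df2}

relabelled = []
for symbol, frame in frames.items():
    frame = frame.copy()  # leave the original frame untouched
    # Note the inner list: [symbol], not the bare string symbol.
    frame.columns = pd.MultiIndex.from_product([[symbol], frame.columns])
    relabelled.append(frame)

# Each frame now carries two column levels, so a plain concat lines them up.
combined = pd.concat(relabelled, axis=1)
print(combined.columns.tolist())
```

This works, but it is more code than letting concat build the outer level itself.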

Zero asked May 12 '14 03:05





1 Answer

You can do it with concat (the keys argument will create the hierarchical columns index):

d = {'ABC' : df1, 'XYZ' : df2}
# list() turns the dict views into plain sequences on Python 3
print(pd.concat(list(d.values()), axis=1, keys=list(d.keys())))

                XYZ                                          ABC           \
               Open     High      Low    Close   Volume     Open     High
Date
2002-01-17  0.18077  0.18800  0.16993  0.18439  1720833  0.18077  0.18800
2002-01-18  0.18439  0.21331  0.18077  0.19523  2027866  0.18439  0.21331
2002-01-21  0.19523  0.20970  0.19162  0.20608   771149  0.19523  0.20970

                Low    Close   Volume
Date
2002-01-17  0.16993  0.18439  1720833
2002-01-18  0.18077  0.19523  2027866
2002-01-21  0.19162  0.20608   771149

The outer column order follows the dictionary's iteration order, which is why XYZ appears before ABC in this output.

concat really expects a list of frames, so the following is equivalent:

print(pd.concat([df1, df2], axis=1, keys=['ABC', 'XYZ'])) 
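For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds df1 from the values shown in the question and, purely as an assumption for illustration, uses a copy of it as df2. The level names "symbol" and "data" are taken from the desired output in the question.

```python
import pandas as pd

# Rebuild the sample frame from the values shown in the question.
idx = pd.DatetimeIndex(["2002-01-17", "2002-01-18", "2002-01-21"], name="Date")
df1 = pd.DataFrame(
    {
        "Open": [0.18077, 0.18439, 0.19523],
        "High": [0.18800, 0.21331, 0.20970],
        "Low": [0.16993, 0.18077, 0.19162],
        "Close": [0.18439, 0.19523, 0.20608],
        "Volume": [1720833, 2027866, 771149],
    },
    index=idx,
)
df2 = df1.copy()  # assumption: a second frame with identical shape

frames = {"ABC": df1, "XYZ": df2}

# Passing the dict directly uses its keys as the outer column level;
# names= labels the two resulting column levels.
combined = pd.concat(frames, axis=1, names=["symbol", "data"])

# Equivalent explicit form: a list of frames plus matching keys.
combined_list = pd.concat(
    list(frames.values()),
    axis=1,
    keys=list(frames.keys()),
    names=["symbol", "data"],
)

# Select one symbol's field, or one field across all symbols.
print(combined[("XYZ", "Close")])
print(combined.xs("Close", axis=1, level="data"))
```

Passing the dict itself saves keeping values() and keys() in sync by hand; the list form is handy when the frames are not already in a dict.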
Karl D. answered Sep 22 '22 18:09
