Given a dictionary of data frames like:
dict = {'ABC': df1, 'XYZ' : df2} # of any length...
where each data frame has the same columns and similar index, for example:
data           Open     High      Low    Close   Volume
Date
2002-01-17  0.18077  0.18800  0.16993  0.18439  1720833
2002-01-18  0.18439  0.21331  0.18077  0.19523  2027866
2002-01-21  0.19523  0.20970  0.19162  0.20608   771149
What is the simplest way to combine all the data frames into one, with a multi-index like:
symbol          ABC                                          XYZ
data           Open     High      Low    Close   Volume     Open ...
Date
2002-01-17  0.18077  0.18800  0.16993  0.18439  1720833      ...
2002-01-18  0.18439  0.21331  0.18077  0.19523  2027866      ...
2002-01-21  0.19523  0.20970  0.19162  0.20608   771149      ...
I've tried a few methods, e.g. for each data frame, replacing the columns with a multi-index like .from_product(['ABC', columns]) and then concatenating along axis=1, without success.
A multi-index DataFrame has multi-level, or hierarchical, indexing. The index levels can be converted back into columns with the reset_index() method: DataFrame.reset_index() resets the index to the default integer index and turns the old index levels into columns of the DataFrame.
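As a minimal sketch of what reset_index() does to a hierarchical index, the example below uses a small, made-up frame (the names prices, flat and by_date are illustrative only, not from the question):

```python
import pandas as pd

# Hypothetical two-level row index (symbol, Date), for illustration only.
idx = pd.MultiIndex.from_product(
    [["ABC", "XYZ"], ["2002-01-17", "2002-01-18"]],
    names=["symbol", "Date"],
)
prices = pd.DataFrame({"Close": [0.18439, 0.19523, 0.18439, 0.19523]}, index=idx)

# reset_index() moves every index level into ordinary columns
# and replaces the index with a default RangeIndex.
flat = prices.reset_index()

# Passing level= moves only the named level; the other level stays as the index.
by_date = prices.reset_index(level="symbol")
```

Here flat has the columns symbol, Date and Close, while by_date is still indexed by Date.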
pd.concat can join two or more DataFrames at once. With axis=1 it aligns them on the index, and it does a full outer join by default.
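To illustrate the default outer join, here is a small sketch with two made-up frames (left and right are hypothetical names) whose indexes only partly overlap:

```python
import pandas as pd

left = pd.DataFrame({"a": [1, 2]}, index=["x", "y"])
right = pd.DataFrame({"b": [3, 4]}, index=["y", "z"])

# Default join="outer": the result keeps the union of the index labels,
# and cells with no matching row are filled with NaN.
outer = pd.concat([left, right], axis=1)

# join="inner" keeps only the labels present in every frame.
inner = pd.concat([left, right], axis=1, join="inner")
```

For data frames with a "similar" but not identical index, as in the question, the outer join means dates missing from one symbol show up as NaN rather than being dropped.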
Creating a MultiIndex (hierarchical index) object: a MultiIndex can be created from a list of arrays (using MultiIndex.from_arrays()), an array of tuples (using MultiIndex.from_tuples()), a crossed set of iterables (using MultiIndex.from_product()), or a DataFrame (using MultiIndex.
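As a rough sketch of two of these constructors, the snippet below builds the column index for a single symbol. Note that from_product expects a list of iterables, so the lone symbol is wrapped in its own list. The level names symbol and data are taken from the desired output in the question; the variable names are illustrative.

```python
import pandas as pd

fields = ["Open", "High", "Low", "Close", "Volume"]

# from_product takes a list of iterables, one per level.
cols = pd.MultiIndex.from_product([["ABC"], fields], names=["symbol", "data"])

# The same index built from explicit (symbol, field) tuples.
same = pd.MultiIndex.from_tuples(
    [("ABC", f) for f in fields], names=["symbol", "data"]
)
```

Assigning such an index to each frame's columns and then concatenating along axis=1 is one way to get the layout asked for, though the keys argument of concat does the same thing in one step.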
You can do it with concat (the keys argument will create the hierarchical columns index):
d = {'ABC' : df1, 'XYZ' : df2}
# list() makes the dict views explicit lists under Python 3
print(pd.concat(list(d.values()), axis=1, keys=list(d.keys())))

Output (the top-level keys follow the dict's iteration order; this run came from a Python version with unordered dicts, so XYZ is listed first):

                XYZ                                          ABC           \
               Open     High      Low    Close   Volume     Open     High
Date
2002-01-17  0.18077  0.18800  0.16993  0.18439  1720833  0.18077  0.18800
2002-01-18  0.18439  0.21331  0.18077  0.19523  2027866  0.18439  0.21331
2002-01-21  0.19523  0.20970  0.19162  0.20608   771149  0.19523  0.20970

                Low    Close   Volume
Date
2002-01-17  0.16993  0.18439  1720833
2002-01-18  0.18077  0.19523  2027866
2002-01-21  0.19162  0.20608   771149
Really, concat wants lists, so the following is equivalent:
print(pd.concat([df1, df2], axis=1, keys=['ABC', 'XYZ']))
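For a self-contained version, here is a sketch that rebuilds the sample data from the question and combines it. It relies on two things not shown above, so treat them as additions: concat also accepts the dict directly (its keys become the outer column level), and the names argument labels the two column levels. df2 is just a copy of df1 standing in for the second symbol.

```python
import pandas as pd

index = pd.DatetimeIndex(["2002-01-17", "2002-01-18", "2002-01-21"], name="Date")
df1 = pd.DataFrame(
    {
        "Open": [0.18077, 0.18439, 0.19523],
        "High": [0.18800, 0.21331, 0.20970],
        "Low": [0.16993, 0.18077, 0.19162],
        "Close": [0.18439, 0.19523, 0.20608],
        "Volume": [1720833, 2027866, 771149],
    },
    index=index,
)
df2 = df1.copy()  # stand-in for the second symbol's data
d = {"ABC": df1, "XYZ": df2}

# Passing the dict itself lets concat use its keys as the outer column level;
# names= labels the outer and inner column levels.
combined = pd.concat(d, axis=1, names=["symbol", "data"])

# Select one symbol's columns, or one field across all symbols.
abc = combined["ABC"]
closes = combined.xs("Close", axis=1, level="data")
```

combined has ten columns under a two-level header (symbol, data), and closes has one Close column per symbol.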