Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Combine Pandas DataFrames while creating MultiIndex Columns

I have two DataFrames, something like this:

import pandas as pd

dates = pd.Index(['2016-10-03', '2016-10-04', '2016-10-05'], name='Date')

close = pd.DataFrame( {'AAPL': [112.52,  113., 113.05],
                       'CSCO': [  31.5, 31.35, 31.59 ],
                       'MSFT': [ 57.42, 57.24, 57.64 ] }, index = dates )

volume= pd.DataFrame( {'AAPL': [21701800, 29736800, 21453100] ,
                       'CSCO': [14070500, 18460400, 11808600] ,
                       'MSFT': [19189500, 20085900, 16726400] }, index = dates )

The output of DataFrame 'close' looks like this:

              AAPL   CSCO   MSFT
Date
2016-10-03  112.52  31.50  57.42
2016-10-04  113.00  31.35  57.24
2016-10-05  113.05  31.59  57.64

And the output of DataFrame 'volume' looks like this:

                AAPL      CSCO      MSFT
Date
2016-10-03  21701800  14070500  19189500
2016-10-04  29736800  18460400  20085900
2016-10-05  21453100  11808600  16726400

I would like to combine these two DataFrames into a single DataFrame with MultiIndex COLUMNS so that it looks like this:

              AAPL                CSCO                MSFT
             Close     Volume    Close     Volume    Close     Volume  
Date
2016-10-03  112.52   21701800    31.50   14070500    57.42   19189500
2016-10-04  113.00   29736800    31.35   18460400    57.24   20085900
2016-10-05  113.05   21453100    31.59   11808600    57.64   16726400

Can anyone give me an idea how to do that? I've been playing with pd.concat and pd.merge, but it's not clear to me how to get it to line up on the date index and allow me to provide names for the sub-index ('Close" and 'Volume') on the columns.

like image 877
Daniel Goldfarb Avatar asked Jan 01 '23 02:01

Daniel Goldfarb


1 Answers

You can use the keys kwarg of concat:

In [11]: res = pd.concat([close, volume], axis=1, keys=["close", "volume"])

In [12]: res
Out[12]:
             close                  volume
              AAPL   CSCO   MSFT      AAPL      CSCO      MSFT
Date
2016-10-03  112.52  31.50  57.42  21701800  14070500  19189500
2016-10-04  113.00  31.35  57.24  29736800  18460400  20085900
2016-10-05  113.05  31.59  57.64  21453100  11808600  16726400

With a little rearrangement:

In [13]: res.swaplevel(0, 1, axis=1).sort_index(axis=1)
Out[13]:
              AAPL             CSCO             MSFT
             close    volume  close    volume  close    volume
Date
2016-10-03  112.52  21701800  31.50  14070500  57.42  19189500
2016-10-04  113.00  29736800  31.35  18460400  57.24  20085900
2016-10-05  113.05  21453100  31.59  11808600  57.64  16726400
like image 98
Andy Hayden Avatar answered Jan 24 '23 21:01

Andy Hayden