Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to combine single and multiindex Pandas DataFrames

Tags:

python

pandas

I am trying to concatenate multiple Pandas DataFrames, some of which use multi-indexing and others use single indices. As an example, let's consider the following single indexed dataframe:

> import pandas as pd
> df1 = pd.DataFrame({'single': [10,11,12]})
> df1

   single
0      10
1      11
2      12

Along with a multiindex dataframe:

> level_dict = {}
> level_dict[('level 1','a','h')] = [1,2,3]
> level_dict[('level 1','b','j')] = [5,6,7]
> level_dict[('level 2','c','k')] = [10, 11, 12]
> level_dict[('level 2','d','l')] = [20, 21, 22]
> df2 = pd.DataFrame(level_dict)
> df2

  level 1    level 2    
        a  b       c   d
        h  j       k   l
0       1  5      10  20
1       2  6      11  21
2       3  7      12  22

Now I wish to concatenate the two dataframes. When I try to use concat it flattens the multiindex as follows:

> df3 = pd.concat([df2,df1], axis=1)
> df3

   (level 1, a, h)  (level 1, b, j)  (level 2, c, k)  (level 2, d, l)       single
0                1                5               10               20          10
1                2                6               11               21          11
2                3                7               12               22          12

If instead I append a single column to the multiindex dataframe df2 as follows:

> df2['single'] = [10,11,12]
> df2

  level 1    level 2     single
        a  b       c   d       
        h  j       k   l       
0       1  5      10  20     10
1       2  6      11  21     11
2       3  7      12  22     12

How can I instead generate this dataframe from df1 and df2 with concat, merge, or join?

like image 730
wbinventor Avatar asked Sep 07 '15 22:09

wbinventor


People also ask

Can you combine two DataFrames in pandas?

Pandas provides a huge range of methods and functions to manipulate data, including merging DataFrames. Merging DataFrames allows you to both create a new DataFrame without modifying the original data source or alter the original data source.

How do I merge two objects in pandas?

Pandas DataFrame merge() function is used to merge two DataFrame objects with a database-style join operation. The joining is performed on columns or indexes. If the joining is done on columns, indexes are ignored. This function returns a new DataFrame and the source DataFrame objects are unchanged.


1 Answers

I don't think you can avoid converting the single index into a MultiIndex. This is probably the easiest way, you could also convert after joining.

In [48]: df1.columns = pd.MultiIndex.from_tuples([(c, '', '') for c in df1])

In [49]: pd.concat([df2, df1], axis=1)
Out[49]: 
  level 1    level 2     single
        a  b       c   d       
        h  j       k   l       
0       1  5      10  20     10
1       2  6      11  21     11
2       3  7      12  22     12
like image 113
chrisb Avatar answered Sep 28 '22 04:09

chrisb