
Convert column suffixes from pandas join into a MultiIndex

Tags:

python

pandas

I have two pandas DataFrames with (not necessarily) identical index and column names.

>>> df_L = pd.DataFrame({'X': [1, 3], 
                         'Y': [5, 7]})

>>> df_R = pd.DataFrame({'X': [2, 4], 
                         'Y': [6, 8]})

I can join them together and assign suffixes.

>>> df_L.join(df_R, lsuffix='_L', rsuffix='_R')

    X_L Y_L X_R Y_R
0   1   5   2   6
1   3   7   4   8

But what I want is to make 'L' and 'R' sub-columns under both 'X' and 'Y'.

The desired DataFrame looks like this:

>>> pd.DataFrame(columns=pd.MultiIndex.from_product([['X', 'Y'], ['L', 'R']]), 
         data=[[1, 2, 5, 6],
               [3, 4, 7, 8]])

    X       Y
    L   R   L   R
0   1   2   5   6
1   3   4   7   8

Is there a way I can combine the two original DataFrames to get this desired DataFrame?

Vermillion asked Nov 02 '18 20:11


1 Answer

You can use pd.concat with the keys argument along the column axis (axis=1), then swap the two column levels and sort the columns:

df = (pd.concat([df_L, df_R], keys=['L', 'R'], axis=1)
        .swaplevel(0, 1, axis=1)
        .sort_index(level=0, axis=1))

>>> df
   X     Y   
   L  R  L  R
0  1  2  5  6
1  3  4  7  8
sacuL answered Sep 24 '22 15:09
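
For readers who want to see what each step of the one-liner above does, here is a minimal sketch that breaks it into stages using the question's df_L and df_R. The names step1, step2, result, joined and alt are made up for illustration. The second half is not part of the answer: it is an alternative that, assuming the suffixes are separated by a single trailing underscore, splits the already-joined column names into a MultiIndex with Index.str.rsplit(..., expand=True).

```python
import pandas as pd

df_L = pd.DataFrame({'X': [1, 3], 'Y': [5, 7]})
df_R = pd.DataFrame({'X': [2, 4], 'Y': [6, 8]})

# Step 1: concatenate side by side; `keys` becomes the OUTER column level.
# Columns are now (L, X), (L, Y), (R, X), (R, Y).
step1 = pd.concat([df_L, df_R], keys=['L', 'R'], axis=1)

# Step 2: swap the two column levels so X/Y is the outer level.
# Columns are now (X, L), (Y, L), (X, R), (Y, R).
step2 = step1.swaplevel(0, 1, axis=1)

# Step 3: sort the columns so each L/R pair sits under its X or Y parent.
result = step2.sort_index(level=0, axis=1)
print(result)

# Alternative (an assumption, not from the answer): start from the joined
# frame and split each name on its last underscore into two levels.
joined = df_L.join(df_R, lsuffix='_L', rsuffix='_R')
joined.columns = joined.columns.str.rsplit('_', n=1, expand=True)
alt = joined.sort_index(axis=1)
print(alt)
```

Both routes should end with the same column layout. The concat route avoids parsing strings, so it is likely the safer choice when the original column names may themselves contain underscores.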