I have two pandas DataFrames whose index and column names may, but need not, be identical.
>>> import pandas as pd
>>> df_L = pd.DataFrame({'X': [1, 3],
...                      'Y': [5, 7]})
>>> df_R = pd.DataFrame({'X': [2, 4],
...                      'Y': [6, 8]})
I can join them together and assign suffixes.
>>> df_L.join(df_R, lsuffix='_L', rsuffix='_R')
   X_L  Y_L  X_R  Y_R
0    1    5    2    6
1    3    7    4    8
But what I want is to make 'L' and 'R' sub-columns under both 'X' and 'Y'.
The desired DataFrame looks like this:
>>> pd.DataFrame(columns=pd.MultiIndex.from_product([['X', 'Y'], ['L', 'R']]),
...              data=[[1, 2, 5, 6],
...                    [3, 4, 7, 8]])
   X     Y
   L  R  L  R
0  1  2  5  6
1  3  4  7  8
Is there a way I can combine the two original DataFrames to get this desired DataFrame?
We can join columns from two DataFrames using the merge() function, which is similar to a SQL join. You specify the type of join you want with the how parameter (for example 'inner', 'outer', 'left', or 'right').
Pandas join vs. merge: the main difference is that join() combines two DataFrames on their index by default, whereas merge() is primarily used to join on columns you specify, though it also supports joining on indexes and on a combination of index and columns.
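To make the join/merge distinction concrete, here is a minimal sketch. The frames `left` and `right` and the column name `key` are made up for illustration and are not part of the question.

```python
import pandas as pd

# Hypothetical frames that share a 'key' column (illustration only).
left = pd.DataFrame({'key': ['a', 'b'], 'X': [1, 3]})
right = pd.DataFrame({'key': ['a', 'b'], 'Y': [6, 8]})

# merge() matches rows on the values of a column; 'how' picks the join type.
merged = left.merge(right, on='key', how='inner')

# join() matches rows on the index, so 'key' is moved into the index first.
joined = left.set_index('key').join(right.set_index('key'), how='inner')

print(merged)
print(joined)
```

Both calls pair up the same rows here; the difference is whether the match is made on a column (merge) or on the index (join).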
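You can use pd.concat with the keys argument along the columns (axis=1), then swap the two column levels and sort them so that 'L' and 'R' sit under each of 'X' and 'Y':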
df = pd.concat([df_L, df_R], keys=['L','R'],axis=1).swaplevel(0,1,axis=1).sort_index(level=0, axis=1)
>>> df
   X     Y
   L  R  L  R
0  1  2  5  6
1  3  4  7  8
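For readers who want to see what each call in that chain contributes, the one-liner can be unrolled as in the sketch below. The intermediate names `step1` and `step2` are only for illustration.

```python
import pandas as pd

df_L = pd.DataFrame({'X': [1, 3], 'Y': [5, 7]})
df_R = pd.DataFrame({'X': [2, 4], 'Y': [6, 8]})

# Step 1: concatenate side by side; the keys become the outer column level.
# Columns: (L, X), (L, Y), (R, X), (R, Y)
step1 = pd.concat([df_L, df_R], keys=['L', 'R'], axis=1)

# Step 2: swap the two column levels so 'X'/'Y' is the outer level.
# Columns: (X, L), (Y, L), (X, R), (Y, R)
step2 = step1.swaplevel(0, 1, axis=1)

# Step 3: sort the columns so each outer label groups its sub-columns.
# Columns: (X, L), (X, R), (Y, L), (Y, R)
result = step2.sort_index(level=0, axis=1)

print(result)
```

The sort in the last step is what places 'L' and 'R' next to each other under each outer label; without it the columns keep the interleaved order from step 2.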