I have two DataFrames with different columns, and I would like to merge them by aligning their rows. Say I have these two DataFrames:
df1 = pd.DataFrame(np.arange(12).reshape(6, 2), index=np.arange(6)*0.1, columns=['a', 'b'])
df1
a b
0.0 0 1
0.1 2 3
0.2 4 5
0.3 6 7
0.4 8 9
0.5 10 11
df2 = pd.DataFrame(np.arange(8).reshape(4, 2), index=[0.07, 0.21, 0.43, 0.54], columns=['c', 'd'])
df2
c d
0.07 0 1
0.21 2 3
0.43 4 5
0.54 6 7
I want to merge df2 with df1 such that each row of df2 is aligned with the nearest-neighbor index from df1. The end result would be:
a b c d
0.0 0 1 NaN NaN
0.1 2 3 0 1
0.2 4 5 2 3
0.3 6 7 NaN NaN
0.4 8 9 4 5
0.5 10 11 6 7
I appreciate any ideas on how to tackle this efficiently.
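Since you want the closest index, you can relabel each row of df2 with the nearest label in df1.index and then concatenate along the columns:
df2.index = [min(df1.index, key=lambda x: abs(x - y)) for y in df2.index]
pd.concat([df1, df2], axis=1)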
Out[535]:
a b c d
0.0 0 1 NaN NaN
0.1 2 3 0.0 1.0
0.2 4 5 2.0 3.0
0.3 6 7 NaN NaN
0.4 8 9 4.0 5.0
0.5 10 11 6.0 7.0
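The list comprehension above compares every df2 label against every df1 label, which may get slow for large frames. A possible vectorized variant, assuming df1.index is sorted and has no duplicates, is to let pandas do the nearest lookup with `Index.get_indexer(..., method='nearest')`. The names `pos`, `aligned`, and `result` are only illustrative, and the sketch works on a copy so df2 itself is left untouched:

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.arange(12).reshape(6, 2), index=np.arange(6) * 0.1, columns=['a', 'b'])
df2 = pd.DataFrame(np.arange(8).reshape(4, 2), index=[0.07, 0.21, 0.43, 0.54], columns=['c', 'd'])

# Position in df1.index of the label nearest to each df2 label
# (assumes df1.index is monotonic and unique).
pos = df1.index.get_indexer(df2.index, method='nearest')

# Relabel a copy of df2 with the matching df1 labels.
aligned = df2.copy()
aligned.index = df1.index[pos]

# Labels are taken from df1.index itself, so they match exactly on concat.
result = pd.concat([df1, aligned], axis=1)
```

If two df2 rows map to the same df1 label, the relabeled index contains duplicates and the concat step would need extra handling (for example, dropping or aggregating the duplicates first).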
I would temporarily redefine df2's index to be the rounded version of its actual index:
merged = (
    df2.assign(idx=np.round(df2.index, 1))  # compute the rounded index
       .reset_index(drop=True)              # drop the existing index
       .set_index('idx')                    # new, rounded index
       .join(df1, how='right')              # right join
       .sort_index(axis='columns')          # sort the columns
)
And I get:
a b c d
0.0 0 1 NaN NaN
0.1 2 3 0.0 1.0
0.2 4 5 2.0 3.0
0.3 6 7 NaN NaN
0.4 8 9 4.0 5.0
0.5 10 11 6.0 7.0
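Rounding works here because df1's index sits on a 0.1 grid and the rounded labels happen to equal df1's labels exactly; a label such as 0.3 would not match, since 3 * 0.1 is stored as 0.30000000000000004. An alternative that avoids exact float matching is `pd.merge_asof` with `direction='nearest'`. This is a sketch rather than a drop-in equivalent: it looks up the nearest df2 row for each df1 row (the reverse direction), so a tolerance is needed to leave distant rows as NaN. The value 0.05 (half the grid spacing) is an assumption chosen for this example, and both indexes must be sorted:

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.arange(12).reshape(6, 2), index=np.arange(6) * 0.1, columns=['a', 'b'])
df2 = pd.DataFrame(np.arange(8).reshape(4, 2), index=[0.07, 0.21, 0.43, 0.54], columns=['c', 'd'])

# For each row of df1, take the nearest row of df2 by index,
# but only if it lies within the tolerance; otherwise fill with NaN.
result = pd.merge_asof(
    df1, df2,
    left_index=True, right_index=True,
    direction='nearest',
    tolerance=0.05,  # assumed: half of df1's 0.1 spacing
)
```

Note that with this approach a single df2 row can be attached to more than one df1 row if it falls within the tolerance of both, so the tolerance should be picked with the index spacing in mind.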