I have two data frames:
df1
col1 col2
8 A
12 C
20 D
df2
col1 col3
7 F
15 G
I want to merge these two data frames on col1 so that each row of df1 is paired with the row of df2 whose col1 value is closest.
The final data frame should look like this:
df
col1 col2 col3
8 A F
12 C G
20 D NA
I can do this with a for loop that compares the numbers, but the execution time would be huge.
Is there a Pythonic way to do this that reduces the runtime? Perhaps some pandas shortcut?
The concat() function concatenates two DataFrames by appending the rows of one to the other. The merge() function is the equivalent of the SQL JOIN clause; it supports 'left', 'right', 'inner', and 'outer' joins.
Using concat() to merge two DataFrames by index: pass axis=1 to pandas.concat() to place two DataFrames side by side, aligned on their index. By default, pd.concat() works along axis=0 (stacking rows) and uses an outer join on the other axis.
Pandas DataFrame merge() function is used to merge two DataFrame objects with a database-style join operation. The joining is performed on columns or indexes. If the joining is done on columns, indexes are ignored. This function returns a new DataFrame and the source DataFrame objects are unchanged.
Pass the two DataFrames to pd.concat() as a list and specify the axis along which to concatenate: axis=0 stacks them row-wise, axis=1 places them side by side column-wise.
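To make the snippets above concrete, here is a minimal sketch using the question's data (the variable names left, right, stacked, side_by_side and exact are made up for illustration). It suggests why neither concat() nor a plain merge() answers the question: both align on position or on exact key equality, not on the nearest value.

```python
import pandas as pd

# Sample frames copied from the question (names "left"/"right" are illustrative).
left = pd.DataFrame({"col1": [8, 12, 20], "col2": ["A", "C", "D"]})
right = pd.DataFrame({"col1": [7, 15], "col3": ["F", "G"]})

# axis=0: rows of "right" are appended below "left"; columns are the union.
stacked = pd.concat([left, right], axis=0)

# axis=1: frames are placed side by side, aligned on the index (0, 1, 2).
side_by_side = pd.concat([left, right], axis=1)

# Exact-match left join on col1: no key in "right" equals 8, 12 or 20,
# so col3 is missing in every row.
exact = pd.merge(left, right, on="col1", how="left")

print(stacked.shape)
print(side_by_side.shape)
print(exact)
```

Because merge() only matches equal keys, a nearest-value match needs a different tool.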
Use merge_asof with direction='nearest' and the tolerance parameter:
import pandas as pd
# Both frames must be sorted by col1; rows with no df2 value within 3 get NaN
df = pd.merge_asof(df1, df2, on='col1', direction='nearest', tolerance=3)
print(df)
   col1 col2 col3
0     8    A    F
1    12    C    G
2    20    D  NaN
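For completeness, here is a self-contained sketch that rebuilds the question's frames and runs the same call. The explicit sort_values step and the names result and no_tol are additions for illustration; merge_asof requires both frames to be sorted on the key, which the sample data already is.

```python
import pandas as pd

# Frames reconstructed from the question.
df1 = pd.DataFrame({"col1": [8, 12, 20], "col2": ["A", "C", "D"]})
df2 = pd.DataFrame({"col1": [7, 15], "col3": ["F", "G"]})

# merge_asof needs both inputs sorted by the "on" column.
df1 = df1.sort_values("col1")
df2 = df2.sort_values("col1")

# Nearest match, but only if the values differ by at most 3.
result = pd.merge_asof(df1, df2, on="col1", direction="nearest", tolerance=3)

# Without tolerance, every left row gets its nearest right row,
# so 20 would be paired with 15 (G) instead of being left empty.
no_tol = pd.merge_asof(df1, df2, on="col1", direction="nearest")

print(result)
print(no_tol)
```

Note that merge_asof is a left join: every row of df1 is kept, and the same df2 row can be matched to more than one df1 row. If each df2 row must be used at most once, some extra de-duplication step would be needed.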