Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

keep both merging keys after pandas.merge_asof

I have found this nice function pandas.merge_asof. From the documentation

pandas.merge_asof(left, right, on=None, left_on=None, right_on=None)

Parameters: 

left : DataFrame
right : DataFrame
on : label

Field name to join on. Must be found in both DataFrames.
The data MUST be ordered. 
Furthermore this must be a numeric column,such as datetimelike, integer, or float. 
On or left_on/right_on must be given.

and it works as expected.

However, my merged dataframe keeps as columns on only the one that originally was in left. I would need to keep them both, so to have

   mydf=pandas.merge_asof(left, right, on='Time')

and mydf to contain both Time from left and right

Example data:

a=pd.DataFrame(data=pd.date_range('20100201', periods=100, freq='6h3min'),columns=['Time'])
b=pd.DataFrame(data=
                  pd.date_range('20100201', periods=24, freq='1h'),columns=['Time'])
b['val']=range(b.shape[0])
out=pd.merge_asof(a,b,on='Time',direction='forward',tolerance=pd.Timedelta('30min'))
like image 740
00__00__00 Avatar asked May 04 '18 06:05

00__00__00


People also ask

How do I keep index in merge pandas?

How to Keep index when using Pandas Merge. By default, Pandas merge creates a new integer index for the merged DataFrame. If we wanted to preserve the index from the first DataFrame as the index of the merged DataFrame, we can specify the index explicitly using . set_axis(df1.

Does pandas merge preserve order?

Answer. Yes. Order of the merged dataframes will effect the order of the rows and columns of the merged dataframe. When using the merge() method, it will preserve the order of the left keys.

What is an ASOF merge?

Perform an asof merge. This is similar to a left-join except that we match on nearest key rather than equal keys. Both DataFrames must be sorted by the key.

What is difference between joining and merging in pandas DataFrame?

Pandas Join vs Merge Differences The main difference between join vs merge would be; join() is used to combine two DataFrames on the index but not on columns whereas merge() is primarily used to specify the columns you wanted to join on, this also supports joining on indexes and combination of index and columns.


1 Answers

I think one possible solution is rename columns:

out = pd.merge_asof(a.rename(columns={'Time':'Time1'}), 
                    b.rename(columns={'Time':'Time2'}), 
                    left_on='Time1',
                    right_on='Time2',
                    direction='forward',
                    tolerance=pd.Timedelta('30min'))

print (out.head())
                Time1      Time2  val
0 2010-02-01 00:00:00 2010-02-01  0.0
1 2010-02-01 06:03:00        NaT  NaN
2 2010-02-01 12:06:00        NaT  NaN
3 2010-02-01 18:09:00        NaT  NaN
4 2010-02-02 00:12:00        NaT  NaN
like image 156
jezrael Avatar answered Sep 28 '22 12:09

jezrael