I have 2 dataframe and I want to join them on the basis of month and year from a date without creating extra columns:
example :
df1 :
date_1 value_1
2017-1-15 20
2017-1-31 30
2016-2-15 20
df2:
date_2 value_2
2017-1-1 30
then get result as below i.e. merged on the basis of month and year.
date_1 value_1 value_2
2017-1-15 20 30
2017-1-31 30 30
2016-2-15 20
Here's a rudimentary but effective solution:
res = pd.merge(df1.assign(grouper=df1['date_1'].dt.to_period('M')),
df2.assign(grouper=df2['date_2'].dt.to_period('M')),
how='left', on='grouper')
print(res)
date_1 value_1 grouper date_2 value_2
0 2017-01-15 20 2017-01 2017-01-01 30.0
1 2017-01-31 30 2017-01 2017-01-01 30.0
2 2016-02-15 20 2016-02 NaT NaN
You can then remove unwanted columns:
res = res[['date_1', 'value_1', 'value_2']]
Here is another way using lambda functions:
pd.merge(df,df2, left_on=df['date_1'].apply(lambda x: (x.year, x.month)),
right_on=df2['date_2'].apply(lambda y: (y.year, y.month)),
how='outer')[['date_1','value_1','value_2']]
date_1 value_1 value_2
0 2017-01-15 20 30.0
1 2017-01-31 30 30.0
2 2016-02-15 20 NaN
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With