Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

dplyr - join two dataframes by date range condition

I'm trying to join two df by two conditions in R, the conditions are

  • Same country code
  • date between a range of dates

DF1

CountryCode EventDate
NL 2023-02-17
DE 2022-08-17
CountryCode FreightCost ValidFrom ValidUntil
NL 10 2021-07-01 2022-07-31
NL 12 2021-08-01 2025-12-31
DE 8 2021-08-01 2022-12-31
DE 11 2023-01-01 2025-12-31

I'm having issues using merge or a join function from dplyr

The desire output I would like to have is

CountryCode EventDate FreightCost
NL 2023-02-17 12
DE 2022-08-17 8

I have tried this, but unable to find a solution

DF_final <- df1 %>% 
  left_join(
    df2 ,
    by = join_by(
      CountryCode= CountryCode,
      between(EventDate,
              ValidFrom,
              ValidUntil)
    )
)

Thanks for any help!

like image 689
learnct Avatar asked Jun 14 '26 06:06

learnct


1 Answers

The error comes from your first variable in by. You should either use == or state it once since they have the same name in both data sets:

df1 %>% 
  left_join(
    df2,
    by = join_by(
      CountryCode,
      between(EventDate,
              ValidFrom,
              ValidUntil)
    )
  )
like image 119
Maël Avatar answered Jun 17 '26 11:06

Maël



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!