Join tables by date range [duplicate]

I am looking for a simple method to join two tables by date range. One table contains an exact date; the other contains two variables marking the beginning and end of a time period. I need to join the tables when the date in the first table falls within the range from the second table.

data1 <- data.table(date = c('2010-01-21', '2010-01-25', '2010-02-02', '2010-02-09'),
                name = c('id1','id2','id3','id4'))


data2 <- data.table(beginning=c('2010-01-15', '2010-01-23', '2010-01-30', '2010-02-05'), 
                ending = c('2010-01-22','2010-01-29','2010-02-04','2010-02-13'),
                class = c(1,2,3,4))

result <- data.table(date = c('2010-01-21', '2010-01-25', '2010-02-02', '2010-02-09'),
                 beginning=c('2010-01-15', '2010-01-23', '2010-01-30', '2010-02-05'), 
                 ending = c('2010-01-22','2010-01-29','2010-02-04','2010-02-13'),
                 name = c('id1','id2','id3','id4'),
                 class = c(1,2,3,4))

Any help, please? I found a few complicated examples, but they don't even work on my data because of the date formats. I need something like:

select * from data1
left join
select * from data2
where data2.beginning <= data1.date <= data2.ending

Thanks

Residium asked May 30 '14 16:05




1 Answer

I know the following looks horrible in base R, but here's what I came up with. It's better to use the 'sqldf' package (see below).

library(data.table)
data1 <- data.table(date = c('2010-01-21', '2010-01-25', '2010-02-02', '2010-02-09'),
                    name = c('id1','id2','id3','id4'))


data2 <- data.table(beginning=c('2010-01-15', '2010-01-23', '2010-01-30', '2010-02-05'), 
                    ending = c('2010-01-22','2010-01-29','2010-02-04','2010-02-13'),
                    class = c(1,2,3,4))

# For each date in data1, find the row of data2 whose range contains it
# (NA if no range matches; the first match is used if ranges overlap).
# Comparing the character dates works only because they are in ISO yyyy-mm-dd format.
idx <- vapply(data1$date, function(d) {
  hit <- which(data2$beginning <= d & d <= data2$ending)
  if (length(hit)) hit[1] else NA_integer_
}, integer(1), USE.NAMES = FALSE)

result <- cbind(data1, data2[idx])
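If the ranges in data2 never overlap, the same lookup can be done without a loop using base R's findInterval(). The following is only a sketch under that assumption: it converts the date columns to Date first, and the helper names idx and matched are made up for illustration.

```r
# Sample data from the question, with the date columns converted to Date.
data1 <- data.frame(date = as.Date(c('2010-01-21', '2010-01-25', '2010-02-02', '2010-02-09')),
                    name = c('id1', 'id2', 'id3', 'id4'))
data2 <- data.frame(beginning = as.Date(c('2010-01-15', '2010-01-23', '2010-01-30', '2010-02-05')),
                    ending = as.Date(c('2010-01-22', '2010-01-29', '2010-02-04', '2010-02-13')),
                    class = c(1, 2, 3, 4))

# findInterval() needs the range starts in increasing order.
data2 <- data2[order(data2$beginning), ]

# Index of the last range starting on or before each date (0 if none does).
idx <- findInterval(as.numeric(data1$date), as.numeric(data2$beginning))
idx[idx == 0] <- NA

# Discard candidates whose range ended before the date (assumes non-overlapping ranges).
idx[!is.na(idx) & data1$date > data2$ending[idx]] <- NA

# Rows of data2 aligned with data1; unmatched dates become rows of NA.
matched <- data2[idx, ]
rownames(matched) <- NULL
result <- cbind(data1, matched)
```

Dates that fall in no range keep their row in result with NA in the beginning, ending and class columns, which mirrors a left join on data1.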

Using the package sqldf:

library(sqldf)
result = sqldf("select * from data1
                left join data2
                on data1.date between data2.beginning and data2.ending")

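Using data.table (1.9.8 or later, which added non-equi joins), and with the date columns converted from character to Date first, this is simply: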

data1[data2, on = .(date >= beginning, date <= ending)]
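One thing to watch with the one-liner: in X[Y, on = ...] the result follows the rows of Y (here data2), and the join columns in the output carry Y's values, so the original date is not shown as such. A sketch of a variant that keeps every row of data1 and all five columns from the desired result is below. It assumes data.table 1.9.8 or later and converts the dates with as.IDate(); the x. and i. prefixes in j pick columns from data2 and data1 respectively.

```r
library(data.table)

# Sample data from the question, with the date columns converted to IDate.
data1 <- data.table(date = as.IDate(c('2010-01-21', '2010-01-25', '2010-02-02', '2010-02-09')),
                    name = c('id1', 'id2', 'id3', 'id4'))
data2 <- data.table(beginning = as.IDate(c('2010-01-15', '2010-01-23', '2010-01-30', '2010-02-05')),
                    ending = as.IDate(c('2010-01-22', '2010-01-29', '2010-02-04', '2010-02-13')),
                    class = c(1, 2, 3, 4))

# data2[data1, ...] returns one row per row of data1 (a left join on data1).
# In j, i.* refers to data1's columns and x.* to data2's original columns.
result <- data2[data1,
                .(date = i.date, name = i.name,
                  beginning = x.beginning, ending = x.ending, class = x.class),
                on = .(beginning <= date, ending >= date)]
```

A date that matches no range still appears in result, with NA in beginning, ending and class; a date that matches several ranges appears once per match.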
nfmcclure answered Nov 30 '22 19:11
