I am trying to join two data.tables in R based on multiple keys (set with setkey), where both tables have repeated entries. As an example:
>DT1
ID state Month Day Year
1 IL Jan 3 2013
1 IL Jan 3 2014
1 IL Jan 3 2014
1 IL Jan 10 2014
1 IL Jan 11 2013
1 IL Jan 30 2013
1 IL Jan 30 2013
1 IL Feb 2 2013
1 IL Feb 2 2014
1 IL Feb 3 2013
1 IL Feb 3 2014
>DT2
state Month Day Year Tavg
IL Jan 1 2013 13
IL Jan 2 2013 19
IL Jan 3 2013 22
IL Jan 4 2013 23
IL Jan 5 2013 26
IL Jan 6 2013 24
IL Jan 7 2013 27
IL Jan 8 2013 32
IL Jan 9 2013 36
... ... .. ... ...
... ... .. ... ...
IL Dec 31 2013 33
I would like to add the "Tavg" values of DT2 to the corresponding dates in DT1. For example, all entries in DT1 that are on Jan 3 2013 need to have Tavg 22 in an additional column.
I tried the following
setkey(DT1, state, Month, Day, Year)
and the same for DT2, followed by a join operation:
DT1[DT2, nomatch=0, allow.cartesian=TRUE]
But it didn't work.
Just helped a friend with this (he couldn't find a good Stack Overflow answer) so I figured this question needed a more complete "toy" answer.
Here are a couple of simple data tables with one mismatched key:
library(data.table)
dt1 <- data.table(a = LETTERS[1:5], b = letters[1:5], c = 1:5)
dt2 <- data.table(c = LETTERS[c(1:4, 6)], b = letters[1:5], a = 6:10)
And here are several multiple-key merge options:
merge(dt1, dt2, by.x = c("a", "b"), by.y = c("c", "b"))             # inner join
merge(dt1, dt2, by.x = c("a", "b"), by.y = c("c", "b"), all = TRUE) # full outer join
setkey(dt1, a, b)
setkey(dt2, c, b)
dt2[dt1] # left join (if dt1 is the "left" table): keeps every row of dt1
dt1[dt2] # right join (if dt1 is the "left" table): keeps every row of dt2
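For the tables in the question, one option that avoids setkey altogether is an update join with the on= argument, which adds Tavg to DT1 by reference. This is only a sketch: the two tables below are small stand-ins built from a few of the rows shown in the question, and I'm assuming Day and Year are stored as integers in both tables (if the types differ between the tables, the join may fail or need a conversion first).

```r
library(data.table)

# Stand-in data: a few rows copied from the question, not the full tables
DT1 <- data.table(
  ID    = 1L,
  state = "IL",
  Month = c("Jan", "Jan", "Jan", "Jan"),
  Day   = c(3L, 3L, 3L, 10L),
  Year  = c(2013L, 2014L, 2014L, 2014L)
)
DT2 <- data.table(
  state = "IL",
  Month = "Jan",
  Day   = 1:9,
  Year  = 2013L,
  Tavg  = c(13, 19, 22, 23, 26, 24, 27, 32, 36)
)

# Update join: for each DT1 row, look up the matching DT2 row on all four
# columns and copy its Tavg (i.Tavg refers to the column from DT2).
# DT1 rows with no match in DT2 get NA; repeated DT1 rows are all kept.
DT1[DT2, on = .(state, Month, Day, Year), Tavg := i.Tavg]

print(DT1)
```

If you would rather get a new table than modify DT1 in place, DT2[DT1, on = .(state, Month, Day, Year)] should give the same left join, with every row of DT1 kept.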