I'm trying to figure out how to join two data.tables and update the first one, but with a filter applied.
library(data.table)
DT <- data.table(a = rep(1:3, 3), b = 1:9)
DT
a b
1: 1 1
2: 2 2
3: 3 3
4: 1 4
5: 2 5
6: 3 6
7: 1 7
8: 2 8
9: 3 9
DT2 <- data.table(b = 1:9, c = rep(10, 9))
> DT2
b c
1: 1 10
2: 2 10
3: 3 10
4: 4 10
5: 5 10
6: 6 10
7: 7 10
8: 8 10
9: 9 10
I can do a basic equi-join like so:
DT[DT2, on=c(b="b")]
But what I'd like to do, logically, is this:
DT[a==3,DT2, on=c(b="b")]
but I get the following error
Error in `[.data.table`(DT, a == 3, DT2, on = c(b = "b")) :
logical error. i is not a data.table, but 'on' argument is provided.
I can reverse the order of the join and apply the filter...
DT2[DT[a==3,], on=c(b="b")]
b c a
1: 3 10 3
2: 6 10 3
3: 9 10 3
This gives the correct rows, but the column order is wrong. That aside, I'd like to update DT with c, but only for the rows I've filtered in DT that also satisfy the join.
If this was SQL I would use an update with a subquery like so:
UPDATE DT
SET c = (SELECT c FROM DT2 WHERE DT2.b = DT.b)
WHERE DT.a = 3
I seem to be going in circles with the data.table syntax - can anyone point me in the right direction?
Cheers
David
Another option without having to make a dummy variable is:
DT[a==3, c := DT2[DT[a==3], c, on = c(b="b")]]
DT
# a b c
#1: 1 1 NA
#2: 2 2 NA
#3: 3 3 10
#4: 1 4 NA
#5: 2 5 NA
#6: 3 6 10
#7: 1 7 NA
#8: 2 8 NA
#9: 3 9 10
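A variant of the same idea that avoids repeating the filter, sketched here under the assumption that DT and DT2 are built as in the question: inside j, .SD is already the a == 3 subset of DT, so it can be used as the i argument of the inner join, and x.c picks column c from DT2.

```r
library(data.table)

# Same data as in the question
DT  <- data.table(a = rep(1:3, 3), b = 1:9)
DT2 <- data.table(b = 1:9, c = rep(10, 9))

# .SD is the subset of DT where a == 3; joining DT2 to it on b returns
# one value of c per filtered row. x.c refers to column c of DT2.
DT[a == 3, c := DT2[.SD, on = "b", x.c]]

print(DT)
```

Rows of DT that fail the filter are left as NA, as are any filtered rows that have no match in DT2.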
You can create a dummy variable a in DT2, join on both columns a and b, and then update:
DT[DT2[, c(a = 3, .SD)], c := i.c, on = c("a", "b")]
DT
# a b c
#1: 1 1 NA
#2: 2 2 NA
#3: 3 3 10
#4: 1 4 NA
#5: 2 5 NA
#6: 3 6 10
#7: 1 7 NA
#8: 2 8 NA
#9: 3 9 10
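To see what the inner expression contributes, here is a small sketch that builds the intermediate table on its own. The name lookup is only for illustration and is not part of the answer above.

```r
library(data.table)

# Same DT2 as in the question
DT2 <- data.table(b = 1:9, c = rep(10, 9))

# c(a = 3, .SD) prepends a constant column a to the columns of DT2;
# the single value 3 is recycled to every row of the result.
lookup <- DT2[, c(a = 3, .SD)]

print(lookup)
print(names(DT2))  # DT2 itself is not modified
```

Because the constant column exists only in the temporary table, joining on both a and b restricts the update to rows of DT where a is 3 without altering DT2.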