I've got two data.tables, both of which share one variable; I'm trying to add a variable that's missing from the second, but which is tied one-for-one to the shared variable.
This is clearly a merge, but because the shared variable has multiple instances, I'm having to use what feels like a workaround to merge the new variable.
Let's get specific.
library(data.table)
x <- data.table(let = rep(letters[1:3], 2:4),
                num = rep(1:3, 2:4), other = rnorm(9))
y <- data.table(let = rep(c("a", "c"), c(10, 6)))
x:
let num other
1: a 1 -0.41695882
2: a 1 -0.59875888
3: b 2 -0.19433915
4: b 2 0.58406046
5: b 2 -0.33922321
6: c 3 -0.63076561
7: c 3 1.06987710
8: c 3 0.08869372
9: c 3 -1.31196123
y:
let
1: a
2: a
3: a
4: a
5: a
6: a
7: a
8: a
9: a
10: a
11: c
12: c
13: c
14: c
15: c
16: c
I just want to add the num column to y; since num is matched 1-1 with let, it doesn't really matter that there are duplicates.
Here's an approach that works; I've just got a feeling there's something simpler.
setkey(x, let)
setkey(y, let)
y <- x[!duplicated(let), c("let", "num"), with = FALSE][y]
The only improvements I can think of are:
You could skip the setkey(x, let) part.
You could also update y by reference (rather than creating a copy using <- and then assigning back to y).
If you are using the current stable version of data.table (v <= 1.9.4), you will have to use allow.cartesian = TRUE:
setkey(y,let)
y[x[!duplicated(let)], num := i.num, allow.cartesian = TRUE][]
You could alternatively use unique instead of duplicated (they both have data.table methods):
y[unique(x, by = "let"), num := i.num, allow.cartesian = TRUE]
Here's another possibility using the new by = .EACHI feature, although there's no need for by = .EACHI here; I'm showing it only to expose you to the feature. Have a look at this post for a detailed explanation of what it does and when it's useful.
y[x, num := unique(i.num), by = .EACHI, allow.cartesian = TRUE]
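To make the by = .EACHI behaviour a bit more concrete, here is a minimal sketch (not part of the original answer). It assumes a data.table version that supports the on argument (1.9.6 or later, as far as I know), and the name counts is just an illustrative choice. With by = .EACHI, j is evaluated once for each row of i, so .N gives the number of rows of y matched by that row.

```r
library(data.table)

x <- data.table(let = rep(letters[1:3], 2:4), num = rep(1:3, 2:4))
y <- data.table(let = rep(c("a", "c"), c(10, 6)))

# i has one row per distinct value of let.
# by = .EACHI runs j separately for each of those rows, so .N is the
# number of rows in y that the given row of i matches.
# nomatch = 0L drops rows of i with no match in y (here: "b").
counts <- y[unique(x, by = "let"), .N, by = .EACHI, on = "let", nomatch = 0L]
print(counts)
```

In the join above, unique(i.num) plays the same role as .N does here: it is computed per row of i, which is why it is redundant once i has already been de-duplicated.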
Edit: (Thanks to @Arun for pointing this out)
We shouldn't need the allow.cartesian argument here, as there are no duplicates in i. In fact, it's a bug (#742) that has been fixed in the current development version (1.9.5). So you just need to do:
y[x[!duplicated(let)], num := i.num, on = "let"]
# or
y[unique(x, by = "let"), num := i.num, on = "let"]
# or (though not recommended in this specific case)
y[x, num := unique(i.num), by = .EACHI, on = "let"]
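For anyone who wants to run the whole thing end to end, here is a self-contained sketch of the update join above. It assumes data.table 1.9.6 or later (where, to my knowledge, on became available); the name lookup and the set.seed call are my own additions for illustration.

```r
library(data.table)

set.seed(1)  # only makes `other` reproducible; the join does not depend on it
x <- data.table(let = rep(letters[1:3], 2:4),
                num = rep(1:3, 2:4), other = rnorm(9))
y <- data.table(let = rep(c("a", "c"), c(10, 6)))

# One row per value of let, keeping only the columns needed for the lookup
lookup <- unique(x, by = "let")[, .(let, num)]

# Update join: adds num to y by reference, with no keys set and no copy of y.
# i.num refers to the num column of the table in i (lookup).
y[lookup, num := i.num, on = "let"]

# Every "a" should now carry 1 and every "c" should carry 3
stopifnot(all(y[let == "a", num] == 1L), all(y[let == "c", num] == 3L))
```

Since := modifies y in place, the row order of y is left untouched, which is one reason to prefer this form over reassigning y from a join.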
Well, I would use a merge like the following, but I am not sure that it is simpler than what you have already done.
merge(y, unique(x[, c('let','num')]), all.x=TRUE, by='let')
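A runnable sketch of the same merge idea, with a quick sanity check. The names lookup and merged are hypothetical, and I select the columns with .(let, num) because, as far as I recall, that form behaves the same across data.table versions.

```r
library(data.table)

x <- data.table(let = rep(letters[1:3], 2:4),
                num = rep(1:3, 2:4), other = rnorm(9))
y <- data.table(let = rep(c("a", "c"), c(10, 6)))

# De-duplicate the (let, num) pairs first so the merge cannot multiply rows of y
lookup <- unique(x[, .(let, num)])

# all.x = TRUE keeps every row of y, even if a let value were missing from x
merged <- merge(y, lookup, by = "let", all.x = TRUE)

# The merge should neither add nor drop rows
stopifnot(nrow(merged) == nrow(y))
```

Unlike the := approach, this returns a new table rather than modifying y in place.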
I agree with @David; it's difficult to get much simpler. But the version below trims a few keystrokes :-)
setkey(x, let)
y <- x[!duplicated(let), .(let, num)][y]