library(data.table)
DT1 <- data.table(num = 1:6, group = c("A", "B", "B", "B", "A", "C"))
DT2 <- data.table(group = c("A", "B", "C"))
I want to add a column popular to DT2 with value TRUE whenever DT2$group is contained in DT1$group at least twice. So, in the example above, DT2 should be:
group popular
1: A TRUE
2: B TRUE
3: C FALSE
What would be an efficient way to get to this?
Updated example: DT2 may actually contain more groups than DT1, so here's an updated example:
DT1 <- data.table(num = 1:6, group = c("A", "B", "B", "B", "A", "C"))
DT2 <- data.table(group = c("A", "B", "C", "D"))
And the desired output would be
group popular
1: A TRUE
2: B TRUE
3: C FALSE
4: D FALSE
I'd just do it this way:
## 1.9.4+
setkey(DT1, group)
DT1[J(DT2$group), list(popular = .N >= 2L), by = .EACHI]
# group popular
# 1: A TRUE
# 2: B TRUE
# 3: C FALSE
# 4: D FALSE ## on the updated example
data.table's join syntax is quite powerful: while joining, you can also aggregate, select, or update columns in j. Here we perform a join. For each row in DT2$group, we evaluate the j-expression .N >= 2L on the corresponding matching rows in DT1. Specifying by = .EACHI (please check the 1.9.4 NEWS) makes the j-expression run once for each row of i.
In 1.9.4, .() was introduced as an alias for list() in all of i, j and by. So you could also do:
DT1[.(DT2$group), .(popular = .N >= 2L), by = .EACHI]
When you're joining by a single character column, you can drop the .() / J() syntax altogether (for convenience). So this can also be written as:
DT1[DT2$group, .(popular = .N >= 2L), by = .EACHI]
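The question asks for the column to be added to DT2 itself, whereas the calls above return a new table. A minimal sketch of one way to do that, assuming a data.table version that supports the on= argument (1.9.6 or later, as far as I know), so no key has to be set. The intermediate name res is only illustrative:

```r
library(data.table)

DT1 <- data.table(num = 1:6, group = c("A", "B", "B", "B", "A", "C"))
DT2 <- data.table(group = c("A", "B", "C", "D"))

# Join DT2 onto DT1 by group. With by = .EACHI, .N is the number of
# matching rows in DT1 for each row of DT2 (0 when nothing matches, as for "D").
res <- DT1[DT2, on = "group", .(popular = .N >= 2L), by = .EACHI]

# res has one row per row of DT2, in DT2's order,
# so the flag can be assigned to DT2 by reference.
DT2[, popular := res$popular]
print(DT2)
```

Because the result of the join follows the row order of DT2, the assignment lines up without any further sorting or merging.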
This is how I would do it: first count the number of times each group appears in DT1, then simply join DT2 and DT1.
require(data.table)
DT1 <- data.table(num = 1:6, group = c("A", "B", "B", "B", "A", "C"))
DT2 <- data.table(group = c("A", "B", "C"))
# Solution: store the number of rows per group in DT1
DT1[, num_counts := .N, by = group]
setkey(DT1, group)
setkey(DT2, group)
# Join, keeping only the last matching DT1 row for each group in DT2
DT2 <- DT1[DT2, mult = "last"][, list(group, popular = num_counts >= 2L)]
#> DT2
# group popular
#1: A TRUE
#2: B TRUE
#3: C FALSE
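Note that on the updated example (where DT2 also contains "D"), the join in this answer leaves num_counts as NA for "D", so popular comes out as NA rather than FALSE. A possible variant of the same count-first idea that avoids this, sketched with %in% instead of a join. The name popular_groups is just an illustrative helper:

```r
library(data.table)

DT1 <- data.table(num = 1:6, group = c("A", "B", "B", "B", "A", "C"))
DT2 <- data.table(group = c("A", "B", "C", "D"))

# Count rows per group in DT1, then keep the groups seen at least twice.
popular_groups <- DT1[, .N, by = group][N >= 2L, group]

# %in% returns FALSE (not NA) for groups that never occur in DT1.
DT2[, popular := group %in% popular_groups]
print(DT2)
```

This does not need keys on either table and adds the column to DT2 by reference.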