R data.table cross-join by three variables

I'm trying to cross join a data.table by three variables (group, id, and date). The R code below does exactly what I want: each id within each group is expanded to include all of the dates_wanted. But is there a way to do the same thing more efficiently using the excellent data.table package?

library(data.table)

data <- data.table(
    group = c(rep("A", 10), rep("B", 10)),
    id    = c(rep("frank", 5), rep("tony", 5), rep("arthur", 5),  rep("edward", 5)),
    date  = seq(as.IDate("2020-01-01"), as.IDate("2020-01-20"), by = "day")
)

data

dates_wanted <- seq(as.IDate("2020-01-01"), as.IDate("2020-01-31"), by = "day")

names_A <- data[group == "A"][["id"]]

names_B <- data[group == "B"][["id"]]

names_A <- CJ(group = "A", id = names_A, date = dates_wanted, unique = TRUE)

names_B <- CJ(group = "B", id = names_B, date = dates_wanted, unique = TRUE)

alldates <- rbind(names_A, names_B)

alldates

data[alldates, on = .(group, id, date)]
user1491868 asked Apr 27 '26 16:04


1 Answer

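You can also do this in one step: group by group and id, and return dates_wanted in j, so the full date vector is repeated for every group/id pair. Note that the result contains only the grouping columns and date; any other columns in data are not carried over.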

data[, .(date=dates_wanted), .(group,id)]

Output:

     group     id       date
  1:     A  frank 2020-01-01
  2:     A  frank 2020-01-02
  3:     A  frank 2020-01-03
  4:     A  frank 2020-01-04
  5:     A  frank 2020-01-05
 ---                        
120:     B edward 2020-01-27
121:     B edward 2020-01-28
122:     B edward 2020-01-29
123:     B edward 2020-01-30
124:     B edward 2020-01-31
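
If data has more columns than group, id, and date, the grouped expansion can be used as the lookup table and joined back, the same way the question joins alldates. The following is only a minimal sketch under that assumption: the value column is hypothetical and added purely for illustration, and grid and expanded are names chosen here, not part of the original code.

```r
library(data.table)

# Sample data and wanted dates, as in the question
data <- data.table(
    group = c(rep("A", 10), rep("B", 10)),
    id    = c(rep("frank", 5), rep("tony", 5), rep("arthur", 5), rep("edward", 5)),
    date  = seq(as.IDate("2020-01-01"), as.IDate("2020-01-20"), by = "day")
)
dates_wanted <- seq(as.IDate("2020-01-01"), as.IDate("2020-01-31"), by = "day")

# Hypothetical extra column, only to show what the join preserves
data[, value := seq_len(.N)]

# One row per (group, id, wanted date): 4 ids x 31 dates
grid <- data[, .(date = dates_wanted), by = .(group, id)]

# Join back so that columns other than the keys are kept;
# dates with no matching row in data get NA in value
expanded <- data[grid, on = .(group, id, date)]
```

With the sample data, expanded should have one row for each id and wanted date, with value filled only on the dates that were present in data.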
langtang answered Apr 29 '26 06:04