
How can I key by a column of lists in data.table?

Tags: r, data.table

Following this post, I have another question about columns of lists in data.table.

DT = data.table(x=list(c(1,2),c(1,2),c(3,4,5)))

It seems you can't key on a column of lists.

DT[,y:=.I,by=x]
Error in `[.data.table`(DT, , `:=`(y, .I), by = x) :
  The items in the 'by' or 'keyby' list are length (2,2,3). Each must be same length as rows in x or number of rows returned by i (3).

I thought it might work when all the list elements have the same length, but it fails in the same way:

DT = data.table(x=list(c(1,2),c(1,2),c(3,5)))
DT[,y:=.I,by=x]
Error in `[.data.table`(DT, , `:=`(y, .I), by = x) :
  The items in the 'by' or 'keyby' list are length (2,2,2). Each must be same length as rows in x or number of rows returned by i (3).

Is there a workaround? If not, would this be worth a feature request?

statquant asked Oct 22 '22 17:10


1 Answer

I'd do something like this as a workaround:

DT[, y := which(DT$x %in% x)[1L], by = 1:nrow(DT)]

This gives each row the index of the first row whose list matches its own, and that index serves as a group id.
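For reference, here is a minimal, self-contained sketch of that idea on the data from the question. It assumes that `match()` and `%in%` compare list elements through their character form (for example "c(1, 2)"), so a double vector `c(1, 2)` and an integer vector `1:2` would not be treated as equal. The column name `y2` is only there for illustration, and the vectorized `match(x, x)` variant is an alternative I am suggesting, not part of the workaround itself.

```r
library(data.table)

DT <- data.table(x = list(c(1, 2), c(1, 2), c(3, 4, 5)))

# Vectorized variant: each row gets the index of the first row
# whose list element has the same character form.
DT[, y := match(x, x)]

# Row-by-row variant: within each single-row group, x is a list of length 1;
# [1L] keeps only the first matching row index.
DT[, y2 := which(DT$x %in% x)[1L], by = 1:nrow(DT)]

print(DT)
```

Both columns should hold the same group ids, so either can be passed to `by` afterwards.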

With that group id in place, you can then group by y, for example to number the rows within each group:

DT[, psnInGrp := seq_along(x), by=y]

#        x y psnInGrp
# 1:   1,2 1        1
# 2:   1,2 1        2
# 3: 3,4,5 3        1
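Another possible workaround, sketched here under assumptions of my own rather than taken from the answer above, is to collapse each list element into a single string and group or key on that atomic column. The column names `xkey` and `grp` are hypothetical. Note that pasting loses type information (the number 1 and the string "1" produce the same key) and can collide if the elements are character vectors that contain the separator.

```r
library(data.table)

DT <- data.table(x = list(c(1, 2), c(1, 2), c(3, 4, 5)))

# Hypothetical helper column: one string per row, e.g. "1,2".
DT[, xkey := vapply(x, paste, character(1), collapse = ",")]

# Group by the atomic column: .GRP is the group counter,
# seq_len(.N) is the position of each row within its group.
DT[, `:=`(grp = .GRP, psnInGrp = seq_len(.N)), by = xkey]

# The string column can also be set as a key, unlike the list column.
setkey(DT, xkey)

print(DT)
```

Unlike the first-matching-index approach, the group ids here are consecutive (1, 2, ...) in order of first appearance.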
Arun answered Oct 24 '22 11:10