After creating a key on a data.table:
library(data.table)

set.seed(12345)
DT <- data.table(x = sample(LETTERS[1:3], 10, replace = TRUE),
                 y = sample(LETTERS[1:3], 10, replace = TRUE))
setkey(DT, x, y)
DT
#       x y
#  [1,] A B
#  [2,] A B
#  [3,] B B
#  [4,] B B
#  [5,] C A
#  [6,] C A
#  [7,] C A
#  [8,] C A
#  [9,] C C
# [10,] C C
I would like to get an integer vector giving, for each row, the corresponding "key index". I hope the expected output (column i) below will help clarify what I mean:
#       x y i
#  [1,] A B 1
#  [2,] A B 1
#  [3,] B B 2
#  [4,] B B 2
#  [5,] C A 3
#  [6,] C A 3
#  [7,] C A 3
#  [8,] C A 3
#  [9,] C C 4
# [10,] C C 4
I thought about using something like cumsum(!duplicated(DT[, key(DT), with = FALSE])), but I am hoping there is a better solution. I feel this vector could be part of the table's internal representation, and maybe there is a way to access it? Even if that is not the case, what would you suggest?
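For reference, the cumsum/duplicated idea can be sketched as follows. This is only an illustration: the data are typed in by hand to match the table printed above (an assumption made so the result does not depend on the random number generator), and the approach relies on the table being sorted by its key so that equal key combinations sit in adjacent rows.

```r
library(data.table)

# Hand-typed copy of the table shown in the question (illustrative only).
DT <- data.table(
  x = c("A", "A", "B", "B", "C", "C", "C", "C", "C", "C"),
  y = c("B", "B", "B", "B", "A", "A", "A", "A", "C", "C")
)
setkey(DT, x, y)

# Keep only the key columns, flag the first row of each distinct key
# combination, and count those first rows cumulatively.
key_cols <- DT[, key(DT), with = FALSE]
idx <- cumsum(!duplicated(key_cols))
idx
```

Because setkey() sorts the rows, the counter increases exactly when a new key combination starts, so idx numbers the groups in key order.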
Update: From v1.8.3, you can simply use the built-in special symbol .GRP:
DT[ , i := .GRP, by = key(DT)]
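A minimal end-to-end sketch of the .GRP approach, assuming a hand-typed copy of the question's table (used here so the example is reproducible regardless of how sample() behaves in a given R version):

```r
library(data.table)

# Hand-typed copy of the table shown in the question (illustrative only).
DT <- data.table(
  x = c("A", "A", "B", "B", "C", "C", "C", "C", "C", "C"),
  y = c("B", "B", "B", "B", "A", "A", "A", "A", "C", "C")
)
setkey(DT, x, y)

# .GRP is a running group counter available inside `by`; grouping by the
# key columns numbers each distinct key combination, and := adds the
# result to DT by reference as column i.
DT[, i := .GRP, by = key(DT)]
DT$i
```

Since the table is keyed, the groups are encountered in key order, so the values of i follow the sorted key combinations.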
See history for older answers.