Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Order factor levels in order of appearance in data set

I have a survey in which a unique ID must be assigned to questions. Some questions appear multiple times. This means that there is an extra layer of questions. In the sample data below only the first layer is included.

Question: how do I assign a unique index by order of appearance? The solution provided here works alphabetically. I can order the factors, but this defeats the purpose of doing it in R [there are many questions to sort].

library(data.table)
dt = data.table(question = c("C", "C", "A", "B", "B", "D"), 
                value = c(10,20,30,40,20,30))

dt[, idx := as.numeric(as.factor(question))]

gives:

  question value idx
# 1:        C    10   3
# 2:        C    20   3
# 3:        A    30   1
# 4:        B    40   2
# 5:        B    20   2
# 6:        D    30   4

# but required is:
dt[, idx.required := c(1, 1, 2, 3, 3, 4)]
like image 941
Henk Avatar asked Jun 11 '14 07:06

Henk


1 Answers

I think the data.table way to do this will be

dt[, idx := .GRP, by = question]

##    question value idx
## 1:        C    10   1
## 2:        C    20   1
## 3:        A    30   2
## 4:        B    40   3
## 5:        B    20   3
## 6:        D    30   4
like image 89
David Arenburg Avatar answered Sep 19 '22 19:09

David Arenburg