
Create a new column in a data frame: index within group (not unique between groups)

I have a data frame with two columns: the first column contains the group to which each individual belongs, and the second the individual's ID. See below:

df <- data.frame( group=c('G1','G1','G1','G1','G2','G2','G2','G2'), 
      indiv=c('indiv1','indiv1','indiv2','indiv2','indiv3',
              'indiv3','indiv4','indiv4'))

   group   indiv
1     G1  indiv1
2     G1  indiv1
3     G1  indiv2
4     G1  indiv2
5     G2  indiv3
6     G2  indiv3
7     G2  indiv4
8     G2  indiv4

I would like to create a new column in my data frame (retaining the long format) with the index of each individual in the group, that is:

   group   indiv  Ineed
1     G1  indiv1      1
2     G1  indiv1      1
3     G1  indiv2      2
4     G1  indiv2      2
5     G2  indiv3      1
6     G2  indiv3      1
7     G2  indiv4      2
8     G2  indiv4      2

I have tried with the data.table .N or .GRP methods, without success (nice work on data.table by the way!).

Any help much appreciated!

xvrtzn asked Dec 25 '22 00:12


1 Answer

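You could use the new rleid function here (available in the development version, v >= 1.9.5). It numbers consecutive runs of identical values, so applying it per group gives the index you want as long as each individual's rows are adjacent, as they are in your data: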

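library(data.table)
# Within each group, number the consecutive runs of identical indiv values
setDT(df)[, Ineed := rleid(indiv), by = group][]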
#    group  indiv Ineed
# 1:    G1 indiv1     1
# 2:    G1 indiv1     1
# 3:    G1 indiv2     2
# 4:    G1 indiv2     2
# 5:    G2 indiv3     1
# 6:    G2 indiv3     1
# 7:    G2 indiv4     2
# 8:    G2 indiv4     2
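
One thing to watch with rleid: it counts runs, not distinct values, so an ID that reappears later in the same group would get a new number. The sketch below illustrates this on made-up data (the dt table and its values are hypothetical, not from the question) and shows one possible workaround, sorting within the group first:

```r
library(data.table)

# rleid() numbers consecutive runs, so a value that reappears starts a new run
rleid(c("a", "a", "b", "a"))
# [1] 1 1 2 3

# Hypothetical unsorted input: indiv1 shows up again after indiv2
dt <- data.table(group = "G1", indiv = c("indiv1", "indiv2", "indiv1"))

# Ordering by indiv within the group first makes equal IDs contiguous
setorder(dt, group, indiv)
dt[, Ineed := rleid(indiv), by = group]
dt$Ineed
# [1] 1 1 2
```

Note that setorder reorders the rows by reference, so this only fits if the original row order does not need to be preserved.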

Or, if you are using the CRAN stable version (v <= 1.9.4), you could convert indiv to a factor within each group (so that each distinct individual gets its own level) and then convert the level codes to numeric:

# Within each group, map each distinct indiv to its factor level code
setDT(df)[, Ineed := as.numeric(factor(indiv)), by = group][]
#    group  indiv Ineed
# 1:    G1 indiv1     1
# 2:    G1 indiv1     1
# 3:    G1 indiv2     2
# 4:    G1 indiv2     2
# 5:    G2 indiv3     1
# 6:    G2 indiv3     1
# 7:    G2 indiv4     2
# 8:    G2 indiv4     2
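
Keep in mind that factor() sorts its levels, so the index follows the sorted order of the IDs and not the order in which they appear. That makes no difference for the data above, but it can for other inputs. A minimal sketch on hypothetical data (the names "zoe" and "adam" and the columns by_factor and by_first are made up for illustration), with match() against unique() as one possible alternative that numbers by first appearance:

```r
library(data.table)

# Hypothetical data where first appearance and alphabetical order disagree
dt <- data.table(group = "G1", indiv = c("zoe", "zoe", "adam", "adam"))

# factor() sorts its levels, so "adam" gets index 1 even though "zoe" comes first
dt[, by_factor := as.integer(factor(indiv)), by = group]

# match() against unique() numbers IDs in order of first appearance instead
dt[, by_first := match(indiv, unique(indiv)), by = group]

dt$by_factor
# [1] 2 2 1 1
dt$by_first
# [1] 1 1 2 2
```

Neither variant depends on rows being contiguous, so both should also cope with an ID that reappears later in the same group.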

David Arenburg answered Jan 13 '23 12:01