
generate sequence of numbers in R according to other variables

I am having trouble generating a sequence of numbers based on two other variables. Specifically, I have the following data frame (my real data is not this balanced!):

ID1=rep((1:1),20)
ID2=rep((2:2),20)
ID3=rep((3:3),20)
ID<-c(ID1,ID2,ID3)
DATE1=rep("2013-1-1",10)
DATE2=rep("2013-1-2",10)
DATE=c(DATE1,DATE2)
IN<-data.frame(ID,DATE=rep(DATE,3))

and I would like to generate a running counter over the observations within each combination of ID and DATE, like this:

OUTPUT<-data.frame(ID,DATE=rep(DATE,3),N=rep(rep(seq(1:10),2),3))

Curiously, I tried the following solution, which works for the data frame above but not for my real data!

IN$UNIQUE<-with(IN,as.numeric(interaction(IN$ID,IN$DATE,drop=TRUE,lex.order=TRUE)))#generate unique value for the combination of id and date
PROG<-tapply(IN$DATE,IN$UNIQUE,seq)#generate the sequence
OUTPUT$SEQ<-c(sapply(PROG,"["))#concatenate the sequence in just one vector

Right now I cannot understand why the solution doesn't work for the real data. As always, any tips are greatly appreciated!

Here is an example of the real data set (only one ID included):

  id       date
  1  F2_G 2005-03-09
  2  F2_G 2005-06-18
  3  F2_G 2005-06-18
  4  F2_G 2005-06-18
  5  F2_G 2005-06-19
  6  F2_G 2005-06-19
  7  F2_G 2005-06-19
  8  F2_G 2005-06-19
  9  F2_G 2005-06-20
stefano asked Dec 21 '25 10:12


2 Answers

Here's one using ave:

OUT <- within(IN, {N <- ave(ID, list(ID, DATE), FUN=seq_along)})
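
One caveat worth sketching: `ave` returns a vector of the same type as its first argument, so with a non-numeric ID column such as the `F2_G` sample in the question, the counter may not come back as plain numbers. The variant below is only a sketch under that assumption; the data frame `real` and its values are a hypothetical reconstruction of the sample rows in the question, not the asker's actual data. It passes `seq_along(real$id)` as the first argument so the result is an integer vector whatever the type of the ID column.

```r
# Hypothetical toy data mirroring the sample rows shown in the question
real <- data.frame(
  id   = rep("F2_G", 9),
  date = c("2005-03-09", rep("2005-06-18", 3), rep("2005-06-19", 4), "2005-06-20"),
  stringsAsFactors = FALSE
)

# The first argument only provides the template (length and type) for the result;
# using an integer index keeps the counter integer even though id is character.
real$N <- ave(seq_along(real$id), real$id, real$date, FUN = seq_along)

real$N
# 1 1 2 3 1 2 3 4 1
```

Because `ave` writes each group's result back into the original row positions, this form does not require the rows to be sorted or the groups to be the same size.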
Arun answered Dec 24 '25 01:12


This should do what you want...

require(reshape2)
as.vector( apply( dcast( IN , ID ~ DATE , length )[,-1] , 1:2 , function(x)seq.int(x) ) )
 [1]  1  2  3  4  5  6  7  8  9 10  1  2  3  4  5  6  7  8  9 10  1  2  3  4  5  6
 [27]  7  8  9 10  1  2  3  4  5  6  7  8  9 10  1  2  3  4  5  6  7  8  9 10  1  2
 [53]  3  4  5  6  7  8  9 10

Basically, we use dcast to get the number of observations by ID and date, like so:

dcast( IN , ID ~ DATE , length )
  ID 2013-1-1 2013-1-2
1  1       10       10
2  2       10       10
3  3       10       10

Then we use apply across each cell to make a sequence of integers as long as the count of ID for each date. Finally we coerce back to a vector using as.vector.
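
The same count-then-sequence idea can also be sketched in base R without reshape2. This is a swapped-in technique rather than the code above: it uses `rle` to count consecutive identical ID/DATE pairs and `sequence` to expand those counts into 1..n runs, and it assumes the rows are already sorted so that each ID/DATE group is contiguous. The data frame is rebuilt here so the snippet runs on its own.

```r
# Rebuild the balanced example from the question
ID   <- rep(1:3, each = 20)
DATE <- rep(rep(c("2013-1-1", "2013-1-2"), each = 10), 3)
IN   <- data.frame(ID, DATE)

# Lengths of consecutive runs of identical ID/DATE pairs
# (assumption: rows are sorted so each group forms one contiguous run)
counts <- rle(paste(IN$ID, IN$DATE))$lengths

# sequence() turns c(10, 10, ...) into 1:10, 1:10, ... concatenated in row order
N <- sequence(counts)
```

Since the counts are taken in row order, this sketch should also cope with groups of different sizes, as long as the sorting assumption holds.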

Simon O'Hanlon answered Dec 23 '25 23:12


