Count repetitions of a set of characters

Tags: r

How can I count repetitions of a set of characters in a vector? Imagine the following vector consisting of "A" and "B":

x <- c("A", "A", "A", "B", "B", "A", "A", "B", "A")

In this example, the first set is the run of "A"s followed by the run of "B"s at indices 1 to 5, the second set is the same pattern at indices 6 to 8, and the third set is the final single "A":

x <- c("A", "A", "A", "B", "B", # set 1
       "A", "A", "B",           # set 2
       "A")                     # set 3

How can I set a counter for each of these sets? I need a vector like this:

c(1, 1, 1, 1, 1, 2, 2, 2, 3)  

thanks

Christian asked Jan 16 '17 11:01


1 Answer

Use rle:

x <- c("A", "A", "A", "B", "B", "A", "A", "B", "A")  
tmp <- rle(x)
#Run Length Encoding
#  lengths: int [1:5] 3 2 2 1 1
#  values : chr [1:5] "A" "B" "A" "B" "A"

Now replace each run's value with a running count of how many runs with that same value have occurred so far:

tmp$values <- ave(rep(1L, length(tmp$values)), tmp$values, FUN = cumsum) 

and invert the run-length encoding:

y <- inverse.rle(tmp)
#[1] 1 1 1 1 1 2 2 2 3
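
The steps above can be collected into a small helper, sketched below. The function name count_sets is hypothetical and not part of the original answer; the body uses only the base R calls already shown (rle, ave, cumsum, inverse.rle).

```r
# Hypothetical helper (name is an assumption): numbers each element of x
# by the "set" it belongs to, following the rle/ave steps from the answer.
count_sets <- function(x) {
  runs <- rle(x)
  # For each run, count how many runs with the same value have occurred
  # so far (first "A" run -> 1, first "B" run -> 1, second "A" run -> 2, ...).
  runs$values <- ave(rep(1L, length(runs$values)), runs$values, FUN = cumsum)
  # Expand each per-run counter back to the length of its run.
  inverse.rle(runs)
}

x <- c("A", "A", "A", "B", "B", "A", "A", "B", "A")
count_sets(x)
#[1] 1 1 1 1 1 2 2 2 3
```

For this x, the per-run counters produced by ave are 1 1 2 2 3, one for each of the five runs; inverse.rle then repeats them 3, 2, 2, 1 and 1 times to give the result.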
Roland answered Sep 28 '22 12:09