Counting consecutive patterns in strings using R

Tags: r, stringr

I'm attempting to write a function to count the number of consecutive instances of a pattern. As an example, I'd like the string

string <- "A>A>A>B>C>C>C>A>A"

to be transformed into

"3 A > 1 B > 3 C > 2 A"

I have a function (see below) that counts how many times each element occurs. However, it tabulates totals over the whole string, so it loses the order of the consecutive runs that I'd like to keep. Any ideas or pointers?

Thanks,

R

Existing function:

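fnc_gen_PathName <- function(string) {
  # Split each input on ";" (note: the example string above uses ">" instead)
  p <- strsplit(as.character(string), ";")
  # Tabulate total occurrences of each element; this discards run order
  p1 <- lapply(p, table)
  p2 <- lapply(p1, function(x) {
    sapply(1:length(x), function(i) {
      # A count of exactly 25 is labelled with "+", any other count with "x"
      if (x[i] == 25) {
        paste0(x[i], "+ ", names(x)[i])
      } else {
        paste0(x[i], "x ", names(x)[i])
      }
    })
  })
  # Join the labels for each input and stack the results into a one-column matrix
  p3 <- lapply(p2, function(x) paste(x, collapse = "; "))
  p3 <- do.call(rbind, p3)
  return(p3)
}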
Robin Sheridan asked Dec 01 '15 15:12


1 Answer

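As commented by @MrFlick, you could try the following, which combines rle and strsplit. strsplit breaks the string on ">", rle computes the length of each run of consecutive equal values, and paste joins the counts and values back together: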

with(rle(strsplit(string, ">")[[1]]), paste(lengths, values, collapse = " > "))
## [1] "3 A > 1 B > 3 C > 2 A"
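
If you need this for a whole vector of strings rather than a single one, the one-liner can be wrapped in a small helper. This is only a minimal sketch: the name count_runs and the sep argument are made up for illustration and are not part of the original answer, and it assumes the separator is a literal string rather than a regular expression.

```r
# Hypothetical helper; the name and the `sep` argument are assumptions.
count_runs <- function(x, sep = ">") {
  # Split every input string on the literal separator
  parts <- strsplit(as.character(x), sep, fixed = TRUE)
  # Run-length encode each set of pieces and paste "count value" pairs
  vapply(parts, function(p) {
    r <- rle(p)
    paste(r$lengths, r$values, collapse = paste0(" ", sep, " "))
  }, character(1))
}

count_runs(c("A>A>A>B>C>C>C>A>A", "X>Y>Y"))
## [1] "3 A > 1 B > 3 C > 2 A" "1 X > 2 Y"
```

Using vapply with character(1) keeps the result a plain character vector with one element per input string.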
David Arenburg answered Jan 03 '23 17:01