I'm trying to count the instances of 3 consecutive "a" events, i.e. "aaa".
The string will consist only of lowercase letters, e.g. "abaaaababaaa".
I tried the following piece of code, but the behavior is not precisely what I am looking for:
x <- "abaaaababaaa"
gregexpr("aaa", x)
I would like the match to return 3 instances of the "aaa" occurrence as opposed to 2.
Assume indexing starts at 1.
To catch the overlapping matches, you can use a lookahead like this:
gregexpr("a(?=aa)", x, perl=TRUE)
However, each match now covers only a single "a" (the lookahead itself consumes no characters), which may complicate further processing of the matches, especially if you are not always looking for fixed-length patterns.
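As a rough sketch of how the lookahead result could be turned into a count and a set of start positions (the variable names `m`, `starts` and `n_matches` are only illustrative, and the fixed width of 3 is an assumption that holds for "aaa" specifically):

```r
x <- "abaaaababaaa"

# gregexpr() returns a list with one element per input string
m <- gregexpr("a(?=aa)", x, perl = TRUE)[[1]]

# Drop the attributes to get the plain 1-based start positions
starts <- as.integer(m)

# gregexpr() reports -1 when nothing matched, so guard the count
n_matches <- if (starts[1] == -1L) 0L else length(starts)

# Each reported match is a single "a"; rebuild the full 3-character
# window from its start position if the whole match is needed
matched <- substring(x, starts, starts + 2)

print(n_matches)
print(starts)
print(matched)
```

Because the match length is always 1 here, the full matched text has to be recovered from the start positions rather than from `regmatches()`.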
I know I'm late, but I wanted to share this solution:
your.string <- "abaaaababaaa"
# number of 3-character windows (the last one starts at nchar - 2)
nc1 <- nchar(your.string) - 2
x <- unlist(strsplit(your.string, NULL))
x2 <- c()
# collect every window of 3 consecutive characters
for (i in 1:nc1)
  x2 <- c(x2, paste(x[i], x[i+1], x[i+2], sep=""))
cat("occurrences of <aaa> in <your.string> is,",
    length(grep("aaa", x2)), "and they are at index", grep("aaa", x2))
> occurrences of <aaa> in <your.string> is, 3 and they are at index 3 4 10
Heavily inspired by this answer from R-help by Fran.
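The same sliding-window idea can probably be written without an explicit loop, since `substring()` is vectorized over its start and end positions. The following is only a sketch: `find_overlapping` is a hypothetical helper name, and it assumes a single input string and a non-empty, literal (non-regex) pattern.

```r
# Hypothetical helper: 1-based start positions of all (possibly
# overlapping) occurrences of a literal pattern in one string.
find_overlapping <- function(s, pattern) {
  k <- nchar(pattern)
  n <- nchar(s)
  # No window of width k fits into a shorter string
  if (n < k) return(integer(0))
  # Every possible start position of a width-k window
  starts <- seq_len(n - k + 1)
  # Extract all windows at once and keep the ones equal to the pattern
  windows <- substring(s, starts, starts + k - 1)
  starts[windows == pattern]
}

hits <- find_overlapping("abaaaababaaa", "aaa")
print(length(hits))
print(hits)
```

Comparing with `==` instead of `grep()` keeps the pattern literal, so characters with a special meaning in regular expressions would not need escaping.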