More elegant way to return a sequence of numbers based on booleans?

Tags:

r

Here's a sample of booleans I have as part of a data.frame:

atest <- c(FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, FALSE)

I want to return a sequence of numbers starting at 1 from each FALSE and increasing by 1 until the next FALSE.

The resulting desired vector is:

[1]  1  2  3  4  5  6  7  8  9 10  1  2  3  4  5  6  7  8  9 10  1

Here's the code that accomplishes this, but I'm sure there's a simpler or more elegant way to do this in R. I'm always trying to learn how to code things more efficiently in R rather than simply getting the job done.

result <- c()
x <- 1
for(i in 1:length(atest)){
    if(atest[i] == FALSE){
        result[i] <- 1
        x <- 1
    } 
    if(atest[i] != FALSE){
        x <- x+1
        result[i] <- x
    }
}
tcash21 asked Jul 23 '13 20:07



2 Answers

Here's one way to do it, using handy (but not widely known or used) base functions:

> sequence(tabulate(cumsum(!atest)))
 [1]  1  2  3  4  5  6  7  8  9 10  1  2  3  4  5  6  7  8  9 10  1

To break it down:

> # running count of FALSE values: each FALSE starts a new group number
> cumsum(!atest)
 [1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3
> # count the number of occurrences of each integer
> tabulate(cumsum(!atest))
[1] 10 10  1
> # concatenate seq_len(n) for each count n
> sequence(tabulate(cumsum(!atest)))
 [1]  1  2  3  4  5  6  7  8  9 10  1  2  3  4  5  6  7  8  9 10  1
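One caveat with the breakdown above: `tabulate()` ignores non-positive values, so if the input starts with TRUE, `cumsum(!atest)` begins at 0 and the result comes out shorter than the input. The question does not say how a leading run of TRUE should be numbered, so the following is only a sketch under the assumption that such a run should also count up from 1. The helper name `count_since_false` and the vector `lead_true` are hypothetical, not part of the answer.

```r
atest <- c(FALSE, rep(TRUE, 9), FALSE, rep(TRUE, 9), FALSE)

# Hypothetical helper: number each element within the group defined by
# the running count of FALSE values. ave() keeps the input length even
# when the first group id is 0.
count_since_false <- function(x) {
  ave(seq_along(x), cumsum(!x), FUN = seq_along)
}

count_since_false(atest)
# 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 10 1

# Input that starts with TRUE: cumsum(!x) is 0 0 1 1
lead_true <- c(TRUE, TRUE, FALSE, TRUE)
length(sequence(tabulate(cumsum(!lead_true))))  # 2, although the input has 4 elements
count_since_false(lead_true)                    # 1 2 1 2
```

For input that starts with FALSE, as in the question, both forms give the same result.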
Joshua Ulrich answered Sep 19 '22 11:09



Here is another approach using other familiar functions:

seq_along(atest) - cummax(seq_along(atest) * !atest) + 1L
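The expression above can be read step by step. This is a sketch on a shorter, made-up vector `x`; the intermediate names `i`, `last_false` and `result` are only for illustration.

```r
x <- c(FALSE, TRUE, TRUE, FALSE, TRUE)

# Position of every element
i <- seq_along(x)
# 1 2 3 4 5

# Keep the position where x is FALSE, 0 elsewhere
i * !x
# 1 0 0 4 0

# Carry the position of the most recent FALSE forward
last_false <- cummax(i * !x)
# 1 1 1 4 4

# Distance from the most recent FALSE, counted from 1
result <- i - last_false + 1L
result
# 1 2 3 1 2
```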

Because this one-liner relies only on vectorized arithmetic and cummax, it is noticeably faster than @Joshua's solution in the benchmark below (if speed is of any concern):

f0 <- function(x) sequence(tabulate(cumsum(!x)))
f1 <- function(x) {i <- seq_along(x); i - cummax(i * !x) + 1L}
x  <- rep(atest, 10000)

library(microbenchmark)
microbenchmark(f0(x), f1(x))
# Unit: milliseconds
#   expr       min        lq    median        uq      max neval
#  f0(x) 19.386581 21.853194 24.511783 26.703705 57.20482   100
#  f1(x)  3.518581  3.976605  5.962534  7.763618 35.95388   100

identical(f0(x), f1(x))
# [1] TRUE
flodel answered Sep 18 '22 11:09
