
Find first sequence of length n in R

Tags:

r

Let's say I have a data.frame like this:

df <- data.frame(signal = c(0, 0, 1, 0, 1, 1, 0, 1, 1, 1))

What is the best way to find the first signal, meaning the first position where n ones have occurred in succession? For example, if n = 1 then my signal would be the third element, and I would like to get an answer like this:

c(0, 0, 1, 0, 0, 0, 0, 0, 0, 0)

For n = 2 the answer would be:

c(0, 0, 0, 0, 0, 1, 0, 0, 0, 0)

And for n = 3 the signal is the last element, which comes after 3 ones in a row:

c(0, 0, 0, 0, 0, 0, 0, 0, 0, 1)
nesvarbu asked Mar 15 '16 16:03



2 Answers

The first 1 in the rolling product of signal with window size n marks the start of the first run of n ones, and the signal sits n - 1 positions after that start, so

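f <- function(x, n){
  y <- numeric(length(x))
  # rolling product over windows of n elements (needs the RcppRoll package)
  k <- RcppRoll::roll_prod(x, n)
  # the first all-ones window starts at which(k==1)[1]; mark its last element
  y[which(k==1)[1] + n-1] <- 1
  y
}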

> f(df$signal, 1)
 [1] 0 0 1 0 0 0 0 0 0 0
> f(df$signal, 2)
 [1] 0 0 0 0 0 1 0 0 0 0
> f(df$signal, 3)
 [1] 0 0 0 0 0 0 0 0 0 1
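
If RcppRoll is not available, the same rolling-window idea can be sketched in base R. This is only an illustration, not part of the original answer: the name f_base is made up, and it assumes x contains only 0s and 1s and that 1 <= n <= length(x).

```r
# Hypothetical base-R variant of f(); assumes x holds only 0s and 1s
# and 1 <= n <= length(x).
f_base <- function(x, n) {
  y <- numeric(length(x))
  # Each row of embed(x, n) is one window of n consecutive elements,
  # so a row sum of n means the whole window is ones.
  full <- rowSums(embed(x, n)) == n
  start <- which(full)[1]
  # If no window is all ones, start is NA and y stays all zeros.
  if (!is.na(start)) y[start + n - 1] <- 1
  y
}

x <- c(0, 0, 1, 0, 1, 1, 0, 1, 1, 1)
f_base(x, 1)
f_base(x, 2)
f_base(x, 3)
```

For a 0/1 vector the row sum equals n exactly when the rolling product equals 1, which is why the sum can stand in for the product here.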

Sanity check, comparing with g (defined in the other answer on this page) and fun (not defined on this page):

set.seed(1)
signal <- sample(0:1, 10, TRUE)
signal
# [1] 0 0 1 1 0 1 1 1 1 0
f(signal, 3)
# [1] 0 0 0 0 0 0 0 1 0 0
g(signal, 3)
# [1] 1 0 0 0 0 0 0 0 0 0
fun(signal, 3)
# Error in 1:which(r$len * r$val == n)[1] : NA/NaN argument
Khashaa answered Oct 21 '22 13:10



x <- c(0, 0, 1, 0, 1, 1, 0, 1, 1, 1)

y <- rle(x)
y$values <- y$lengths * y$values
(y <- inverse.rle(y))
# [1] 0 0 1 0 2 2 0 3 3 3

f <- function(n) {
  z <- rep(0, length(y))
  z[which.max(cumsum(y == n))] <- 1
  z
}
f(1)
# [1] 0 0 1 0 0 0 0 0 0 0

f(2)
# [1] 0 0 0 0 0 1 0 0 0 0

f(3)
# [1] 0 0 0 0 0 0 0 0 0 1

The full function would be

g <- function(x, n) {
  y <- rle(x)
  y$values <- y$lengths * y$values
  y <- inverse.rle(y)
  z <- rep_len(0, length(x))
  z[which.max(cumsum(y == n))] <- 1
  z
}
g(x, 1)
g(x, 2)
g(x, 3)

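Edit: version 2, which adds a ties argument to choose among several runs of length n and returns all zeros when there is no such run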

g <- function(x, n, ties = c('first','random','last')) {
  ties <- match.arg(ties)
  FUN <- switch(ties, first = min, last = max,
                random = function(x) x[sample.int(length(x), 1)])
  y <- rle(x)
  y$values <- y$lengths * y$values
  y <- inverse.rle(y)
  z <- rep_len(0, length(x))
  if (!length(wh <- which(y == n)))
    return(z)
  wh <- wh[seq_along(wh) %% n == 0]
  z[FUN(wh)] <- 1
  z
}

x <- c(0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1)

g(x, 1, 'first')
# [1] 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0

g(x, 1, 'last')
# [1] 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0

g(x, 1, 'random')  # result can differ between calls
# [1] 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0

g(x, 4)
# [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
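
Note that version 2 looks for runs of exactly n ones (y == n), so a run of four ones does not count for n = 3, whereas the rolling-product answer marks the third one of that run. If a run of at least n ones should count, one possible rle-based sketch is the following. The name first_run_signal is hypothetical, and the code assumes x contains only 0s and 1s.

```r
# Hypothetical helper: mark the n-th element of the first run of
# at least n ones. Assumes x holds only 0s and 1s.
first_run_signal <- function(x, n) {
  z <- numeric(length(x))
  r <- rle(x)
  ends <- cumsum(r$lengths)        # last index of each run
  starts <- ends - r$lengths + 1   # first index of each run
  # first run of ones that is long enough, or NA if there is none
  hit <- which(r$values == 1 & r$lengths >= n)[1]
  if (!is.na(hit)) z[starts[hit] + n - 1] <- 1
  z
}

signal <- c(0, 0, 1, 1, 0, 1, 1, 1, 1, 0)
first_run_signal(signal, 3)
# [1] 0 0 0 0 0 0 0 1 0 0
```

On this vector the result agrees with f(signal, 3) from the sanity check in the other answer.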
rawr answered Oct 21 '22 15:10
