
Find the longest continuous chunk of TRUE in a boolean vector

Tags: r, boolean

Given a boolean vector, how can I find the longest continuous chunk of TRUE and set all the remaining TRUE values to FALSE?

For example, given a boolean vector:

bool = c(TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE, FALSE)

How can I get a vector like:

c(FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE, FALSE)
sl1129 asked May 25 '16 20:05


3 Answers

Using rle:

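# rle() stores the run lengths in `lengths`; `$length` only works through partial matching
myRle <- rle(bool)$lengths
# Flag every position that belongs to a run of maximal length (TRUE and FALSE runs alike)
rep(myRle == max(myRle), myRle)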

The OP did not say how edge cases should be handled (for example, a FALSE run that is longer than every TRUE run), so this simple approach may give unexpected results there. The more complete answer proposed by docendodiscimus should cover those cases.
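To make one of those possible issues concrete, here is a small sketch. The input `y` and the names `run_lengths` and `naive` are chosen for illustration only and are not part of the answer. Because the comparison ignores the run values, a longest run made of FALSE ends up flagged as TRUE:

```r
# Illustrative input: the longest run overall (length 5) consists of FALSE
y <- c(TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, TRUE)

run_lengths <- rle(y)$lengths                # 2 5 1 1 1
# Same logic as the two-liner above: only the lengths are compared
naive <- rep(run_lengths == max(run_lengths), run_lengths)
naive
# [1] FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE
```

The five flagged positions are exactly the FALSE run, which is probably not what the question asks for.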

5 revs answered Nov 19 '22 14:11


Here's an approach that will highlight all longest chunks of consecutive TRUEs in a boolean vector. That means, if there are, say, two TRUE chunks of the same (max) length, both will be reported as TRUE in the output.

We can use:

with(rle(bool), rep(lengths == max(lengths[values]) & values, lengths))

which means:

  • with(rle(bool), ...): compute the run-length encoding and evaluate the expression with its lengths and values components in scope
  • lengths == max(lengths[values]) & values: check whether each run length equals the maximum run length among the runs where values is TRUE, and also check that values itself is TRUE
  • rep(...., lengths): repeat each of the resulting logicals as often as its own run length
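
The three steps can also be written out with intermediate variables, which may make the one-liner easier to follow. This is only an unrolled sketch of the same expression; the names `runs`, `longest_true`, `keep_run` and `result` are introduced here for illustration:

```r
bool <- c(TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE, FALSE)

# Step 1: run-length encoding
runs <- rle(bool)
runs$lengths   # 2 1 4 1
runs$values    # TRUE FALSE TRUE FALSE

# Step 2: longest TRUE run, then flag the TRUE runs that reach that length
longest_true <- max(runs$lengths[runs$values])           # 4
keep_run <- runs$lengths == longest_true & runs$values   # FALSE FALSE TRUE FALSE

# Step 3: expand the per-run flags back to the original length
result <- rep(keep_run, runs$lengths)
result
# [1] FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE FALSE
```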

OP's test case:

bool <- c(TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE, FALSE)
with(rle(bool), rep(lengths == max(lengths[values]) & values, lengths))
# [1] FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE FALSE

Second test case: the longest TRUE run and the longest FALSE run have the same length:

x <- c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE, FALSE, TRUE, FALSE, TRUE)
with(rle(x), rep(lengths == max(lengths[values]) & values, lengths))
# [1]  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

Third test case: a FALSE run is longer than every TRUE run:

y <- c(TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, TRUE)
with(rle(y), rep(lengths == max(lengths[values]) & values, lengths))
# [1]  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
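
One case the tests above do not cover is a vector with no TRUE at all. There, `lengths[values]` is empty, and `max()` on an empty vector returns `-Inf` with a warning. The result is still all FALSE, but the warning can be avoided with a guard. The wrapper below is a hypothetical helper, not part of the answer, and it assumes the input is a logical vector without NA values:

```r
# Hypothetical helper (not from the answer): same rle() logic with a guard.
# Assumes x is a logical vector without NA values.
longest_true_runs <- function(x) {
  stopifnot(is.logical(x), !anyNA(x))
  runs <- rle(x)
  # No TRUE run at all (this also covers zero-length input): nothing to keep
  if (!any(runs$values)) {
    return(rep(FALSE, length(x)))
  }
  keep_run <- runs$values & runs$lengths == max(runs$lengths[runs$values])
  rep(keep_run, runs$lengths)
}

longest_true_runs(c(FALSE, FALSE, FALSE))
# [1] FALSE FALSE FALSE
longest_true_runs(c(TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE, FALSE))
# [1] FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE FALSE
```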
2 revs answered Nov 19 '22 14:11


With inspiration from @zx8754

This should work even when the longest overall run consists of FALSE values. If several TRUE runs tie for the longest, only the first one is kept.

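runs <- rle(bool)
lengths <- runs$lengths

# Indices of the TRUE runs whose length equals the longest TRUE run
is_max <- which(lengths == max(lengths[runs$values]) & runs$values)
# Keep only the first such run; seq_along() avoids the 1:0 pitfall of 1:length()
rep(seq_along(lengths) == is_max[1], lengths)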
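
Whether ties should all be kept or only the first one is not stated in the question, so the following is just a sketch of how the two choices differ. The input `z` and the names `first_only` and `all_ties` are made up for illustration:

```r
# Illustrative input with two TRUE runs of the same (maximal) length
z <- c(TRUE, TRUE, FALSE, TRUE, TRUE)

runs <- rle(z)
is_max <- which(runs$lengths == max(runs$lengths[runs$values]) & runs$values)
is_max
# [1] 1 3

# Keep only the first of the tied runs (the behaviour of the code above)
first_only <- rep(seq_along(runs$lengths) == is_max[1], runs$lengths)
first_only
# [1]  TRUE  TRUE FALSE FALSE FALSE

# Keep every tied run instead
all_ties <- rep(seq_along(runs$lengths) %in% is_max, runs$lengths)
all_ties
# [1]  TRUE  TRUE FALSE  TRUE  TRUE
```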
Mhairi McNeill answered Nov 19 '22 14:11