
How to find the indices where there are n consecutive zeroes in a row

Suppose I have this data:

  x = c(14, 14, 6, 7, 14, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9, 1, 3, 8, 9, 15, 9, 8, 13, 8, 4, 6, 7, 10, 13, 3,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 7, 4, 5, 16, 5, 5, 9, 4, 4, 9, 8, 2,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)

x
 [1] 14 14  6  7 14  0  0  0  0  0  0  0  0  0  0  0  0  9  1  3  8  9 15  9  8
[26] 13  8  4  6  7 10 13  3  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
[51]  0  0  0  0  0  0  0  0  4  7  4  5 16  5  5  9  4  4  9  8  2  0  0  0  0
[76]  0  0  0  0  0  0  0  0  0  0  0  0  0  0  

I want to recover the indices beginning where there are more than 3 zeroes in a row and terminating with the last 0 before a nonzero.

For example, I would get 6, 17 for the first run of zeroes, and so on for the later runs.

wolfsatthedoor asked May 13 '18 00:05



2 Answers

If x happens to be a column of a data.table you can do

library(data.table)
dt <- data.table(x = x)

dt[, if (.N > 3 && all(x == 0)) .(starts = first(.I), ends = last(.I))
   , by = rleid(x)]

#    rleid starts ends
# 1:     5      6   17
# 2:    22     34   58
# 3:    34     72   89

Explanation:

  • rleid(x) gives an ID (integer) for each element in x indicating which "run" the element is a member of, where "run" means a sequence of adjacent equal values.

  • dt[, <code>, by = rleid(x)] partitions dt according to rleid(x) and computes <code> for each subset of dt's rows. The results are stacked together in a single data.table.

  • .N is the number of elements in the given subset

  • .I is the vector of row numbers corresponding to the subset

  • first and last give the first and last element of a vector

  • .(<stuff>) is the same as list(<stuff>)

    The rleid function, grouping with by inside the brackets, the .N and .I symbols, and the first and last functions are all part of the data.table package.
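
For readers without data.table, the same run-based idea can be sketched in base R with rle(). This is only an illustration and not part of the answer above: the vector v and the names starts, ends, run_id and keep are made up for the example, and run_id is a rough base R stand-in for what rleid(x) returns.

```r
# Small hypothetical vector (not the data from the question)
v <- c(5, 0, 0, 0, 0, 2, 0, 0, 7, 0, 0, 0, 0, 0)

r <- rle(v)                      # run lengths and the value of each run
ends <- cumsum(r$lengths)        # index of the last element of each run
starts <- ends - r$lengths + 1   # index of the first element of each run

# One integer per element saying which run it belongs to
run_id <- rep(seq_along(r$lengths), r$lengths)

# Keep only runs of zeroes that are longer than 3
keep <- r$values == 0 & r$lengths > 3
result <- data.frame(starts = starts[keep], ends = ends[keep])
print(result)   # starts 2 and 10, ends 5 and 14
```

The run of two zeroes at positions 7 and 8 is dropped because its length is not greater than 3.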

IceCreamToucan answered Sep 19 '22 22:09


Using dplyr: take the diff of the zero/nonzero indicator (x == 0). Wherever the diff is not 0, the element does not belong to the same group as the previous one, so the cumsum of those change points gives the group id.

library(dplyr)
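df <- data.frame(x = x, rownumber = seq_along(x))
# New group id each time the zero/nonzero status changes
df$Groupid <- cumsum(c(0, diff(df$x == 0)) != 0)
df %>%
  group_by(Groupid) %>%
  summarize(start = first(rownumber), end = last(rownumber), number = first(x), size = n()) %>%
  filter(number == 0 & size > 3)  # "more than 3 zeroes", as the question asks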
# A tibble: 3 x 5
  Groupid start   end number  size
    <int> <int> <int>  <dbl> <int>
1       1     6    17      0    12
2       3    34    58      0    25
3       5    72    89      0    18
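
To see how the diff/cumsum step builds the group ids, here is a minimal base R sketch on a made-up vector (v, is_zero, changed and group_id are names chosen for this illustration only):

```r
# Small hypothetical vector (not the data from the question)
v <- c(5, 0, 0, 0, 0, 2, 0, 0, 7)

is_zero <- v == 0                      # TRUE where the element is zero
changed <- c(0, diff(is_zero)) != 0    # TRUE where zero/nonzero status flips
group_id <- cumsum(changed)            # running count of flips = group id

print(rbind(v, group_id))
# group_id is 0 1 1 1 1 2 3 3 4
```

Note that this groups elements by zero versus nonzero status only, so consecutive nonzero values that differ from each other still share one group, which is why Groupid takes the values 1, 3, 5 in the output above.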
BENY answered Sep 21 '22 22:09