Create group ID for runs of non-zero values

Tags: r, vector, grouping

I would like to find contiguous runs of non-zero elements in a vector (runs are separated by at least one zero) and assign an ID to each run, using consecutive integers.

Toy vector:

value <- c(1, 1, 2, 3, 4, 3, 0, 0, 0, 1, 2, 3, 9, 8, 0, 0, 3, 2)

In this example, there are three runs of non-zero values: [1,1,2,3,4,3], [1,2,3,9,8], [3,2], separated by chunks of one or more zeros.

Each non-zero run should get a unique ID: 1, 2, 3, and so on. Zeros should get NA as their ID:

   value id
1      1  1
2      1  1
3      2  1
4      3  1
5      4  1
6      3  1
7      0 NA
8      0 NA
9      0 NA
10     1  2
11     2  2
12     3  2
13     9  2
14     8  2
15     0 NA
16     0 NA
17     3  3
18     2  3
quarandoo asked Mar 31 '17 21:03



3 Answers

You can try:

as.integer(factor(cumsum(value==0)*NA^(value==0)))
#[1]  1  1  1  1  1  1 NA NA NA  2  2  2  2  2 NA NA  3  3
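
To see why the one-liner works, it can help to unpack it into steps. This is only a sketch of the same idea; the intermediate names (`is_zero`, `zero_count`, `mask`, `run_key`) are mine and do not appear in the answer.

```r
value <- c(1, 1, 2, 3, 4, 3, 0, 0, 0, 1, 2, 3, 9, 8, 0, 0, 3, 2)

is_zero <- value == 0

# Running count of zeros seen so far: constant within each non-zero run
zero_count <- cumsum(is_zero)
# 0 0 0 0 0 0 1 2 3 3 3 3 3 3 4 5 5 5

# NA^TRUE is NA^1 = NA, while NA^FALSE is NA^0 = 1, so this masks the zeros
mask <- NA^is_zero

# Each non-zero run now carries a distinct key (0, 3, 5); zeros are NA
run_key <- zero_count * mask

# factor() maps the sorted distinct keys to consecutive integer codes 1, 2, 3
id <- as.integer(factor(run_key))
id
# [1]  1  1  1  1  1  1 NA NA NA  2  2  2  2  2 NA NA  3  3
```

The keys themselves (0, 3, 5) are not consecutive, which is why the `factor()` step is needed to renumber them.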
nicola answered Nov 20 '22 06:11



Using rle(). First create a new vector that is 1 where value is non-zero and NA where it is zero, then number the runs of non-NA elements.

x <- match(value != 0, TRUE)
with(rle(!is.na(x)), {
    lv <- lengths[values]
    replace(x, !is.na(x), rep(seq_along(lv), lv))
})
# [1]  1  1  1  1  1  1 NA NA NA  2  2  2  2  2 NA NA  3  3
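
If it is not obvious what `rle()` contributes here, looking at its output on the toy vector may help. A small sketch; the names `r` and `run_lengths` are just for illustration.

```r
value <- c(1, 1, 2, 3, 4, 3, 0, 0, 0, 1, 2, 3, 9, 8, 0, 0, 3, 2)

r <- rle(value != 0)
r$lengths
# [1] 6 3 5 2 2
r$values
# [1]  TRUE FALSE  TRUE FALSE  TRUE

# Keep only the lengths of the non-zero (TRUE) runs
run_lengths <- r$lengths[r$values]
run_lengths
# [1] 6 5 2

# Repeat each run's index as many times as the run is long
rep(seq_along(run_lengths), run_lengths)
# [1] 1 1 1 1 1 1 2 2 2 2 2 3 3
```

That last vector has one entry per non-zero element, so it can be dropped into the non-NA positions in order.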
Rich Scriven answered Nov 20 '22 07:11



You could also do this:

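id <- (value != 0)^NA        # 1 for non-zero, NA for zero (1^NA is 1, 0^NA is NA)
r <- rle(value != 0)
x <- r$lengths[r$values]     # lengths of the non-zero runs only, even if the vector starts with a zero
id[!is.na(id)] <- rep(seq_along(x), times = x)
id
#[1]  1  1  1  1  1  1 NA NA NA  2  2  2  2  2 NA NA  3  3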
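
Whichever approach you pick, it may be worth checking it against inputs the toy vector does not cover, such as leading zeros or negative values. Below is a minimal sketch of an rle-based variant wrapped in a function; `run_id` is a hypothetical name of mine, not something from the answers or from base R.

```r
# Hypothetical helper: NA for zeros, consecutive integers for each
# run of non-zero values
run_id <- function(v) {
  r <- rle(v != 0)
  # Number the non-zero runs 1, 2, ...; mark the zero runs as NA
  r$values <- ifelse(r$values, cumsum(r$values), NA_integer_)
  # Expand the run-level IDs back to one entry per element
  inverse.rle(r)
}

run_id(c(0, 0, 5, 7, 0, -1, 2))   # leading zeros and a negative value
# [1] NA NA  1  1 NA  2  2

run_id(c(1, 1, 2, 3, 4, 3, 0, 0, 0, 1, 2, 3, 9, 8, 0, 0, 3, 2))
# [1]  1  1  1  1  1  1 NA NA NA  2  2  2  2  2 NA NA  3  3
```

Testing with `!= 0` rather than `> 0` is a deliberate choice here so that negative values count as non-zero, which matches the wording of the question.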
989 answered Nov 20 '22 07:11
