
fill values between interval grouped by ID

Tags:

r

I have a data set where subjects have a value of 1 or 0 at different times. I need a function or a piece of code that, within each subject (ID), fills the 0 values between the first and last 1 with 1.

I have tried complete() and fill(), but they do not do what I want.

I have the following data:

library(tibble)

dat = tibble(ID = c(1,1,1,1,1,1,1,1,1,1,
                    2,2,2,2,2,2,2,2,2,2,
                    3,3,3,3,3,3,3,3,3,3),
             TIME = c(1,2,3,4,5,6,7,8,9,10,
                      1,2,3,4,5,6,7,8,9,10,
                      1,2,3,4,5,6,7,8,9,10),
             DV = c(0,0,1,1,0,0,1,0,0,0,
                    0,1,0,0,0,0,0,0,0,1,
                    0,1,0,1,0,1,0,1,0,0))

# A tibble: 30 x 3
      ID  TIME    DV
   <dbl> <dbl> <dbl>
 1     1     1     0
 2     1     2     0
 3     1     3     1
 4     1     4     1
 5     1     5     0
 6     1     6     0
 7     1     7     1
 8     1     8     0
 9     1     9     0
10     1    10     0
# ... with 20 more rows

I need the following output as shown in DV2:

dat = tibble(ID = c(1,1,1,1,1,1,1,1,1,1,
                    2,2,2,2,2,2,2,2,2,2,
                    3,3,3,3,3,3,3,3,3,3),
             TIME = c(1,2,3,4,5,6,7,8,9,10,
                      1,2,3,4,5,6,7,8,9,10,
                      1,2,3,4,5,6,7,8,9,10),
             DV = c(0,0,1,1,0,0,1,0,0,0,
                    0,1,0,0,0,0,0,0,0,1,
                    0,1,0,1,0,1,0,1,0,0),
             DV2 = c(0,0,1,1,1,1,1,0,0,0,
                    0,1,1,1,1,1,1,1,1,1,
                    0,1,1,1,1,1,1,1,0,0))

# A tibble: 30 x 4
      ID  TIME    DV   DV2
   <dbl> <dbl> <dbl> <dbl>
 1     1     1     0     0
 2     1     2     0     0
 3     1     3     1     1
 4     1     4     1     1
 5     1     5     0     1
 6     1     6     0     1
 7     1     7     1     1
 8     1     8     0     0
 9     1     9     0     0
10     1    10     0     0
# ... with 20 more rows
Mario González Sales asked Feb 07 '26 03:02



1 Answer

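With dplyr (plus rowid_to_column() from tibble), you can do:

library(dplyr)
library(tibble)  # provides rowid_to_column()

# Assumes every ID has at least one row with DV == 1
dat %>%
 rowid_to_column() %>%   # add a running row index across the whole table
 group_by(ID) %>%
 # flag rows from the first to the last DV == 1 within each ID
 mutate(DV2 = if_else(rowid %in% min(rowid[DV == 1]):max(rowid[DV == 1]),
                      1, 0)) %>%
 ungroup() %>%
 select(-rowid)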

      ID  TIME    DV   DV2
   <dbl> <dbl> <dbl> <dbl>
 1     1     1     0     0
 2     1     2     0     0
 3     1     3     1     1
 4     1     4     1     1
 5     1     5     0     1
 6     1     6     0     1
 7     1     7     1     1
 8     1     8     0     0
 9     1     9     0     0
10     1    10     0     0
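
As a possible alternative, here is a minimal base R sketch of the same idea using cumulative sums instead of row indices. The helper name fill_between_ones is hypothetical (it is not part of any package), and the sketch assumes the rows are already sorted by TIME within each ID. Unlike a min()/max() range, it should simply return all zeros for an ID that never has a 1.

```r
# Hypothetical helper: mark every position between the first and last 1.
fill_between_ones <- function(x) {
  seen_before <- cumsum(x == 1) > 0            # a 1 occurs at or before this position
  seen_after  <- rev(cumsum(rev(x == 1))) > 0  # a 1 occurs at or after this position
  as.numeric(seen_before & seen_after)
}

# Same data as in the question, built with base R only.
dat <- data.frame(
  ID   = rep(1:3, each = 10),
  TIME = rep(1:10, times = 3),
  DV   = c(0,0,1,1,0,0,1,0,0,0,
           0,1,0,0,0,0,0,0,0,1,
           0,1,0,1,0,1,0,1,0,0)
)

# ave() applies the helper within each ID and keeps the original row order.
dat$DV2 <- ave(dat$DV, dat$ID, FUN = fill_between_ones)
```

The same helper could presumably be dropped into the dplyr pipeline as group_by(ID) %>% mutate(DV2 = fill_between_ones(DV)), which avoids the temporary rowid column.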
tmfmnk answered Feb 09 '26 08:02
