I am literally stuck on this. The df1 has the following variables:
serial = Group of people 
id1 = the person from the group (eg. 12 (serial) 1 (id1) =group 12 person 1; 12 2 = group 12 person 2, etc. ) 
'Day'when the first (or start) recording was made.
The days consist of equal number of observations (eg.95)
        day1 (Monday)  =  day11-day196 
        day2 (Tuesday) = day21-day296     
        day3 (Wednesday) =  day31-day396   
        day4 (Thursday) =  day41-day496   
        day5 (Friday) = day51-day596      
        day6 (Saturday) = day61-day696   
        day7 (Sunday) =  day71-day796  
Example of df1
serial id1  Day     day1 day2 day3 day4 day5 day6 day7
12      1   Monday    2    1    2    1    1    3    1
123     1   Tuesday   0    3    0    3    3    0    3
10      1   Wednesday 0    3    3    3    3    3    3
I would like to identify the consecutive records (there is no gap between the daily records) and the total amount of the records.
The starting day for consecutive recordings is the 'Day` variable. For example a consecutive record would be serial 12. Recording started on Monday and there are records (at leas one from 95 variable) during the week. During the week (7 x 95 variable) there were made 11 records
A non-consecutive record would be id 123 as the there is a gap day on day3 and day6. Record started on Tuesday and there is a gap on Wednesday and Saturday.
Finally I would like to record the duration of the consecutive recording.
Sample output:
 serial  id1   Duration Occurance        Days
12       1      11        7        day1 day2 day3 day4 day5 day6 day7
123      1      12        0        0
10       1      18        5        day3 day4 day5 day6 day7
Sample data
structure(list(serial = c(12, 123, 10), id1 = c(1, 1, 1), Day = structure(1:3, .Label = c("Monday",
"Tuesday", "Wednesday"), class = "factor"), day1 = c(2, 0, 0),
day2 = c(1, 3, 3), day3 = c(2, 0, 3), day4 = c(1, 3, 3),
day5 = c(1, 3, 3), day6 = c(3, 0, 3), day7 = c(1, 3, 3)), row.names = c(NA,
3L), class = "data.frame")
Similar post R - identify consecutive sequences
We can use rleid from data.table to get the 'Occurance' correct
library(data.table)
wkdays <- c("Monday", "Tuesday", "Wednesday", "Thursday", 
"Friday", "Saturday", "Sunday")
out1 <-  do.call(rbind, Map(function(x, y) {
              i1 <- match(y, wkdays): length(x)
              i2 <- x[i1] != 0
              i3 <- all(i2)
              grp1 <- rleid(i2)
              Days <- if(i3) tapply(names(x)[i1][i2], grp1[i2], FUN = paste, collapse= ' ') else ''
             Occurance <- if(i3) length(grp1[i2]) else 0
             data.frame(Occurance, Days)
            }, asplit(df[-(1:3)], 1), df$Day))
 out1$Duration <- rowSums(df1[startsWith(names(df1), 'day')])
 out1
 # Occurance                               Days Duration
 #1         7 day1 day2 day3 day4 day5 day6 day7       11
 #2         0                                          12
 #3         5           day3 day4 day5 day6 day7       18
                        If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With