I am literally stuck on this. The data frame df1 has the following variables:
serial = group of people
id1 = the person within the group (e.g. serial 12, id1 1 = group 12, person 1; 12 2 = group 12, person 2, etc.)
Day = the day when the first (or start) recording was made
Each day consists of an equal number of observations (e.g. 95):
day1 (Monday) = day11-day196
day2 (Tuesday) = day21-day296
day3 (Wednesday) = day31-day396
day4 (Thursday) = day41-day496
day5 (Friday) = day51-day596
day6 (Saturday) = day61-day696
day7 (Sunday) = day71-day796
Example of df1:
serial id1 Day       day1 day2 day3 day4 day5 day6 day7
12     1   Monday    2    1    2    1    1    3    1
123    1   Tuesday   0    3    0    3    3    0    3
10     1   Wednesday 0    3    3    3    3    3    3
I would like to identify the consecutive records (i.e. there is no gap between the daily records) and the total number of records.
The starting day for consecutive recordings is given by the 'Day' variable. For example, serial 12 is a consecutive record: recording started on Monday and there are records (at least one of the 95 variables) on every day of the week. During the week (7 x 95 variables), 11 records were made.
Serial 123 is a non-consecutive record, as there are gaps on day3 and day6: recording started on Tuesday and there are gaps on Wednesday and Saturday.
Finally, I would like to record the duration of the consecutive recording.
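To make the gap rule above concrete, here is a minimal base R sketch for a single row (serial 123, starting on Tuesday). The variable names (x, has_record, runs, gap_days) are illustrative only and not part of df1.

```r
# Daily totals for serial 123, from its start day (Tuesday = day2) onward
x <- c(day2 = 3, day3 = 0, day4 = 3, day5 = 3, day6 = 0, day7 = 3)

# TRUE where a day has at least one record
has_record <- x != 0

# Run-length encoding shows how the week splits into runs of
# recorded and empty days
runs <- rle(unname(has_record))

# Days without any record are the gaps
gap_days <- names(x)[!has_record]

# The recording is consecutive only if no day from the start day on is empty
consecutive <- all(has_record)

print(gap_days)
print(consecutive)
```

For this row the gaps fall on day3 and day6, so the record does not count as consecutive.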
Sample output:
serial id1 Duration Occurance Days
12     1   11       7         day1 day2 day3 day4 day5 day6 day7
123    1   12       0         0
10     1   18       5         day3 day4 day5 day6 day7
Sample data:
df1 <- structure(list(serial = c(12, 123, 10), id1 = c(1, 1, 1), Day = structure(1:3, .Label = c("Monday",
"Tuesday", "Wednesday"), class = "factor"), day1 = c(2, 0, 0),
    day2 = c(1, 3, 3), day3 = c(2, 0, 3), day4 = c(1, 3, 3),
    day5 = c(1, 3, 3), day6 = c(3, 0, 3), day7 = c(1, 3, 3)), row.names = c(NA,
3L), class = "data.frame")
Similar post: R - identify consecutive sequences
We can use rleid from data.table to get the 'Occurance' correct:
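library(data.table)
wkdays <- c("Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday")
out1 <- do.call(rbind, Map(function(x, y) {
    # positions of the day columns from the starting day to the end of the week
    i1 <- match(y, wkdays):length(x)
    # TRUE where the day has at least one record
    i2 <- x[i1] != 0
    # the recording is consecutive only if no day in that span is empty
    i3 <- all(i2)
    # run ids for stretches of recorded / empty days
    grp1 <- rleid(i2)
    Days <- if(i3) tapply(names(x)[i1][i2], grp1[i2], FUN = paste, collapse = ' ') else ''
    Occurance <- if(i3) length(grp1[i2]) else 0
    data.frame(Occurance, Days)
  }, asplit(df1[-(1:3)], 1), df1$Day))
# total number of records over the whole week
out1$Duration <- rowSums(df1[startsWith(names(df1), 'day')])
out1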
#  Occurance                               Days Duration
#1         7 day1 day2 day3 day4 day5 day6 day7       11
#2         0                                          12
#3         5           day3 day4 day5 day6 day7       18
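Because a row only counts as consecutive when every day from the start day onward has a record, the same result can also be sketched in base R without rleid. This is an alternative sketch, not the method used above; the helper summarise_row and the vector day_cols are hypothetical names introduced for illustration, and df1 is rebuilt here from the sample data so the block runs on its own.

```r
wkdays <- c("Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday")

# Sample data from the question
df1 <- data.frame(
  serial = c(12, 123, 10), id1 = c(1, 1, 1),
  Day = factor(c("Monday", "Tuesday", "Wednesday")),
  day1 = c(2, 0, 0), day2 = c(1, 3, 3), day3 = c(2, 0, 3),
  day4 = c(1, 3, 3), day5 = c(1, 3, 3), day6 = c(3, 0, 3),
  day7 = c(1, 3, 3)
)

day_cols <- paste0("day", 1:7)

# Hypothetical helper: summarise one row given its daily totals and start day
summarise_row <- function(counts, start_day) {
  # positions from the start day to the end of the week
  idx <- match(start_day, wkdays):length(counts)
  if (all(counts[idx] != 0)) {
    # no gap: report the number of days and their names
    data.frame(Occurance = length(idx),
               Days = paste(day_cols[idx], collapse = " "))
  } else {
    # at least one empty day: not a consecutive recording
    data.frame(Occurance = 0L, Days = "")
  }
}

res <- do.call(rbind, lapply(seq_len(nrow(df1)), function(i) {
  summarise_row(unlist(df1[i, day_cols]), as.character(df1$Day[i]))
}))

# total number of records over the whole week
res$Duration <- rowSums(df1[day_cols])

out2 <- cbind(df1[c("serial", "id1")], res)
print(out2)
```

Unlike the rleid version, this sketch does not keep run ids, so it would need extending if partial runs (for example the longest stretch without a gap) were of interest.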