My simplified data looks like this:
set.seed(1453); x = sample(0:1, 10, TRUE)
date = c('2016-01-01', '2016-01-05', '2016-01-07', '2016-01-12', '2016-01-16', '2016-01-20',
'2016-01-20', '2016-01-25', '2016-01-26', '2016-01-31')
df = data.frame(x, date = as.Date(date))
df
x date
1 2016-01-01
0 2016-01-05
1 2016-01-07
0 2016-01-12
0 2016-01-16
1 2016-01-20
1 2016-01-20
0 2016-01-25
0 2016-01-26
1 2016-01-31
I'd like to calculate the number of occurrences of x == 1 within a specified time period, e.g. 14 and 30 days from the current date (but excluding the current entry if it is x == 1). The desired output would look like this:
solution
x date x_plus14 x_plus30
1 2016-01-01 1 3
0 2016-01-05 1 4
1 2016-01-07 2 3
0 2016-01-12 2 3
0 2016-01-16 2 3
1 2016-01-20 2 2
1 2016-01-20 1 1
0 2016-01-25 1 1
0 2016-01-26 1 1
1 2016-01-31 0 0
Ideally, I'd like this to be in dplyr, but it is not a must. Any ideas how to achieve this? Thanks a lot for your help!
Adding another approach based on findInterval:
cs = cumsum(df$x) # cumulative number of occurrences
data.frame(df,
plus14 = cs[findInterval(df$date + 14, df$date, left.open = TRUE)] - cs,
plus30 = cs[findInterval(df$date + 30, df$date, left.open = TRUE)] - cs)
# x date plus14 plus30
#1 1 2016-01-01 1 3
#2 0 2016-01-05 1 4
#3 1 2016-01-07 2 3
#4 0 2016-01-12 2 3
#5 0 2016-01-16 2 3
#6 1 2016-01-20 2 2
#7 1 2016-01-20 1 1
#8 0 2016-01-25 1 1
#9 0 2016-01-26 1 1
#10 1 2016-01-31 0 0
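To sanity-check the findInterval result, here is a minimal brute-force sketch that counts the occurrences row by row and compares them with the cumulative-sum version. A few assumptions: `count_ahead` is a hypothetical helper name (not from any package), the x values are typed in literally from the table in the question because the output of `sample()` under a given seed can differ between R versions, and "within N days" is read the way the findInterval code treats it, i.e. later rows whose date is strictly less than the current date + N.

```r
# Data entered literally from the printed table in the question
x <- c(1, 0, 1, 0, 0, 1, 1, 0, 0, 1)
date <- as.Date(c('2016-01-01', '2016-01-05', '2016-01-07', '2016-01-12',
                  '2016-01-16', '2016-01-20', '2016-01-20', '2016-01-25',
                  '2016-01-26', '2016-01-31'))
df <- data.frame(x, date)

# count_ahead is a hypothetical helper: for row i it counts x == 1 among
# the rows after i whose date is strictly less than date[i] + days.
count_ahead <- function(x, date, days) {
  n <- length(x)
  sapply(seq_len(n), function(i) {
    later <- seq_len(n) > i                 # excludes the current row
    sum(x[later & date < date[i] + days] == 1)
  })
}

brute14 <- count_ahead(df$x, df$date, 14)
brute30 <- count_ahead(df$x, df$date, 30)

# The findInterval version from the answer, for comparison.
# With left.open = TRUE, findInterval returns how many dates are
# strictly less than each date + N (the dates must be sorted).
cs <- cumsum(df$x)
fast14 <- cs[findInterval(df$date + 14, df$date, left.open = TRUE)] - cs
fast30 <- cs[findInterval(df$date + 30, df$date, left.open = TRUE)] - cs

# Both approaches should agree on this data
stopifnot(all(brute14 == fast14), all(brute30 == fast30))
```

The brute-force loop is quadratic in the number of rows, so it is only meant as a check on small data. Note that both versions rely on the rows being sorted by date, and that rows sharing a date are handled by row order (the first 2016-01-20 row counts the second one, but not the other way round).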