I have a large data.frame structured as:
date torn_count torn_dpi
31-12-2014 1 0.001
28-12-2014 3 0.003
27-12-2014 1 0.001
22-12-2014 1 0.001
21-12-2014 1 0.001
20-12-2014 1 0.001
19-12-2014 1 0.001
18-12-2014 1 0.001
17-12-2014 3 0.003
15-12-2014 2 0.002
14-12-2014 1 0.001
13-12-2014 1 0.002
11-12-2014 5 0.104
10-12-2014 3 0.003
09-12-2014 1 0.001
08-12-2014 5 0.104
01-12-2014 5 0.014
30-11-2014 2 0.002
29-11-2014 1 0.001
26-11-2014 1 0.001
25-11-2014 2 0.045
24-11-2014 2 1.83
I am trying to add an index that has the same value over consecutive days and is increasing over non-consecutive days. Something like:
date torn_count torn_dpi index
31/12/2014 1 0.001 1
28/12/2014 3 0.003 2
27/12/2014 1 0.001 2
22/12/2014 1 0.001 3
21/12/2014 1 0.001 3
20/12/2014 1 0.001 3
19/12/2014 1 0.001 3
18/12/2014 1 0.001 3
17/12/2014 3 0.003 3
15/12/2014 2 0.002 4
14/12/2014 1 0.001 4
13/12/2014 1 0.002 4
Any help on the above problem would be greatly appreciated.
We can use data.table. We convert the 'data.frame' to 'data.table' (setDT(df1)), convert the 'date' to Date class, get the difference between adjacent element, check if that is not equal to 1, and do the cumsum to create the 'index' column.
library(data.table)
setDT(df1)[, index:= cumsum(c(TRUE,abs(diff(as.Date(date, '%d-%m-%Y')))!=1))]
head(df1, 12)
# date torn_count torn_dpi index
# 1: 31-12-2014 1 0.001 1
# 2: 28-12-2014 3 0.003 2
# 3: 27-12-2014 1 0.001 2
# 4: 22-12-2014 1 0.001 3
# 5: 21-12-2014 1 0.001 3
# 6: 20-12-2014 1 0.001 3
# 7: 19-12-2014 1 0.001 3
# 8: 18-12-2014 1 0.001 3
# 9: 17-12-2014 3 0.003 3
#10: 15-12-2014 2 0.002 4
#11: 14-12-2014 1 0.001 4
#12: 13-12-2014 1 0.002 4
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With