I want to calculate how long it's been since something occurred.
Given the following data, the light is on some of the time, but not all of the time. I want to normalize the data to feed it to a neural network.
library(data.table)
d<-data.table(
date = c("6/1/2013", "6/2/2013","6/3/2013","6/4/2013"),
light = c(TRUE,FALSE,FALSE,TRUE)
)
d
date light
1: 6/1/2013 TRUE
2: 6/2/2013 FALSE
3: 6/3/2013 FALSE
4: 6/4/2013 TRUE
What I'd like to calculate is another column that shows the "distance" to the last occurrence.
So for the data above: the first row should be 0 (the light is on), the second row should be 1, the third row should be 2, and the fourth row should be 0 again.
I would suggest creating a grouping column in which a new group starts at every row where light is TRUE:
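# create group column
# number the TRUE rows 1, 2, 3, ...; FALSE rows are left as NA
d[c(light), group := cumsum(light)]
# FALSE rows get 0 so they do not change the running total
d[is.na(group), group := 0L]
# running total: each TRUE row starts a new group id, FALSE rows inherit it
d[, group := cumsum(group)]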
d
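The three statements above amount to starting a new group at every TRUE row. As a sketch that is not part of the original answer, the same partition appears to be obtainable with a single cumsum(light); the column name group2 and the helper vectors dist1 and dist2 are hypothetical and used only for the comparison.

```r
library(data.table)

d <- data.table(
  date  = c("6/1/2013", "6/2/2013", "6/3/2013", "6/4/2013"),
  light = c(TRUE, FALSE, FALSE, TRUE)
)

# Three-step grouping, as in the answer
d[c(light), group := cumsum(light)]
d[is.na(group), group := 0L]
d[, group := cumsum(group)]

# One-step alternative: running count of TRUE rows (hypothetical column group2)
d[, group2 := cumsum(light)]

# The group labels differ, but both split the rows the same way,
# so the per-group running count of FALSE rows should match
dist1 <- d[, cumsum(!light), by = group]$V1
dist2 <- d[, cumsum(!light), by = group2]$V1
identical(dist1, dist2)
```

Either grouping puts rows that come before the first TRUE into a group of their own, so their "distance" counts from the start of the data rather than from a real occurrence.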
Then simply tally by group, using cumsum and negating light:
d[, distance := cumsum(!light), by=group]
# remove the group column for cleanliness
d[, group := NULL]
d
date light distance
1: 2013-06-01 TRUE 0
2: 2013-06-02 FALSE 1
3: 2013-06-03 FALSE 2
4: 2013-06-04 TRUE 0
5: 2013-06-05 TRUE 0
6: 2013-06-06 FALSE 1
7: 2013-06-07 FALSE 2
8: 2013-06-08 TRUE 0
I added a few rows to the sample data to show that the count restarts each time the light is on.
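The distance above counts rows, which equals elapsed days only when there is exactly one row per day. If the dates had gaps, a variant could measure days instead. This is a minimal sketch, assuming the date strings are month/day/year; the irregular dates, the table name d2, and the columns days_since and grp are made up for illustration.

```r
library(data.table)

# Hypothetical data with gaps between observations
d2 <- data.table(
  date  = as.Date(c("6/1/2013", "6/2/2013", "6/5/2013", "6/9/2013"),
                  format = "%m/%d/%Y"),
  light = c(TRUE, FALSE, FALSE, TRUE)
)

# Each group starts at a TRUE row, so date[1L] is the last date the light was on
d2[, days_since := as.integer(date - date[1L]), by = .(grp = cumsum(light))]

# Rows before the first TRUE have no previous occurrence, so mark them NA
d2[cumsum(light) == 0L, days_since := NA_integer_]

d2
```

Grouping by cumsum(light) is the same idea as the group column in the answer; only the quantity computed within each group changes.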