I feel like this is a fairly easy question, but for the life of me I can't seem to find the answer. I have a fairly standard dataframe, and what I am trying to do is sum the a column of values until they reach some value (either that exact value or greater than it), at which point it drops a 1 into a new column (labelled keep) and restarts the summing at 0.
I have a column of minutes, the differences between the minutes, a keep column, and a cumulative sum column (the example I am using is much cleaner than the actual full dataset)
minutes difference keep difference_sum
1052991158 0 0 0
1052991338 180 0 180
1052991518 180 0 360
1052991698 180 0 540
1052991878 180 0 720
1052992058 180 0 900
1052992238 180 0 1080
1052992418 180 0 1260
1052992598 180 0 1440
1052992778 180 0 1620
1052992958 180 0 1800
The difference sum column was calculated with the code
caribou.sub$difference_sum<-cumsum(difference)
What I would like to do is run the above code with the condition that, when the summed value reaches either 1470 or any number greater than that it puts a 1 in the keep column and then restarts summing afterwards, and continues running throughout the dataset.
Thanks in advance, and if you need any more information let me know.
Ayden
I still don't understand about when the sum should restart and if it should be zero then. A desired result would help greatly.
Nonetheless, I can't help but think that simply indexing and subtraction would be a straightforward way of doing this. The below code gives the same result as @Henrik's solution.
df$difference_sum <- cumsum(df$difference)
step <- (df$difference_sum %/% 1470) + 1
k <- which(diff(step) > 0) + 1
df$keep <- 0
df$keep[k] <- 1
step[k] <- step[k] - 1
df$difference_sum <- df$difference_sum - c(0, df$difference_sum[k])[step]
I think this is best done with a for loop, can't think of a function that could do so out of the box. The following should do what you want (if I understand you correctly).
current.sum <- 0
for (c in 1:nrow(caribou.sub)) {
current.sum <- current.sum + caribou.sub[c, "difference"]
carribou.sub[c, "difference_sum"] <- current.sum
if (current.sum >= 1470) {
caribou.sub[c, "keep"] <- 1
current.sum <- 0
}
}
Feel free to comment if it does not exactly what you want. But as pointed out by alexwhan, your description is not completely clear.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With