I have the following dataframe
x y count
1 1 2018-02-24 4.031540
2 2 2018-02-25 5.244303
3 3 2018-02-26 5.441465
4 NA 2018-02-27 4.164104
5 5 2018-02-28 5.172919
6 6 2018-03-01 5.591410
7 7 2018-03-02 4.691716
8 8 2018-03-03 5.465360
9 9 2018-03-04 3.269378
10 NA 2018-03-05 5.300679
11 11 2018-03-06 5.489664
12 12 2018-03-07 4.423334
13 13 2018-03-08 3.808764
14 14 2018-03-09 6.450136
15 15 2018-03-10 5.541785
16 16 2018-03-11 4.762889
17 17 2018-03-12 5.511649
18 18 2018-03-13 6.795386
19 19 2018-03-14 6.615762
20 20 2018-03-15 4.749151
I want to take the cumsum
of the count column, but I want the the cumsum
to restart when the x value is NA
. I've tried the following:
df$cum_sum <- ifelse(is.na(df$x) == FALSE, cumsum(df$count), 0)
x y count cum_sum
1 1 2018-02-24 4.031540 4.031540
2 2 2018-02-25 5.244303 9.275843
3 3 2018-02-26 5.441465 14.717308
4 NA 2018-02-27 4.164104 0.000000
5 5 2018-02-28 5.172919 24.054331
6 6 2018-03-01 5.591410 29.645741
7 7 2018-03-02 4.691716 34.337458
8 8 2018-03-03 5.465360 39.802817
9 9 2018-03-04 3.269378 43.072195
10 NA 2018-03-05 5.300679 0.000000
11 11 2018-03-06 5.489664 53.862538
12 12 2018-03-07 4.423334 58.285871
13 13 2018-03-08 3.808764 62.094635
14 14 2018-03-09 6.450136 68.544771
15 15 2018-03-10 5.541785 74.086556
16 16 2018-03-11 4.762889 78.849445
17 17 2018-03-12 5.511649 84.361094
18 18 2018-03-13 6.795386 91.156480
19 19 2018-03-14 6.615762 97.772242
20 20 2018-03-15 4.749151 102.521394
The result is the cum_sum column is 0 at the NA values, but the cumsum doesn't reset. How can I fix this?
A possible solution:
dat$cum_sum <- ave(dat$count, cumsum(is.na(dat$x)), FUN = cumsum)
which gives:
> dat x y count cum_sum 1 1 2018-02-24 4.031540 4.031540 2 2 2018-02-25 5.244303 9.275843 3 3 2018-02-26 5.441465 14.717308 4 NA 2018-02-27 4.164104 4.164104 5 5 2018-02-28 5.172919 9.337023 6 6 2018-03-01 5.591410 14.928433 7 7 2018-03-02 4.691716 19.620149 8 8 2018-03-03 5.465360 25.085509 9 9 2018-03-04 3.269378 28.354887 10 NA 2018-03-05 5.300679 5.300679 11 11 2018-03-06 5.489664 10.790343 12 12 2018-03-07 4.423334 15.213677 13 13 2018-03-08 3.808764 19.022441 14 14 2018-03-09 6.450136 25.472577 15 15 2018-03-10 5.541785 31.014362 16 16 2018-03-11 4.762889 35.777251 17 17 2018-03-12 5.511649 41.288900 18 18 2018-03-13 6.795386 48.084286 19 19 2018-03-14 6.615762 54.700048 20 20 2018-03-15 4.749151 59.449199
Or with dplyr
:
library(dplyr)
dat %>%
group_by(grp = cumsum(is.na(x))) %>%
mutate(cum_sum = cumsum(count)) %>%
ungroup() %>%
select(-grp)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With