This is my first time asking a question so bear with me.
My dataset (df) is like so:
animal azimuth south distance
pb1 187.561 1 1.992
pb1 147.219 1 8.567
pb1 71.032 0 5.754
pb1 119.502 1 10.451
pb2 101.702 1 9.227
pb2 85.715 0 8.821
I want to create an additional column (df$cumdist) that holds the cumulative distance, computed within each individual animal and only where df$south==1. I want the cumulative sum to reset whenever df$south==0.
This is what I would like the result to be (done manually):
animal azimuth south distance cumdist
pb1 187.561 1 1.992 1.992
pb1 147.219 1 8.567 10.559
pb1 71.032 0 5.754 0
pb1 119.502 1 10.451 10.451
pb2 101.702 1 9.227 9.227
pb2 85.715 0 8.821 0
This is the code I tried for the cumulative sum (swim.az is the data frame shown above as df):
swim.az$cumdist <- cumsum(ifelse(swim.az$south==1, swim.az$distance, 0))
While it successfully stops adding when df$south==0, it does not reset. Additionally, I know I will need to embed this in a for loop to subset by animal.
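For reference, here is a minimal sketch of what that attempt returns on the sample data. It uses plain vectors; south and distance are standalone copies of the two columns, not the original data frame:

```r
# Standalone copies of the two relevant columns from the sample data
south    <- c(1, 1, 0, 1, 1, 0)
distance <- c(1.992, 8.567, 5.754, 10.451, 9.227, 8.821)

# Same expression as the attempt: rows with south == 0 contribute 0,
# but the running total carries on across them and across animals
attempt <- cumsum(ifelse(south == 1, distance, 0))
print(attempt)
# [1]  1.992 10.559 10.559 21.010 30.237 30.237
```

The total pauses on the south == 0 rows but never drops back to 0, and it keeps growing from pb1 into pb2.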
Thanks so much!
We multiply 'south' by 'distance' to create 'cumdist', which sets the value to 0 wherever 'south' is 0. We then group by 'animal' and by a helper column (grp) built from the cumulative sum of the logical vector (south == 0), take the cumsum of 'cumdist' within each group, ungroup, and remove the helper column (grp), which is no longer needed:
library(dplyr)
dfN %>%
  mutate(cumdist = south * distance) %>%
  group_by(animal, grp = cumsum(south == 0)) %>%
  mutate(cumdist = cumsum(cumdist)) %>%
  ungroup() %>%
  select(-grp)
# A tibble: 6 x 5
# animal azimuth south distance cumdist
# <chr> <dbl> <int> <dbl> <dbl>
#1 pb1 188. 1 1.99 1.99
#2 pb1 147. 1 8.57 10.6
#3 pb1 71.0 0 5.75 0
#4 pb1 120. 1 10.5 10.5
#5 pb2 102. 1 9.23 9.23
#6 pb2 85.7 0 8.82 0
Or a similar approach with base R, which returns the new column as a vector:
with(dfN, ave(distance * south, animal, cumsum(!south), FUN = cumsum))
#[1] 1.992 10.559 0.000 10.451 9.227 0.000
# Data used in the examples above
dfN <- structure(list(
  animal = c("pb1", "pb1", "pb1", "pb1", "pb2", "pb2"),
  azimuth = c(187.561, 147.219, 71.032, 119.502, 101.702, 85.715),
  south = c(1L, 1L, 0L, 1L, 1L, 0L),
  distance = c(1.992, 8.567, 5.754, 10.451, 9.227, 8.821)),
  class = "data.frame", row.names = c(NA, -6L))
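To see why grouping on cumsum(south == 0) makes the total reset, it may help to look at the intermediate group index by itself. The following is only an illustrative sketch in base R on the same sample values. The names grp and cumdist mirror the code above, and the vectors are standalone copies rather than columns of dfN:

```r
# Standalone copies of the sample columns
animal   <- c("pb1", "pb1", "pb1", "pb1", "pb2", "pb2")
south    <- c(1L, 1L, 0L, 1L, 1L, 0L)
distance <- c(1.992, 8.567, 5.754, 10.451, 9.227, 8.821)

# Running count of south == 0 rows: the index steps up at every reset row
grp <- cumsum(south == 0)
print(grp)
# [1] 0 0 1 1 1 2

# Cumulative sum of the zeroed-out distances within each (animal, grp) pair
cumdist <- ave(distance * south, animal, grp, FUN = cumsum)
print(cumdist)
# [1]  1.992 10.559  0.000 10.451  9.227  0.000
```

Rows 4 and 5 share grp == 1 even though they belong to different animals, which is why animal has to be part of the grouping as well.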
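library(data.table)
setDT(df)  # convert df to a data.table by reference
# Group by animal and by runs of identical south values. Within a run, south is
# constant, so multiplying by it keeps the running total (south == 1) or zeroes it (south == 0).
df[, cumdist := south * cumsum(distance), .(animal, rleid(south))]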
# animal azimuth south distance cumdist
# 1: pb1 187.561 1 1.992 1.992
# 2: pb1 147.219 1 8.567 10.559
# 3: pb1 71.032 0 5.754 0.000
# 4: pb1 119.502 1 10.451 10.451
# 5: pb2 101.702 1 9.227 9.227
# 6: pb2 85.715 0 8.821 0.000
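The data.table version leans on rleid(south), which numbers consecutive runs of identical values. As a rough illustration, the sketch below rebuilds that run index in base R with rle(). Note that run_id is a hypothetical helper written for this example, not part of data.table, and the vectors are standalone copies of the sample columns:

```r
# Hypothetical helper: one id per run of identical consecutive values
run_id <- function(x) {
  r <- rle(x)
  rep(seq_along(r$lengths), r$lengths)
}

animal   <- c("pb1", "pb1", "pb1", "pb1", "pb2", "pb2")
south    <- c(1L, 1L, 0L, 1L, 1L, 0L)
distance <- c(1.992, 8.567, 5.754, 10.451, 9.227, 8.821)

runs <- run_id(south)
print(runs)
# [1] 1 1 2 3 3 4

# Within an (animal, run) group south is constant, so multiplying the running
# total by south keeps it where south == 1 and zeroes it where south == 0
cumdist <- south * ave(distance, animal, runs, FUN = cumsum)
print(cumdist)
# [1]  1.992 10.559  0.000 10.451  9.227  0.000
```

Because every change in south starts a new run, the sum restarts after each south == 0 row without needing a separate reset step.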