Consider the following MWE:
df <- data.frame(Day=1:10, Value = c("Yes","No","Yes", "Yes", "Yes",
"No", "No", "Yes","Yes", "No"))
Day Value
1 Yes
2 No
3 Yes
4 Yes
5 Yes
6 No
7 No
8 Yes
9 Yes
10 No
I want an extra column that counts how many consecutive times 'Value' has been 'Yes' up to and including the current row. So when Value is 'No', the new variable should always be 0. If it is the first time 'Yes' appears after a 'No', it is set to 1. If the next observation is also 'Yes', it should be 2. As soon as the chain of 'Yes' is interrupted, the new variable for the next 'Yes' will be 1 again. So my data frame should look like this:
Day Value Count
1 Yes 1
2 No 0
3 Yes 1
4 Yes 2
5 Yes 3
6 No 0
7 No 0
8 Yes 1
9 Yes 2
10 No 0
Hope someone can help me out.
You can try using "data.table", specifically the rleid function:
Example:
library(data.table)
as.data.table(df)[, count := sequence(.N), by = rleid(Value)][Value == "No", count := 0][]
# Day Value count
# 1: 1 Yes 1
# 2: 2 No 0
# 3: 3 Yes 1
# 4: 4 Yes 2
# 5: 5 Yes 3
# 6: 6 No 0
# 7: 7 No 0
# 8: 8 Yes 1
# 9: 9 Yes 2
# 10: 10 No 0
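We can use base R as well. We create a grouping variable ('grp') by comparing adjacent elements of the 'Value' column and taking the cumsum of the resulting logical vector, so every run of identical values gets its own group id. Then ave numbers the rows within each group, and multiplying by (df$Value=='Yes') sets the count to 0 wherever 'Value' is 'No'.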
grp <- with(df, cumsum(c(TRUE,Value[-1L]!=Value[-length(Value)])))
df$count <- ave(seq_along(df$Value), grp, FUN=seq_along)*(df$Value=='Yes')
df$count
#[1] 1 0 1 2 3 0 0 1 2 0
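As a variation on the same idea (not part of the answers above, just a minimal sketch), the run lengths can also be taken from base R's rle, so no separate grouping variable is needed. It assumes 'Value' only ever contains "Yes" and "No" with no NA values, and it rebuilds df so the snippet runs on its own:

```r
# Rebuild the example data so this snippet is self-contained
df <- data.frame(Day = 1:10,
                 Value = c("Yes", "No", "Yes", "Yes", "Yes",
                           "No", "No", "Yes", "Yes", "No"))

# rle() gives the length of each run of identical values;
# as.character() makes this work whether Value is a factor or a character column
runs <- rle(as.character(df$Value))

# sequence() counts 1..n within each run; multiplying by the logical
# vector resets the count to 0 on every "No" row
df$count <- sequence(runs$lengths) * (df$Value == "Yes")

df$count
# [1] 1 0 1 2 3 0 0 1 2 0
```

This should give the same result as the ave approach; which one reads better is mostly a matter of taste.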