Consider the following MWE:
df <- data.frame(Day=1:10, Value = c("Yes","No","Yes", "Yes", "Yes",
"No", "No", "Yes","Yes", "No"))
Day Value
1 Yes
2 No
3 Yes
4 Yes
5 Yes
6 No
7 No
8 Yes
9 Yes
10 No
I want an extra column that counts how many consecutive times 'Value' has been 'Yes', up to and including the current row. So when Value is 'No', the new variable should always be 0. The first 'Yes' after a 'No' is set to 1. If the next observation is also 'Yes', it should be 2. As soon as the chain of 'Yes' is interrupted, the next 'Yes' starts at 1 again. So my data frame should look like this:
Day Value Count
1 Yes 1
2 No 0
3 Yes 1
4 Yes 2
5 Yes 3
6 No 0
7 No 0
8 Yes 1
9 Yes 2
10 No 0
Hope someone can help me out.
You can try the "data.table" package, specifically its rleid function, which assigns a run ID to each stretch of consecutive identical values. Example:
library(data.table)
as.data.table(df)[, count := sequence(.N), by = rleid(Value)][Value == "No", count := 0][]
# Day Value count
# 1: 1 Yes 1
# 2: 2 No 0
# 3: 3 Yes 1
# 4: 4 Yes 2
# 5: 5 Yes 3
# 6: 6 No 0
# 7: 7 No 0
# 8: 8 Yes 1
# 9: 9 Yes 2
# 10: 10 No 0
We can use base R as well. We create a grouping variable ('grp') by comparing the adjacent elements of the 'Value' column and taking the cumsum of the resulting logical index. This grouping variable can then be used in ave to create the sequence within each run.
grp <- with(df, cumsum(c(TRUE,Value[-1L]!=Value[-length(Value)])))
df$count <- ave(seq_along(df$Value), grp, FUN=seq_along)*(df$Value=='Yes')
df$count
#[1] 1 0 1 2 3 0 0 1 2 0
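To make the grouping step easier to follow, here is a small sketch that prints 'grp' for the example data and then gets the same counts from rle and sequence, both base R functions. This is an alternative to the ave call above, not the approach the answer itself uses, and the name 'runs' is only illustrative. The as.character call is there on the assumption that 'Value' may be a factor (the default in R before 4.0).

```r
df <- data.frame(Day = 1:10,
                 Value = c("Yes", "No", "Yes", "Yes", "Yes",
                           "No", "No", "Yes", "Yes", "No"))

# Run IDs: a new group starts whenever Value differs from the previous row
grp <- with(df, cumsum(c(TRUE, Value[-1L] != Value[-length(Value)])))
grp
#[1] 1 2 3 3 3 4 4 5 5 6

# rle() gives the length of each run; sequence() counts 1..n within each run
runs <- rle(as.character(df$Value))
df$count <- sequence(runs$lengths) * (df$Value == "Yes")  # zero out the "No" rows
df$count
#[1] 1 0 1 2 3 0 0 1 2 0
```

Multiplying by the logical vector df$Value == "Yes" keeps the within-run counter for 'Yes' rows and sets it to 0 for 'No' rows, the same trick used with ave above.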