I have a character string ("00010000") and need to identify the position of the first "1". (This tells me the first month in which a customer is active.)
I have a dataset that looks like this:
id <- c(1:5)
seq <- c("00010000","00001000","01000000","10000000","00010000")
df <- data.frame(id,seq)
I would like to create a new field identifying the first_month_active for each id.
I can do this manually with a nested ifelse function:
df$first_month_active <-
ifelse(substr(df$seq,1,1)=="1",1,
ifelse(substr(df$seq,2,2)=="1",2,
ifelse(substr(df$seq,3,3)=="1",3,
ifelse(substr(df$seq,4,4)=="1",4,
ifelse(substr(df$seq,5,5)=="1",5,99 )))))
Which gives me the desired result:
id seq first_month_active
1 00010000 4
2 00001000 5
3 01000000 2
4 10000000 1
5 00010000 4
However, this is not a workable solution for my real data, which covers 36 months.
I would like to use a loop with an ifelse statement, but I am struggling with the syntax:
for (i in 1:36) {
ifelse(substr(df$seq,0+i,0+i)=="1",0+i,
}
Any ideas would be greatly appreciated.
Or try the stringi package:
library(stringi)
stri_locate_first_fixed(df$seq, "1")[, 1]
## [1] 4 5 2 1 4
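If you would rather not add a package dependency, base R's regexpr() can do a similar job. The following is only a minimal sketch: months_chr is a hypothetical stand-in for df$seq, and the extra all-zero string is an assumed case added to show what happens when no "1" is present.

```r
# Hypothetical stand-in for df$seq, plus one all-zero string (assumption)
months_chr <- c("00010000", "00001000", "01000000", "10000000", "00010000", "00000000")

# regexpr() returns the start of the first match, or -1 when there is none;
# as.integer() drops the attributes that regexpr() attaches to its result
first_month_active <- as.integer(regexpr("1", months_chr, fixed = TRUE))

# Treat "never active" as missing rather than -1
first_month_active[first_month_active == -1L] <- NA_integer_

first_month_active
## [1]  4  5  2  1  4 NA
```

Because this works on the string directly, it should behave the same for 36-character sequences. If the column is a factor, wrap it in as.character() first.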
Skip the loop and the ifelse:
9 - nchar(as.numeric(seq))
## [1] 4 5 2 1 4
This won't work the same way on your data.frame, because data.frame() implicitly coerced seq to a factor (the default behaviour in R versions before 4.0.0), so convert it back to character first:
9 - nchar(as.numeric(as.character(df$seq)))
## [1] 4 5 2 1 4
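One caveat worth flagging: the trick above depends on how R formats the number as text, and as.character() switches to scientific notation for large round values (for example 1e+35), so it may not carry over to 36-character strings. A string-only variant in the same spirit is sketched below. The name months_36 and its example values are hypothetical, and the sketch assumes every string contains at least one "1".

```r
# Hypothetical 36-month sequences (assumption, not from the original data)
months_36 <- c(
  paste0(strrep("0", 11), "1", strrep("0", 24)),  # first "1" at position 12
  paste0("1", strrep("0", 35))                    # first "1" at position 1
)

# Remove everything from the first "1" to the end, leaving only the leading zeros
leading_zeros <- sub("1.*$", "", months_36)

# Position of the first "1" = number of leading zeros + 1
first_month_active <- nchar(leading_zeros) + 1L

first_month_active
## [1] 12  1
```

Note that a string with no "1" at all would come back as its length plus one (37 here), so you may want to handle that case separately.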
Edit: Just for fun, since Frank didn't convert his comment into an answer, here's a strsplit solution:
# from original vector
sapply(strsplit(seq, "1"), nchar)[1,] + 1
## [1] 4 5 2 1 4
# from data.frame
sapply(strsplit(as.character(df$seq), "1"), nchar)[1,] + 1
## [1] 4 5 2 1 4
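The version above relies on every string splitting into exactly two pieces, so that sapply() returns a matrix. A variation that splits into single characters and uses match() avoids that requirement. This is just a sketch: months_chr is a hypothetical vector, and the second and third values are assumed edge cases (a trailing "1" and no "1" at all).

```r
# Hypothetical input with two assumed edge cases
months_chr <- c("00010000", "00000001", "00000000")

# Split each string into its individual characters
chars <- strsplit(months_chr, "")

# match() gives the index of the first "1", or NA if there is none;
# vapply() guarantees one integer per input string
first_month_active <- vapply(chars, function(ch) match("1", ch), integer(1))

first_month_active
## [1]  4  8 NA
```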