Identify first match position in a string

I have a character string ("00010000") and need to find the position of the first "1". (This tells me the first month in which a customer is active.)

I have a dataset that looks like this:

id  <- c(1:5)
seq <- c("00010000","00001000","01000000","10000000","00010000")
df <- data.frame(id,seq)

I would like to create a new field identifying the first_month_active for each id.

I can do this manually with a nested ifelse function:

    df$first_month_active <-
        ifelse(substr(df$seq,1,1)=="1",1,
        ifelse(substr(df$seq,2,2)=="1",2,
        ifelse(substr(df$seq,3,3)=="1",3,       
        ifelse(substr(df$seq,4,4)=="1",4,
        ifelse(substr(df$seq,5,5)=="1",5,99 ))))) 

Which gives me the desired result:

  id  seq        first_month_active
  1   00010000   4
  2   00001000   5
  3   01000000   2
  4   10000000   1
  5   00010000   4

However, this is not an ideal solution for my data, which contains 36 months.

I would like to use a loop with an ifelse statement, but I am really struggling with the syntax:

for (i in 1:36) {
ifelse(substr(df$seq,0+i,0+i)=="1",0+i,
}

Any ideas would be greatly appreciated.

Chris L asked Mar 18 '15 13:03

2 Answers

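Or try the stringi package. stri_locate_first_fixed() returns a matrix of start and end positions, so the first column gives the position of the first match: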

library(stringi)
stri_locate_first_fixed(df$seq, "1")[, 1]
## [1] 4 5 2 1 4
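
For comparison, base R's regexpr() gives the same positions without an extra package. This is only a minimal sketch: it assumes seq is stored as character (hence stringsAsFactors = FALSE, which is not in the original code) and that strings without any "1" should become NA rather than the 99 used in the question.

```r
# Sample data from the question; stringsAsFactors = FALSE keeps seq as character
df <- data.frame(
  id  = 1:5,
  seq = c("00010000", "00001000", "01000000", "10000000", "00010000"),
  stringsAsFactors = FALSE
)

# regexpr() returns the 1-based position of the first match, or -1 if none
pos <- as.integer(regexpr("1", df$seq, fixed = TRUE))

# Map "no match" to NA (an assumption made here, not part of the question)
df$first_month_active <- ifelse(pos == -1L, NA_integer_, pos)

df$first_month_active
## [1] 4 5 2 1 4
```

Because regexpr() is vectorised over the strings, the same call should work unchanged for 36-character sequences.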
David Arenburg answered Oct 07 '22 00:10

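Skip the loop and the ifelse, and count the digits of the numeric value instead. (This assumes R renders the numbers in fixed rather than scientific notation, e.g. with options(scipen = 999); under the default settings a value such as 10000000 becomes "1e+07" and the count is off.)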

9 - nchar(as.numeric(seq))
## [1] 4 5 2 1 4

This won't work the same in your data.frame because data.frame() implicitly converted seq to a factor (the default before R 4.0.0), so convert it back to character first:

9 - nchar(as.numeric(as.character(df$seq)))
## [1] 4 5 2 1 4
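
The same digit-counting idea can also be kept entirely in text, which sidesteps number formatting and the hard-coded 9. This is a sketch that assumes the strings contain only 0s and 1s; seq_chr, trimmed and first_pos are illustrative names, not from the original.

```r
seq_chr <- c("00010000", "00001000", "01000000", "10000000", "00010000")

# Drop the leading zeros; what remains starts at the first "1"
trimmed <- sub("^0+", "", seq_chr)

# Position of the first "1" = number of leading zeros removed, plus one
first_pos <- nchar(seq_chr) - nchar(trimmed) + 1L

first_pos
## [1] 4 5 2 1 4
```

Note that a string with no "1" at all yields a position one past its end (9 for an 8-character string), so such cases would need to be handled separately.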

Edit: Just for fun, since Frank didn't convert his comment into an answer, here's a strsplit solution:

# from original vector
sapply(strsplit(seq, "1"), nchar)[1,] + 1
## [1] 4 5 2 1 4

# from data.frame
sapply(strsplit(as.character(df$seq), "1"), nchar)[1,] + 1
## [1] 4 5 2 1 4
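
The matrix indexing with [1,] relies on every string splitting into the same number of pieces, which holds for the sample data but not when a string contains more than one "1". A possible variant that takes only the first piece of each split is sketched below; the last string is changed to contain two 1s purely as an illustration, and seq_chr and first_piece_len are made-up names.

```r
seq_chr <- c("00010000", "00001000", "01000000", "10000000", "00010100")

# Keep only the first piece of each split, so there is always exactly one
# number per string regardless of how many "1"s it contains
first_piece_len <- vapply(
  strsplit(seq_chr, "1", fixed = TRUE),
  function(pieces) nchar(pieces[1]),
  integer(1)
)

# The first "1" sits right after the leading run of zeros
first_piece_len + 1L
## [1] 4 5 2 1 4
```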
Thomas answered Oct 07 '22 01:10