 

How to check if a string in one element of a dataframe row is in another element

Tags: r, grepl

I want to see how many email addresses contain the last name of the email's owner.

Each row in a dataframe contains a last name and an email address. I want to add a third column with a "yes" or a "no" indicating the presence of the last name in the email on that row.

Using a for loop works fine, but I can't help thinking there's probably a better R solution. Any suggestions on how to make this more elegant?

vec1 <- c("foo", "smith")
vec2 <- c("[email protected]", "[email protected]")

df <- data.frame(vec1,vec2)


for(i in 1:nrow(df)) {
  if (grepl(df$vec1[i], df$vec2[i]) == TRUE) {
    df$lastNameInEmail[i] <- "Yes"
  } else {
    df$lastNameInEmail[i] <- "No"
  }
}

   vec1       vec2 lastNameInEmail
1   foo [email protected]             Yes
2 smith  [email protected]              No
Mike N asked Nov 22 '25 12:11



2 Answers

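You can use stringr's str_detect. It is vectorised over both the string and the pattern, so passing vec1 directly compares each email with the last name on the same row. Collapsing the names with '|' would instead match any of the last names against every email.

stringr::str_detect(vec2, vec1)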
[1]  TRUE FALSE
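
The question asks for a "Yes"/"No" column rather than a logical vector. Here is a minimal sketch of one way to get there, assuming the stringr package is installed. The email addresses are hypothetical placeholders, since the real ones are not shown in this post. Wrapping the names in `fixed(..., ignore_case = TRUE)` is an optional choice, so that each name is matched literally and regardless of case.

```r
library(stringr)

# Hypothetical sample data; the real addresses are not shown in the post
df <- data.frame(
  vec1 = c("foo", "smith"),
  vec2 = c("foo@example.com", "Smyth@example.com"),
  stringsAsFactors = FALSE
)

# str_detect() is vectorised over both arguments, so row i of vec2 is
# checked against row i of vec1; fixed() treats each name as a literal
# string instead of a regular expression
hit <- str_detect(df$vec2, fixed(df$vec1, ignore_case = TRUE))

# Turn the logical vector into the requested "Yes"/"No" column
df$lastNameInEmail <- ifelse(hit, "Yes", "No")
df
```

If last names may contain regex metacharacters (for example a dot), literal matching of this kind is probably the safer default.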
BENY answered Nov 25 '25 04:11



Here is a version using base R functions which works for more than the two given rows:

vec1 <- c("foo", "smith", "jones", "bar")
vec2 <- c("[email protected]", "[email protected]", "[email protected]", "[email protected]")

df <- data.frame(vec1,vec2)

# Compare each last name with the email on the same row
df$lastNameInEmail <- sapply(seq_len(nrow(df)), function(i) {
  if (grepl(df$vec1[i], df$vec2[i])) "Yes" else "No"
})
df
   vec1       vec2 lastNameInEmail
1   foo [email protected]             Yes
2 smith  [email protected]              No
3 jones  [email protected]              No
4   bar [email protected]             Yes
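
`grepl()` uses only the first element of a vector pattern, which is why a row-by-row construct is needed in base R. `mapply()` can express that pairing a little more directly. The sketch below makes some assumptions: the addresses are hypothetical placeholders, and `fixed = TRUE` and `tolower()` are optional choices for literal, case-insensitive matching. It also counts the matches, since the original goal was to see how many addresses contain the owner's last name.

```r
# Hypothetical sample data; the real addresses are not shown in the post
df <- data.frame(
  vec1 = c("foo", "smith", "jones", "bar"),
  vec2 = c("foo@example.com", "smyth@example.com",
           "johns@example.com", "J.Bar@example.com"),
  stringsAsFactors = FALSE
)

# mapply() walks the two columns in parallel and calls grepl() once per
# (name, email) pair; fixed = TRUE matches the name as a literal string
hit <- mapply(
  function(name, email) grepl(tolower(name), tolower(email), fixed = TRUE),
  df$vec1, df$vec2,
  USE.NAMES = FALSE
)

df$lastNameInEmail <- ifelse(hit, "Yes", "No")

# Number of addresses that contain the owner's last name
sum(hit)
```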
makeyourownmaker answered Nov 25 '25 04:11
