
How can I use str_detect() in combination with the & operator?

When using str_detect(), you can use the | operator like so:

library(tidyverse)

example_df <- data.frame(
   letters = c("A B C", "C B E", "F C B", "A D E", "F G C")
)

example_df %>% filter(str_detect(letters, "B|C"))

And it will return all rows except the fourth (where letters = "A D E").

I want to do the same with str_detect() but looking for a combination of letters.

I imagined you could just replace the | operator with the & operator and the following would return all rows except the last two.

example_df <- data.frame(
   letters = c("A B C", "C B E", "F C B", "A D E", "F G C")
)

example_df %>% filter(str_detect(letters, "B&C"))

However, this doesn't work. Does anyone know how I can make this work using str_detect() or another tidyverse method? I can get it to work with grepl(), but I need a tidyverse solution.

Tom asked Dec 02 '25 22:12



1 Answer

You can do it using Perl-style "non-consuming lookahead":

example_df <- data.frame(
  letters = c("A B C", "C B E", "F C B", "A D E", "F G C", "B B E")
)

library(tidyverse)

example_df %>% filter(str_detect(letters, "(?=.*B)(?=.*C)"))
#>   letters
#> 1   A B C
#> 2   C B E
#> 3   F C B

Created on 2022-03-23 by the reprex package (v2.0.1)

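Each (?=...) group is a lookahead: the first checks that a B appears somewhere ahead without advancing through the string, and the second then checks, from the same position, that a C appears somewhere ahead. The pattern therefore matches only strings that contain both letters, in either order. str_detect() accepts this syntax by default, but if you wanted to do the same sort of thing with base R functions, you'd need the perl = TRUE option, e.g.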

grep("(?=.*B)(?=.*C)", example_df$letters, perl = TRUE, value = TRUE)
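If the lookahead pattern feels hard to read, a plainer alternative (not part of the original answer, just a sketch using the same example_df) is to call str_detect() once per letter and combine the resulting logical vectors with R's own & operator. This assumes dplyr and stringr are installed; the name `both` is only for illustration.

```r
library(dplyr)
library(stringr)

example_df <- data.frame(
  letters = c("A B C", "C B E", "F C B", "A D E", "F G C", "B B E")
)

# Each str_detect() call returns one TRUE/FALSE per row. `&` combines them
# element-wise, so a row is kept only when it contains both "B" and "C".
both <- example_df %>%
  filter(str_detect(letters, "B") & str_detect(letters, "C"))

both$letters
```

Passing the two conditions to filter() separated by a comma should give the same rows, since filter() combines multiple conditions with a logical AND.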
user2554330 answered Dec 04 '25 11:12
