
Match multiple patterns in any order in any location in a string

Tags: regex, r

What is the shortest way to use grep to match multiple patterns in any order, in any location in a string? Preferably using base R in one short line.

Here's an example:

I want to find all elements of my_vector that contain every pattern in my matches vector, in any order and at any position within the element, with any characters in between.

matches <- c("fe", "ve")

#                1    2    3      4        5       6       7       8      9
my_vector <- c("fv", "v", "f", "f_v_e", "fe_ve", "feve", "vefe", "fve" , "a")

# want 5, 6, 7 

I can do this:

# Build one lookahead per pattern, e.g. "(?=.*fe)(?=.*ve)"
grep(paste0("(?=.*", matches, ")", collapse = ""), 
     my_vector, 
     perl = TRUE)

[1] 5 6 7 

But is there a more concise method? In my example I have two elements to match, but my actual problem has several.

asked Jul 13 '16 at 04:07 by Ben


1 Answer

One option that avoids building a regex with paste is:

which(grepl(matches[1], my_vector) & grepl(matches[2], my_vector))
#[1] 5 6 7

To make it work for any number of patterns:

which(Reduce(`&`, lapply(matches, grepl, my_vector)))
#[1] 5 6 7

Or, as @Jota mentioned, grep can be used with intersect:

Reduce(intersect, lapply(matches, grep, my_vector))

If there are many elements in matches, the paste method may not work...
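
If you need this in more than one place, the Reduce approach can be wrapped in a small function. This is only a sketch: the name match_all is made up for illustration and is not part of base R. The extra arguments are passed on to grepl, so fixed = TRUE can be used when the patterns are literal strings rather than regular expressions.

```r
# Hypothetical helper (not part of base R): returns the indices of the
# elements of x that contain every pattern, in any order.
match_all <- function(patterns, x, ...) {
  # One logical vector per pattern; extra arguments such as fixed = TRUE
  # are passed through to grepl().
  hits <- lapply(patterns, function(p) grepl(p, x, ...))
  # Keep only the positions where every pattern matched.
  which(Reduce(`&`, hits))
}

matches <- c("fe", "ve")
my_vector <- c("fv", "v", "f", "f_v_e", "fe_ve", "feve", "vefe", "fve", "a")

match_all(matches, my_vector)                # patterns treated as regex
#[1] 5 6 7
match_all(matches, my_vector, fixed = TRUE)  # patterns treated as literal text
#[1] 5 6 7
```

Since each pattern is checked separately, no combined regex is built, so the number of patterns does not affect the size of any single pattern.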

answered Nov 14 '22 at 21:11 by akrun