
Use grepl to search for any of multiple substrings in a text [duplicate]

Tags: regex, r, grepl

I am using grepl() in R to check whether any of the following genres appears in my text. Right now I am doing it like this:

grepl("Action", my_text) |
grepl("Adventure", my_text) |
grepl("Animation", my_text) |
grepl("Biography", my_text) |
grepl("Comedy", my_text) |
grepl("Crime", my_text) |
grepl("Documentary", my_text) |
grepl("Drama", my_text) |
grepl("Family", my_text) |
grepl("Fantasy", my_text) |
grepl("Film-Noir", my_text) |
grepl("History", my_text) |
grepl("Horror", my_text) |
grepl("Music", my_text) |
grepl("Musical", my_text) |
grepl("Mystery", my_text) |
grepl("Romance", my_text) |
grepl("Sci-Fi", my_text) |
grepl("Sport", my_text) |
grepl("Thriller", my_text) |
grepl("War", my_text) |
grepl("Western", my_text)

Is there a better way to write this code? Can I put all the genres in an array and then somehow use grepl() on that?

user3422637 asked Oct 11 '14 21:10




1 Answer

You could paste the genres together with an "or" | separator and run that through grepl as a single regular expression.

x <- c("Action", "Adventure", "Animation", ...)
grepl(paste(x, collapse = "|"), my_text)

Here's an example.

x <- c("Action", "Adventure", "Animation")
my_text <- c("This one has Animation.", "This has none.", "Here is Adventure.")
grepl(paste(x, collapse = "|"), my_text)
# [1]  TRUE FALSE  TRUE
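
A fuller sketch of the same idea, assuming the 22 genres listed in the question. The names genres and sample_text, and the sample strings themselves, are made up for illustration. None of these genres contains a regex metacharacter (a hyphen is literal outside a bracket expression), so the pasted pattern is safe here. If the terms could contain characters such as . or (, one option is to match each term literally with fixed = TRUE and combine the results, as in the second approach.

```r
# All genres from the question, stored once in a character vector.
genres <- c("Action", "Adventure", "Animation", "Biography", "Comedy",
            "Crime", "Documentary", "Drama", "Family", "Fantasy",
            "Film-Noir", "History", "Horror", "Music", "Musical",
            "Mystery", "Romance", "Sci-Fi", "Sport", "Thriller",
            "War", "Western")

# Hypothetical sample input, not taken from the original post.
sample_text <- c("Genres: Sci-Fi, Thriller",
                 "No genre listed",
                 "A Film-Noir classic")

# Approach 1: collapse the vector into a single alternation pattern.
pattern <- paste(genres, collapse = "|")
has_genre_regex <- grepl(pattern, sample_text)

# Approach 2: match each genre as a literal string (fixed = TRUE),
# then OR the resulting logical vectors together element by element.
has_genre_fixed <- Reduce(`|`, lapply(genres, grepl, x = sample_text, fixed = TRUE))

print(has_genre_regex)
print(has_genre_fixed)
```

Both approaches match substrings, so "War" would also match inside a longer word such as "Warning", and "Music" already covers every text that contains "Musical".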
Rich Scriven answered Sep 28 '22 11:09