I have a list of strings that I want to filter for strings containing certain keywords.
I want to do something like:
fruit = re.compile(['apple', 'banana', 'peach', 'plum', 'pineapple', 'kiwi'])
so I can then use re.search(fruit, list_of_strings) to get only the strings containing fruits, but I'm not sure how to use a list with re.compile. Any suggestions? (I'm not set on using re.compile, but I think regular expressions would be a good way to do this.)
Python's re.compile() method compiles a regular expression pattern, provided as a string, into a regex pattern object (re.Pattern). We can then use this pattern object to search for a match inside different target strings using regex methods such as match() or search().
The re.compile() method: we can compile a regular expression pattern into a pattern object, which can then be used for pattern matching. This also lets us search with the same pattern again without rewriting it.
The re.sub() function replaces occurrences of a pattern in a string with a replacement. It takes as input the pattern to replace, the replacement, and the string to operate on.
The re.match() function only checks for a match at the beginning of the string. For a multi-line string, this means re.match() can find a match at the start of the first line but not on any later line; if nothing matches at the start, it returns None.
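A minimal sketch of the difference between match(), search(), and sub() on a compiled pattern. The sample string and variable names are made up for illustration.

```python
import re

pattern = re.compile('banana')
text = 'apple\nbanana'  # hypothetical sample; 'banana' is on the second line

# match() only tries at position 0 of the string, so it finds nothing here.
at_start = pattern.match(text)

# search() scans the whole string and finds 'banana' on the second line.
anywhere = pattern.search(text)

# sub() replaces every occurrence of the pattern with the replacement.
replaced = pattern.sub('kiwi', text)

print(at_start)          # None
print(anywhere.group())  # banana
print(repr(replaced))    # 'apple\nkiwi'
```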
You need to turn your fruit list into the string apple|banana|peach|plum|pineapple|kiwi
so that it is a valid regex. The following should do this for you:
import re
fruit_list = ['apple', 'banana', 'peach', 'plum', 'pineapple', 'kiwi']
fruit = re.compile('|'.join(fruit_list))
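To filter the list itself, note that search() works on one string at a time, so it has to be applied to each element. A small sketch, assuming a made-up list_of_strings:

```python
import re

fruit_list = ['apple', 'banana', 'peach', 'plum', 'pineapple', 'kiwi']
fruit = re.compile('|'.join(fruit_list))

# Hypothetical input, made up for illustration.
list_of_strings = ['I ate an apple', 'no fruit here', 'kiwi smoothie']

# Keep only the strings in which the pattern matches somewhere.
matches = [s for s in list_of_strings if fruit.search(s)]
print(matches)  # ['I ate an apple', 'kiwi smoothie']
```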
Edit: As ridgerunner pointed out in the comments, you will probably want to add word boundaries to the regex. Otherwise it will match words like plump, since they have a fruit as a substring.
fruit = re.compile(r'\b(?:%s)\b' % '|'.join(fruit_list))
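A quick sketch of how the word-boundary version behaves on a few made-up strings. The re.escape() call and the re.IGNORECASE variant are additions that are not part of the answer above: escaping only matters if a keyword contains regex metacharacters, and the flag is one possible way to accept capitalized keywords.

```python
import re

fruit_list = ['apple', 'banana', 'peach', 'plum', 'pineapple', 'kiwi']

# Assumption: escape each keyword so characters like '.' or '+' are taken literally.
escaped = [re.escape(word) for word in fruit_list]
fruit = re.compile(r'\b(?:%s)\b' % '|'.join(escaped))

# Hypothetical input, made up for illustration.
list_of_strings = ['a plump cat', 'plum jam', 'pineapples', 'Peach pie']

# 'plump' and 'pineapples' fail the trailing \b; 'Peach' differs in case.
matches = [s for s in list_of_strings if fruit.search(s)]
print(matches)  # ['plum jam']

# Optional variant: ignore case so 'Peach' also counts.
fruit_ci = re.compile(fruit.pattern, re.IGNORECASE)
matches_ci = [s for s in list_of_strings if fruit_ci.search(s)]
print(matches_ci)  # ['plum jam', 'Peach pie']
```

Note that the word boundaries also exclude plurals such as pineapples, so the keyword list would need those forms if they should match.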