Search strings using regular expression in Python

When I try to use a regular expression to find strings inside other strings, it does not work as expected. Here is an example:

import re
message = 'I really like beer, but my favourite beer is German beer.'
keywords = ['beer', 'german beer', 'german']

regex = re.compile("|".join(keywords))
regex.findall(message.lower())

Result:

['beer', 'beer', 'german beer']

But the expected result would be:

['beer', 'beer', 'german beer', 'german']

Another way to do that could be:

results = []
for k in keywords:
    regex = re.compile(k)
    for r in regex.findall(message.lower()):
        results.append(r)

Result:

['beer', 'beer', 'beer', 'german beer', 'german']

It works like I want, but I think it is not the best way to do that. Can somebody help me?

What is a RegEx string in Python?

A RegEx, or Regular Expression, is a sequence of characters that forms a search pattern. RegEx can be used to check if a string contains the specified search pattern.

How do you use regular expressions in Python?

Python has a module named re to work with RegEx. Here's an example:

import re

pattern = '^a...s$'
test_string = 'abyss'
result = re.match(pattern, test_string)

if result:
    print("Search successful.")
else:
    print("Search unsuccessful.")

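re.findall cannot find overlapping matches: once "german beer" has matched, the search continues after it, so "german" alone is never reported. If you want to use regular expressions, you will have to create separate expressions and run them in a loop, as in your second example.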

Note that your second example can also be shortened to the following, though it's a matter of taste whether you find this more readable:

results = [r for k in keywords for r in re.findall(k, message.lower())]
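One caveat that the example above does not need, but that may matter for other keyword lists: if a keyword can contain regex metacharacters such as "." or "+", it would be interpreted as a pattern rather than as literal text. A minimal sketch, assuming you want every keyword matched literally, wraps each one in re.escape:

```python
import re

message = 'I really like beer, but my favourite beer is German beer.'
keywords = ['beer', 'german beer', 'german']

text = message.lower()

# re.escape makes each keyword match literally, even if it contains
# characters that would otherwise have a special meaning in a pattern.
results = [r for k in keywords for r in re.findall(re.escape(k), text)]
print(results)
```

For the keywords in the question this gives the same list as the loop version; the escaping only changes behaviour when a keyword contains special characters.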

Your specific example doesn't require the use of regular expressions. You should avoid using regular expressions if you just want to find fixed strings.
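As a rough sketch of what that could look like without the re module, the built-in str.count method can be used per keyword. The variable names follow the question; the approach itself is only one possible option:

```python
message = 'I really like beer, but my favourite beer is German beer.'
keywords = ['beer', 'german beer', 'german']

text = message.lower()

# str.count counts the occurrences of one fixed substring. Each keyword is
# counted separately, so 'german' is still found inside 'german beer'.
results = [k for k in keywords for _ in range(text.count(k))]
print(results)
```

If you only need to know how often each keyword occurs, text.count(k) on its own is enough and the list does not have to be built.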

The documentation at http://docs.python.org/2/library/re.html describes re.findall as follows:

"Return all non-overlapping matches of pattern in string..."

Non-overlapping means that for "german beer" it will not find both "german beer" and "german", because those two matches overlap.
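To make the overlap visible, here is a small sketch that uses re.finditer to print where each match starts and ends. The pattern is the one from the question; the variable names are only illustrative:

```python
import re

message = 'I really like beer, but my favourite beer is German beer.'
text = message.lower()
pattern = re.compile('beer|german beer|german')

# Each match object records where it starts and ends in the text.
matches = [(m.group(), m.start(), m.end()) for m in pattern.finditer(text)]
for found, start, end in matches:
    print(found, start, end)

# The scan resumes at the end of the previous match, so the characters of
# 'german beer' are already consumed and 'german' is not reported again.
last_found, last_start, last_end = matches[-1]
```

The last match covers the whole of "german beer", which is why neither "german" nor the third "beer" shows up as a separate result.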

Tags: python, string, regex, find

Adrian

2 Answers

Mark Byers

Omri Barel
