I used the following function to find the exact match for words in a string.
import re

def exact_Match(str1, word):
    result = re.findall('\\b'+word+'\\b', str1, flags=re.IGNORECASE)
    if len(result)>0:
        return True
    else:
        return False

exact_Match(str1, word)
But I get a match for both "award" and "award-winning", when only "award-winning" should match in the following string.
str1 = "award-winning blueberries"
word1 = "award"
word2 = "award-winning"
How can I get re.findall to match whole words that contain hyphens and other punctuation?
Make your own word-boundary:
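import re

def exact_Match(phrase, word):
    # Treat whitespace or the start/end of the string as the word boundary
    b = r'(\s|^|$)'
    # re.search scans the whole phrase; re.match would only try position 0
    res = re.search(b + re.escape(word) + b, phrase, flags=re.IGNORECASE)
    return bool(res)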
Copy-pasted from here into my interpreter:
>>> str1 = "award-winning blueberries"
>>> word1 = "award"
>>> word2 = "award-winning"
>>> exact_Match(str1, word1)
False
>>> exact_Match(str1, word2)
True
Actually, the cast to bool is unnecessary if you only use the result in a condition: the result is either a match object (truthy) or None (falsy). The function is better off without it:
def exact_Match(phrase, word):
    b = r'(\s|^|$)'
    # Returns a match object (truthy) or None (falsy)
    return re.search(b + re.escape(word) + b, phrase, flags=re.IGNORECASE)
Note: exact_Match is pretty unconventional casing. PEP 8 recommends lowercase with underscores for function names, so just call it exact_match.
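The same whitespace-delimited boundary can also be written with lookarounds, which check the neighbouring characters without consuming them. This is only a minimal sketch, assuming words are separated by whitespace; exact_match_ws is a hypothetical name, not something from the question.

```python
import re

def exact_match_ws(phrase, word):
    """Return True if `word` appears in `phrase` bounded by whitespace or the string edges."""
    # (?<!\S): the match is not preceded by a non-whitespace character
    # (?!\S):  the match is not followed by a non-whitespace character
    # re.escape keeps characters such as '-' or '.' in `word` literal
    pattern = r'(?<!\S)' + re.escape(word) + r'(?!\S)'
    return re.search(pattern, phrase, flags=re.IGNORECASE) is not None

phrase = "fresh award-winning blueberries"
print(exact_match_ws(phrase, "award"))          # False
print(exact_match_ws(phrase, "award-winning"))  # True
print(exact_match_ws(phrase, "blueberries"))    # True
```

Under this assumption a word followed directly by punctuation (for example "blueberries." at the end of a sentence) would not match, so the boundary may need widening depending on the input.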