I am looking for a regular expression in Python. I have a long string of text and a list of substrings to match against it.
Example substrings: 'table', 'e furnish'. Example string:
'Today is a good day to do up the table furnishings. Lets go to the store.'
For 'table', I would like to extract 'table'. For 'e furnish', I would like to extract 'table furnishings'.
My current code is :
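    import re

    line = 'Today is a good day to do up the table furnishings. Lets go to the store.'
    checklist = ['table', 'e furnish']

    for item in checklist:
        pattern = r"[\s](.*)" + item + r"([a-z]){0,2}[\s\.]"
        print(pattern)
        matchObj = re.search(pattern, line)
        if matchObj:
            print("matchObj.group() : ", matchObj.group())
        else:
            print("No match!!")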
but I am not able to pick up the whole words that contain the substrings. The substrings can be a single word or several words, and each may match entire words or only parts of words. For substrings spanning several words, the extracted words must be adjacent, with no other word in between.
Thank you for your help, everyone.
Method #1: Using split(). The split() function splits the string into a list of words, and it is the most generic approach when whole words are needed. The drawback is that it fails when the string contains punctuation marks, because the punctuation stays attached to the neighbouring word.
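To illustrate the punctuation drawback, here is a minimal sketch using the example string from the question. The `re.findall` workaround is only one possible way around it, not something the original text prescribes, and the variable names are made up for illustration.

```python
import re

text = 'Today is a good day to do up the table furnishings. Lets go to the store.'

# Whitespace split keeps punctuation attached to the neighbouring word.
words = text.split()
last_word = words[-1]
found_plain = 'furnishings' in words

# One possible workaround: let a regex pick out runs of word characters instead.
clean_words = re.findall(r"\w+", text)
found_clean = 'furnishings' in clean_words

print(last_word, found_plain, found_clean)
```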
The fullmatch() function in Python: re.fullmatch() returns a match object if and only if the entire string matches the pattern. Otherwise, it returns None.
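A small sketch of the difference this makes, assuming a hypothetical pattern built around the substring 'table': re.fullmatch() only succeeds when the whole input is covered, whereas re.search() can find the pattern anywhere inside a longer string.

```python
import re

# Hypothetical pattern: the substring 'table' with optional word characters around it.
pattern = r"\w*table\w*"

# The entire input is one word containing 'table', so fullmatch succeeds.
whole = re.fullmatch(pattern, "tables")

# The space is not a word character, so the entire input cannot match.
partial = re.fullmatch(pattern, "table furnishings")

# search() is satisfied by a match anywhere in the input.
searched = re.search(pattern, "table furnishings")

print(whole, partial, searched)
```

For scanning a long line of text, as in the question, re.search() is therefore probably the better fit.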
Using index() + loop to extract the string between two substrings: get the indices of both substrings using index(), then loop over the positions between them to collect the required string.
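The index() + loop idea might look roughly like this. The helper name `between` and the two marker substrings are hypothetical, chosen only to fit the example string from the question. Note that str.index() raises ValueError when a marker is missing.

```python
def between(text, start, end):
    """Return the text found between the markers `start` and `end`."""
    # Position just after the first marker.
    i = text.index(start) + len(start)
    # Position of the second marker, searching only after the first one.
    j = text.index(end, i)
    # Loop over the positions in between and collect the characters.
    chars = []
    for k in range(i, j):
        chars.append(text[k])
    return ''.join(chars)

line = 'Today is a good day to do up the table furnishings. Lets go to the store.'
extracted = between(line, 'up the ', '. Lets')
print(extracted)
```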
You could use \w* (any number of word characters) as a wildcard on both sides of the substring:
\w*e furnish\w*
See demo at regex101
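Applied to the checklist from the question, the \w* approach could be sketched as follows. The helper name `expand_to_words` is made up for illustration, and wrapping the substring in re.escape() is an added assumption so that any regex metacharacters in a substring are treated literally.

```python
import re

line = 'Today is a good day to do up the table furnishings. Lets go to the store.'
checklist = ['table', 'e furnish']

def expand_to_words(substring, text):
    """Return the first match of `substring`, widened to whole words, or None."""
    # \w* on each side extends the match to the edges of the surrounding words.
    pattern = r"\w*" + re.escape(substring) + r"\w*"
    match = re.search(pattern, text)
    return match.group() if match else None

results = {item: expand_to_words(item, line) for item in checklist}
for item in checklist:
    print(repr(item), '->', repr(results[item]))
```

Because the substring itself is matched literally, a multi-word substring such as 'e furnish' can only match words that are directly adjacent, which is the requirement stated in the question.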