I'm having a problem with a seemingly simple Python regular expression.
# e.g. If I wanted to find "mark has wonderful kittens, but they're mischievous.."
p = re.compile("*kittens*")
This will fail with the error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.7/re.py", line 190, in compile
    return _compile(pattern, flags)
  File "/usr/lib64/python2.7/re.py", line 242, in _compile
    raise error, v # invalid expression
sre_constants.error: nothing to repeat
I'm probably missing something quite simple; regular expressions are certainly not among my strengths!
re.match() and re.search() both return a match object for the first match found in the string, but re.match() only tries to match at the beginning of the string. If the substring occurs only somewhere in the middle of the string, re.match() returns None, whereas re.search() still finds it.
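To make the difference concrete, here is a small sketch using the sentence from the question. The variable names are just for illustration.

```python
import re

text = "mark has wonderful kittens, but they're mischievous.."

# re.match() only tries to match at the very start of the string.
at_start = re.match("kittens", text)

# re.search() scans the string for the first position where the pattern matches.
anywhere = re.search("kittens", text)

print(at_start)          # None
print(anywhere.group())  # kittens
print(anywhere.start())  # 19
```

So if you only care whether "kittens" appears anywhere, re.search() with the plain pattern is probably all you need.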
If you want to replace text that matches a regular expression (regex) rather than an exact string, use sub() from the re module. In re.sub(), pass the regex pattern as the first argument, the replacement string as the second, and the string to be processed as the third.
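A minimal sketch of that argument order, again on the sentence from the question. The pattern and replacement word are made up for the example.

```python
import re

text = "mark has wonderful kittens, but they're mischievous.."

# Arguments: pattern, replacement, string to process.
# "kittens?" matches "kitten" or "kittens".
result = re.sub(r"kittens?", "puppies", text)

print(result)  # mark has wonderful puppies, but they're mischievous..
```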
You can use re.escape(). re.escape(string) returns the string with all non-alphanumerics backslashed (newer Python versions escape only characters that are special in a regular expression); this is useful if you want to match an arbitrary literal string that may have regular expression metacharacters in it.
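For instance, if the asterisks in the question were meant literally, something along these lines should work. The sample sentence is invented for the example.

```python
import re

literal = "*kittens*"  # contains the metacharacter *

# re.escape() backslashes the asterisks so they match themselves.
escaped = re.escape(literal)
pattern = re.compile(escaped)

match = pattern.search("I love *kittens* a lot")

print(escaped)        # \*kittens\*
print(match.group())  # *kittens*
```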
You're confusing regular expressions with globs.
You mean:
p = re.compile(".*kittens.*")
Note that a bare asterisk doesn't mean the same in an RE as it does in a glob expression.
* is a metacharacter meaning "0 or more of the preceding token", and there is nothing to repeat for the first *.
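Here is a rough side-by-side of the glob and regex readings of the same pattern. I'm using fnmatchcase from the standard fnmatch module for the glob side, which is my choice for the illustration and not something from the question.

```python
import re
from fnmatch import fnmatchcase

text = "mark has wonderful kittens, but they're mischievous.."

# As a glob, * means "any run of characters", so this matches.
glob_hit = fnmatchcase(text, "*kittens*")

# As a regex, the leading * has no preceding token to repeat,
# so compiling the pattern raises re.error.
try:
    re.compile("*kittens*")
    compiled = True
except re.error:
    compiled = False

# The regex spelling of the glob uses a dot for "any character".
regex_hit = re.match(".*kittens.*", text) is not None

print(glob_hit, compiled, regex_hit)  # True False True
```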
Perhaps you're looking for word boundaries:
p = re.compile(r"\bkittens\b")
\b ensures that only entire words are matched (so this regex would fail on, ahem, "kittenshit").
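A quick check of the word-boundary behaviour. The second test string is a made-up word with "kittens" embedded in it.

```python
import re

p = re.compile(r"\bkittens\b")

# "kittens" stands alone here (the comma after it is a word boundary).
whole_word = p.search("mark has wonderful kittens, but they're mischievous..")

# Here "kittens" has word characters on both sides, so \b cannot match.
embedded = p.search("superkittensXL")

print(whole_word.group())  # kittens
print(embedded)            # None
```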