
What is "sre_constants.error: nothing to repeat"?

Tags: python, regex

I'm having a problem with a seemingly simple Python regular expression.

import re

# Goal: match a sentence such as "mark has wonderful kittens, but they're mischievous.."
p = re.compile("*kittens*")

This will fail with the error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.7/re.py", line 190, in compile
    return _compile(pattern, flags)
  File "/usr/lib64/python2.7/re.py", line 242, in _compile
    raise error, v # invalid expression
sre_constants.error: nothing to repeat

I'm probably missing something quite simple; regular expressions are certainly not one of my strengths!

Ricky Hewitt asked Sep 12 '12 14:09



2 Answers

You're confusing regular expressions with globs.

You mean:

p = re.compile(".*kittens.*")

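Note that a bare asterisk doesn't mean the same thing in a regular expression as it does in a glob. In a glob, * by itself matches any run of characters; in a regular expression it repeats the preceding token, so it needs something in front of it (here, ".").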
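
To make the glob/regex difference concrete, here is a small sketch that runs the glob through the standard fnmatch module and the regex through re. The sample sentence is the one from the question; the variable names are only illustrative.

```python
import fnmatch
import re

text = "mark has wonderful kittens, but they're mischievous.."

# Glob: a bare * matches any run of characters on its own.
glob_hit = fnmatch.fnmatchcase(text, "*kittens*")

# Regex: * repeats the preceding token, so "." gives it something to repeat.
regex_hit = re.match(".*kittens.*", text) is not None

# re.search scans the whole string, so the surrounding ".*" is not needed there.
search_hit = re.search("kittens", text) is not None

print(glob_hit, regex_hit, search_hit)
```

If all you need is "does the string contain kittens", re.search with the plain pattern is probably the simplest of the three.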

unwind answered Oct 17 '22 14:10


* is a metacharacter, meaning "0 or more of the preceding token", and there is nothing to repeat for the first *.
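
A minimal sketch of that failure, assuming Python 3, where the exception is exposed as re.error (the traceback in the question shows the Python 2.7 name, sre_constants.error). try_compile is a hypothetical helper written for this example, not part of the re module. The re.escape line is an aside for the case where you actually want literal asterisks.

```python
import re

def try_compile(pattern):
    """Hypothetical helper: return the compiled pattern, or the error text if invalid."""
    try:
        return re.compile(pattern)
    except re.error as exc:
        return str(exc)

# Invalid: the first * has no preceding token to repeat.
leading_star = try_compile("*kittens*")

# re.escape backslashes the metacharacters, so this matches a literal "*kittens*".
escaped_star = try_compile(re.escape("*kittens*"))

print(leading_star)
print(escaped_star.pattern)
```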

Perhaps you're looking for word boundaries:

p = re.compile(r"\bkittens\b")

\b ensures that only entire words are matched (so this regex would not match, ahem, "kittenshit").
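
As a quick check of the word-boundary behaviour, this sketch runs the pattern above against the sentence from the question and against the run-together word; the variable names are illustrative.

```python
import re

p = re.compile(r"\bkittens\b")

# "kittens" is followed by a comma here, so both word boundaries are present.
whole_word = p.search("mark has wonderful kittens, but they're mischievous..")

# Here "kittens" runs straight into more letters, so the trailing \b fails.
inside_word = p.search("kittenshit")

print(whole_word.group() if whole_word else None)
print(inside_word)
```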

Tim Pietzcker answered Oct 17 '22 12:10