Python multiple repeat Error

Tags: python, regex

2 Answers

The problem is that, in a non-raw string, \" is ".

You get lucky with all of your other unescaped backslashes: \s is the same as \\s, not s; \( is the same as \\(, not (; and so on. But you should never rely on getting lucky, or assume that you know the whole list of Python escape sequences by heart.

Either print out your string and escape the backslashes that get lost (bad), escape all of your backslashes (OK), or just use raw strings in the first place (best).
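As a minimal sketch of the difference (the variable names are only for illustration, not from the question), comparing lengths shows where the backslash disappears:

```python
# In a normal string literal, \" is an escape sequence for a bare quote.
plain = "\""
# In a raw string literal, the backslash is kept as its own character.
raw = r"\""
# Doubling the backslash by hand gives the same result as the raw string.
escaped = "\\\""

print(len(plain), len(raw), len(escaped))  # 1 2 2
print(raw == escaped)                      # True
print("\\s" == r"\s")                      # True
```

The raw form is the one that stays readable once the pattern grows.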

That being said, your regexp as posted won't match some expressions that it should, but it will never raise that "multiple repeat" error. Clearly, your actual code is different from the code you've shown us, and it's impossible to debug code we can't see.

Now that you've shown a real reproducible test case, that's a separate problem.

You're searching for terms that may have special regexp characters in them, like this:

    term = 'lg incite" OR author:"http++www.dealitem.com" OR "for sale'

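That p++ in the middle of a regexp means "1 or more of 1 or more of the letter p" (in other words, the same as "1 or more of the letter p") in some regexp languages, "always fail" in others, and "raise an exception" in others. Python's re falls into the last group, at least before Python 3.11; from 3.11 on, ++ is accepted as a possessive quantifier, so the pattern compiles but still does not match a literal "++". On an older Python, you can test this in isolation:

    >>> import re
    >>> re.compile('p++')
    error: multiple repeat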

If you want to put random strings into a regexp, you need to call re.escape on them.
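Here is a small sketch of what re.escape buys you. The helper names and the sample text are made up for illustration. The unescaped search either raises re.error or, on Pythons that treat ++ as a possessive quantifier, quietly fails to match, so the helper reports both cases as None:

```python
import re

term = "http++www.dealitem.com"
text = "posted by http++www.dealitem.com yesterday"

def search_unescaped(pattern, haystack):
    """Use the term as a pattern as-is; report a compile error as None."""
    try:
        return re.search(pattern, haystack)
    except re.error as exc:
        print("re.error:", exc)
        return None

def search_literal(literal, haystack):
    """Escape the term first so every character matches itself."""
    return re.search(re.escape(literal), haystack)

print(search_unescaped(term, text))          # no usable match
print(search_literal(term, text).group())    # the literal term
```

Escaping only the user-supplied part keeps the hand-written prefix and suffix patterns working as regexps.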

One more problem (thanks to Ωmega):

. in a regexp means "any character" (by default, any character except a newline). So ,|.|;|: (I've just extracted a short fragment of your longer alternation chain) means "a comma, or any character, or a semicolon, or a colon", which is the same as "any character". You probably wanted to escape the dot, as \. instead.
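A quick way to see the difference, as a sketch using only the short fragment rather than the full pattern:

```python
import re

# The fragment as written: the bare dot matches any single character.
unescaped = re.compile(r",|.|;|:")
# The fragment with the dot escaped: only the four punctuation marks match.
escaped = re.compile(r",|\.|;|:")

for ch in [",", ".", ";", ":", "x", "7"]:
    print(repr(ch), bool(unescaped.fullmatch(ch)), bool(escaped.fullmatch(ch)))
```

The first column of results is True for every character, while the second is True only for the punctuation.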

Putting all three fixes together:

    import re

    term = 'lg incite" OR author:"http++www.dealitem.com" OR "for sale'
    regexPart1 = r"\s"
    regexPart2 = r"(?:s|'s|!+|,|\.|;|:|\(|\)|\"|\?+)?\s"
    p = re.compile(regexPart1 + re.escape(term) + regexPart2, re.IGNORECASE)

As Ωmega also pointed out in a comment, you don't need to use a chain of alternations if they're all one character long; a character class will do just as well, more concisely and more readably.
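One possible character-class version, as a sketch: suffix_class and build are names I made up, and the rewrite is my reading of the suggestion, not code from the question. The single-character alternatives move into a class, where the dot and parentheses need no escaping, and the loop checks that both suffixes agree on a few sample endings:

```python
import re

term = 'lg incite" OR author:"http++www.dealitem.com" OR "for sale'

# Suffix written as a chain of alternations.
suffix_alternation = r"(?:s|'s|!+|,|\.|;|:|\(|\)|\"|\?+)?\s"
# Hypothetical rewrite: one-character alternatives folded into a class.
suffix_class = r"(?:s|'s|!+|\?+|[,.;:()\"])?\s"

def build(suffix):
    """Compile whitespace + escaped term + optional suffix + whitespace."""
    return re.compile(r"\s" + re.escape(term) + suffix, re.IGNORECASE)

tails = ["", "s", "'s", "!!", ",", ".", ";", ":", "(", ")", '"', "??", "x"]
for tail in tails:
    sample = " " + term + tail + " "
    old = bool(build(suffix_alternation).search(sample))
    new = bool(build(suffix_class).search(sample))
    print(repr(tail), old, new)
```

Both columns should agree for every tail; only the "x" ending is rejected.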

And I'm sure there are other ways this could be improved.

answered Sep 28 '22 15:09

abarnert

The other answer is great, but I would like to point out that using regular expressions to find strings in other strings is not the best way to go about it. In Python, simply write:

    if term in string:
        pass  # do whatever
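A small sketch of that approach, with one caveat: in is case-sensitive and does not look at what surrounds the term, so it is not a drop-in replacement for a pattern that also checks whitespace and suffixes. The helper name and sample text here are made up for illustration:

```python
def contains_term(term, text):
    """Case-insensitive substring test; no regexp, so nothing to escape."""
    return term.casefold() in text.casefold()

term = 'lg incite" OR author:"http++www.dealitem.com" OR "for sale'
text = 'Query: "LG Incite" OR author:"http++www.dealitem.com" OR "for sale" today'

print(contains_term(term, text))
print(contains_term("p++x", text))
```

The first call finds the term despite the different capitalization; the second does not, because that substring never occurs.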

answered Sep 28 '22 17:09

Patrick

