I'm trying to execute this code:
    import re

    pattern = r"(\w+)\*([\w\s]+)*/$"
    re_compiled = re.compile(pattern)
    results = re_compiled.search('COPRO*HORIZON 2000 HOR')
    print(results.groups())
But Python does not respond. The process takes 100% of the CPU and does not stop. I've tried this both on Python 2.7.1 and Python 3.2 with identical results.
Your regex runs into catastrophic backtracking because you have nested quantifiers (([...]+)*). Since your regex requires the string to end in / (which fails on your example), the regex engine tries every possible way of dividing up the string in the vain hope of finding a matching combination. That's where it gets stuck.
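If you want to see the blow-up for yourself without hanging your interpreter, here is a rough sketch that times the original pattern on short, made-up inputs. The helper name time_failed_search and the "A*BBB..." test strings are my own assumptions, not anything from the question, and the timings will differ from machine to machine.

```python
import re
import time

# The pattern from the question, with its nested quantifier ([\w\s]+)*
BAD_PATTERN = re.compile(r"(\w+)\*([\w\s]+)*/$")


def time_failed_search(n):
    """Search a non-matching input that has n characters after the '*'.

    Hypothetical helper: the input never contains '/', so the search
    can only fail after the engine has exhausted its backtracking.
    """
    text = "A*" + "B" * n
    start = time.perf_counter()
    result = BAD_PATTERN.search(text)
    elapsed = time.perf_counter() - start
    return result, elapsed


if __name__ == "__main__":
    # Keep n small: every extra character should roughly double the work.
    for n in (10, 12, 14, 16, 18):
        result, elapsed = time_failed_search(n)
        print(f"n={n:2d}  match={result}  seconds={elapsed:.4f}")
```

The elapsed time should grow by roughly a constant factor for every character added, which is why inputs that are only a little longer can appear to never finish.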
To illustrate, let's assume "A*BCD" as the input to your regex and see what happens:

1. (\w+) matches A. Good.
2. \* matches *. Yay.
3. [\w\s]+ matches BCD. OK.
4. / fails to match (no characters left to match). OK, let's back up one character.
5. / fails to match D. Hum. Let's back up some more.
6. [\w\s]+ matches BC, and the repeated [\w\s]+ matches D.
7. / fails to match. Back up.
8. / fails to match D. Back up some more.
9. [\w\s]+ matches B, and the repeated [\w\s]+ matches CD.
10. / fails to match. Back up again.
11. / fails to match D. Back up some more, again.
12. [\w\s]+ matches B, repeated [\w\s]+ matches C, repeated [\w\s]+ matches D? No? Let's try something else.
13. [\w\s]+ matches BC. Let's stop here and see what happens.
14. / still doesn't match D.
15. [\w\s]+ matches B.
16. / doesn't match C.
17. The whole group (...)* is optional, so let's skip it entirely.
18. / still doesn't match B.

Now that was a string of just three letters. Yours had about 30, and trying every way of splitting those up would keep your computer busy until the end of days.
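To put a number on the walk-through above, here is a small sketch that lists every way of cutting a string into consecutive non-empty chunks, which is what the repeated [\w\s]+ group ends up trying. The function name splits is my own, and this only models the engine's search rather than reproducing its exact order of attempts.

```python
def splits(s):
    """Yield every way to cut s into consecutive non-empty chunks."""
    if not s:
        yield []
        return
    for i in range(1, len(s) + 1):
        # Take the first i characters as one chunk, then split the rest.
        for rest in splits(s[i:]):
            yield [s[:i]] + rest


# The three letters after the '*' in "A*BCD"
for chunks in splits("BCD"):
    print(chunks)

# Each of the n - 1 gaps between characters is either a cut or not,
# so a string of n characters has 2 ** (n - 1) such splits.
print(2 ** (30 - 1))
```

The engine also tries the splits of every shorter prefix (as in the steps where / is tested against D, C and B), so the real number of attempts is larger still.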
I suppose what you're trying to do is to get the strings before and after the *, in which case, use

    pattern = r"(\w+)\*([\w\s]+)$"
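As a quick check, here is a minimal sketch that runs the simplified pattern on the string from the question. The second, non-matching input is made up for illustration; it is there only to show that a failed search no longer explodes once the nested quantifier is gone.

```python
import re

# Simplified pattern: no nested quantifier and no trailing '/'
FIXED_PATTERN = re.compile(r"(\w+)\*([\w\s]+)$")

match = FIXED_PATTERN.search("COPRO*HORIZON 2000 HOR")
print(match.groups())  # ('COPRO', 'HORIZON 2000 HOR')

# Hypothetical non-matching input: the trailing '/' makes the search fail,
# but with a single quantifier the engine gives up quickly.
no_match = FIXED_PATTERN.search("A*" + "B" * 200 + "/")
print(no_match)  # None
```

Always check the result of search() for None before calling groups() on it, since a failed search returns None rather than a match object.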