I tried this command in the Python console:
re.match('^\<.+\>([\w\s-,]+)\<.+\>$', 'Carrier-A')
and I got:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/re.py", line 141, in match
    return _compile(pattern, flags).match(string)
  File "/usr/lib/python2.7/re.py", line 251, in _compile
    raise error, v # invalid expression
sre_constants.error: bad character range
but when I use:
re.match('^\<.+\>([\w\s,-]+)\<.+\>$', 'Carrier-A')
no error is raised.
What should I keep in mind about character sequences inside square brackets?
A dash - has a special meaning when used within square brackets []: it defines a range of characters. For example, [\s-,] asks for "any character from \s to ,", which is not a valid range because \s is a shorthand for a whole set of whitespace characters rather than a single character. However, the dash loses its special meaning when it is either the first or the last character inside the square brackets. That is why your second regex is accepted.
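As a minimal sketch of the rule described above (assuming Python 3 and raw strings; the helper name `compiles` is made up for this illustration), the same character class can be compiled with the dash in the middle, at the end, and at the start:

```python
import re

def compiles(pattern):
    """Return True if re accepts the pattern, False if it raises re.error."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Dash between \s and , is parsed as a range, which re rejects.
middle_ok = compiles(r'[\w\s-,]+')
# Dash as the last character of the class is a literal dash.
last_ok = compiles(r'[\w\s,-]+')
# Dash as the first character of the class is a literal dash too.
first_ok = compiles(r'[-\w\s,]+')

print(middle_ok, last_ok, first_ok)
```

Only the first pattern should be rejected; the other two differ from it solely in where the dash sits.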
Inside a character class, the character - specifies a range of characters, and the range is ordered by the characters' code points (their ASCII numbers, for ASCII characters). So the left endpoint must not have a higher code point than the right endpoint, and both endpoints must be single characters. Whenever your regex does not meet these criteria, Python raises that error. In this case the range is meaningless, since it is \s-, which would mean "any character between whitespace and a comma", and \s stands for a whole class of characters rather than a single one.
If you want to use the hyphen literally, you have two options in Python. The first is escaping it with a backslash, as in [\w\s\-,]. The second is putting it at the start or the end of the character class, as you did with [\w\s,-].
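The escaping option and the ordering rule can be checked with a short sketch. It assumes Python 3 and raw strings, and the input string `sample` is a hypothetical one, chosen so that the <...> parts of the pattern have something to match:

```python
import re

# Option 1: escape the hyphen so it is always read as a literal.
escaped = re.compile(r'^<.+>([\w\s\-,]+)<.+>$')

# Hypothetical input; the bare 'Carrier-A' from the question has no
# surrounding <...> parts, so the full pattern could not match it.
sample = '<td>Carrier-A</td>'
match = escaped.match(sample)
captured = match.group(1) if match else None
print(captured)

# A range whose endpoints are in descending order is rejected as well.
try:
    re.compile(r'[z-a]')
    reversed_ok = True
except re.error as exc:
    reversed_ok = False
    print('rejected:', exc)

# The same endpoints in ascending order compile without complaint.
ascending = re.compile(r'[a-z]+')
```

Escaping keeps the hyphen literal wherever it sits in the class, which is safer if more characters are added to the class later.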
Read more http://www.regular-expressions.info/charclass.html