
Got `bad character range` in regex when using comma after dash but not reverse

I tried this command in the Python console:

re.match('^\<.+\>([\w\s-,]+)\<.+\>$', 'Carrier-A')

and I got:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/re.py", line 141, in match
    return _compile(pattern, flags).match(string)
  File "/usr/lib/python2.7/re.py", line 251, in _compile
    raise error, v # invalid expression
sre_constants.error: bad character range

but when I use:

re.match('^\<.+\>([\w\s,-]+)\<.+\>$', 'Carrier-A')

no error is raised.

What is it that I should consider about character sequences?

Zeinab Abbasimazar asked Dec 05 '22 15:12


2 Answers

A dash -, when used within square brackets [], has a special meaning: it defines a range of characters. For example, [\s-,] asks for "any character from \s to ,", which is not possible because \s is shorthand for a whole set of whitespace characters rather than a single character that could start a range. However, the dash loses its special meaning when it is either the first or the last character inside the square brackets. That is why your second regex works.
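To see the difference in isolation, here is a minimal sketch that compiles only the character classes from the question. It assumes Python 3 (the traceback in the question comes from Python 2.7, where the exception is the same re.error), and compiles() is a hypothetical helper name, not part of the re module.

```python
import re

def compiles(pattern):
    """Return True if re.compile accepts the pattern, False if it raises re.error."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Hyphen sits between \s and ',' so it is parsed as a range operator: rejected.
middle = compiles(r'[\w\s-,]+')
# Hyphen is the last character in the class, so it is a literal hyphen: accepted.
last = compiles(r'[\w\s,-]+')
# Hyphen is the first character in the class, so it is also a literal: accepted.
first = compiles(r'[-\w\s,]+')

print(middle, last, first)
```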

DYZ answered Jan 08 '23 07:01


Inside a character class, the character - specifies a range of characters, and the range is based on the characters' code points (their ASCII numbers, for ASCII characters). Both ends of a range must be single characters, and the left end must not have a higher code point than the right end. Whenever a regex does not meet these criteria, Python raises that error. In this case the range is meaningless: it is \s-, which would mean "any character between whitespace and a comma", and \s is a set of characters, not a single character.
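The ordering rule can be checked with ordinary single characters. This is a rough sketch assuming Python 3; range_is_valid() is a hypothetical helper written for this illustration, and it only accepts plain characters that need no escaping inside a class.

```python
import re

def range_is_valid(lo, hi):
    """Try to compile the class [lo-hi] and report whether re accepts it."""
    try:
        re.compile('[%s-%s]' % (lo, hi))
        return True
    except re.error:
        return False

# ord('a') is 97 and ord('z') is 122, so a-z is an ascending range.
ascending = range_is_valid('a', 'z')
# A range whose two ends are the same character is still accepted.
single = range_is_valid('a', 'a')
# z-a runs from a higher code point to a lower one, so it is rejected.
descending = range_is_valid('z', 'a')
# ord('+') is 43 and ord(',') is 44, so +-, is a valid (if odd-looking) range.
plus_to_comma = range_is_valid('+', ',')

print(ascending, single, descending, plus_to_comma)
```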

If you want to use the hyphen literally, you have two options in Python. The first is to escape it with a backslash, as in [\w\s\-,]. The second is to put it at the start or the end of the character class, as you did in [\w\s,-].
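As a quick check that these spellings behave the same, the sketch below matches one sample string against each form. It assumes Python 3.4 or later (for re.fullmatch); the sample string and the dictionary keys are made up for illustration.

```python
import re

# Three character classes that all treat the hyphen as a literal character.
classes = {
    'escaped': r'[\w\s\-,]+',   # hyphen escaped with a backslash
    'trailing': r'[\w\s,-]+',   # hyphen placed last in the class
    'leading': r'[-\w\s,]+',    # hyphen placed first in the class
}

sample = 'Carrier-A, B'  # contains word characters, a hyphen, a comma and a space

# Each class should match the whole sample string.
results = {name: re.fullmatch(pattern, sample) is not None
           for name, pattern in classes.items()}

# For contrast: the same class without any hyphen cannot match the sample.
without_hyphen = re.fullmatch(r'[\w\s,]+', sample) is not None

print(results, without_hyphen)
```

Raw strings (the r prefix) are used so that backslashes such as \w and \- reach the regex engine unchanged.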

Read more http://www.regular-expressions.info/charclass.html

Mazdak answered Jan 08 '23 07:01