Javascript regex invalid range in character class

I'm using a regex pattern that I got from regexlib to validate relative URLs. Their site lets you test the pattern to make sure it fits your needs. Everything works great there, but as soon as I use the pattern on my own site I get the error message:

Invalid range in character class

I know that this error usually means that a hyphen is mistakenly being used to represent a range and is not properly escaped. But in this case, since it works on their site, I'm confused about why it's not working on mine.

var urlRegex = new RegExp('^(?:(?:\.\./)|/)?(?:\w(?:[\w`~!$=;\-\+\.\^\(\)\|\{\}\[\]]|(?:%\d\d))*\w?)?(?:/\w(?:[\w`~!$=;\-\+\.\^\(\)\|\{\}\[\]]|(?:%\d\d))*\w?)*(?:\?[^#]+)?(?:#[a-z0-9]\w*)?$', 'g');

NOTE: If you're going to test the regex on the regexlib site, be sure to change the Regex Engine dropdown to Client-side Engine and the Engine dropdown to Javascript.

bflemi3 asked May 15 '13 18:05


2 Answers

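Either put - at the beginning or end of the character class, or use two backslashes so that the regex escape survives inside the string.

Because you are passing the pattern as a string, every backslash in the regex has to be written as two backslashes: the string literal consumes the first one before the regex engine ever sees the pattern.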
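A minimal sketch of why the string form fails, assuming a standard JavaScript engine; compiles is a hypothetical helper name, and the two short patterns are cut down from the character class in the question. In a string literal, \- is just -, so the class reaches the regex engine as [;-+], a range that runs backwards.

```javascript
// Hypothetical helper: reports whether a pattern string compiles as a regex.
function compiles(pattern) {
  try {
    new RegExp(pattern);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// The string literal drops the single backslashes before the regex engine runs.
const single = '[;\-\+]';    // string value is [;-+]   -> ';' to '+' is a backwards range
const doubled = '[;\\-\\+]'; // string value is [;\-\+] -> '-' and '+' are literals

console.log(single);            // [;-+]
console.log(compiles(single));  // false
console.log(compiles(doubled)); // true
```

Logging the pattern string before passing it to the constructor is a quick way to see what the regex engine actually receives.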


NOTE

Check out this answer on SO, which explains when to use single or double backslashes to escape special characters.

Anirudha answered Nov 14 '22 23:11


There is no reason to use the RegExp constructor here. Just use a RegExp literal:

var urlRegex = /^(?:(?:\.\.\/)|\/)?(?:\w(?:[\w`~!$=;\-\+\.\^\(\)\|\{\}\[\]]|(?:%\d\d))*\w?)?(?:\/\w(?:[\w`~!$=;\-\+\.\^\(\)\|\{\}\[\]]|(?:%\d\d))*\w?)*(?:\?[^#]+)?(?:#[a-z0-9]\w*)?$/g;
               ^           ^   ^                                                               ^                                                                                     ^

Inside a RegExp literal, you write the regex naturally, except for /, which now needs escaping because / is the delimiter of the literal.
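As a rough illustration of the difference between the two forms, the sketch below writes one shortened, hypothetical pattern (not the full URL regex) both ways and checks that they behave the same on a few sample inputs. The names fromLiteral, fromString and inputs are made up for this example.

```javascript
// Shortened, hypothetical pattern: optional "../" or "/" prefix, then word characters.
const fromLiteral = /^(?:\.\.\/|\/)?\w+$/;             // "/" is escaped, one backslash per escape
const fromString = new RegExp('^(?:\\.\\./|/)?\\w+$'); // "/" is plain, two backslashes per escape

const inputs = ['../foo', '/foo', 'foo', './foo'];
const literalResults = inputs.map(s => fromLiteral.test(s));
const stringResults = inputs.map(s => fromString.test(s));

console.log(literalResults); // [ true, true, true, false ]
console.log(stringResults);  // [ true, true, true, false ]
```

The constructor form is still the one to reach for when the pattern has to be assembled at run time; for a fixed pattern, the literal avoids the doubled backslashes.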

In a character class, ^ has special meaning at the beginning of the class, - has special meaning between 2 characters, and \ has special meaning: it escapes other characters (mainly ^, -, [, ] and \) and introduces shorthand character classes (\d, \s, \w, ...). [ and ] delimit the character class, so they also have special meaning. (Actually, in JavaScript, only ] has special meaning, and you can write [ inside a character class without escaping it.) Apart from the 5 characters listed above, no other character has any special meaning inside a character class, unless it is part of an escape sequence with \.
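The rules above can be checked with a few one-line tests. This is a small sketch using made-up classes such as [a^] and [ac-], assuming regexes without the u or v flag; each entry pairs an actual result with the value the rule predicts.

```javascript
// Each entry: [actual result, expected result, rule being illustrated].
const checks = [
  [/[^a]/.test('a'),  false, '^ at the start negates the class'],
  [/[a^]/.test('^'),  true,  '^ elsewhere is a literal'],
  [/[a-c]/.test('b'), true,  '- between two characters forms a range'],
  [/[ac-]/.test('b'), false, '- at the end does not form a range'],
  [/[ac-]/.test('-'), true,  '- at the end is a literal'],
  [/[[]/.test('['),   true,  '[ needs no escaping inside a class'],
  [/[\]]/.test(']'),  true,  '] must be escaped to be a literal'],
  [/[\\]/.test('\\'), true,  '\\ must be escaped to be a literal'],
];

const failures = checks.filter(([actual, expected]) => actual !== expected);
console.log(failures.length); // 0
```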

You can use the information above to reduce the number of escaping backslashes. For ^, unless it is the only character in the character class, you can move it away from the beginning of the class. For -, you can put it at the end of the class.

var urlRegex = /^(?:(?:\.\.\/)|\/)?(?:\w(?:[\w`~!$=;+.^()|{}\[\]-]|(?:%\d\d))*\w?)?(?:\/\w(?:[\w`~!$=;+.^()|{}\[\]-]|(?:%\d\d))*\w?)*(?:\?[^#]+)?(?:#[a-z0-9]\w*)?$/g;

What was changed:

[\w`~!$=;\-\+\.\^\(\)\|\{\}\[\]]
[\w`~!$=;+.^()|{}\[\]-]
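One way to gain some confidence that the two character classes accept the same characters is to compare them on a sample set. The sketch below anchors each class with ^ and $ so that it matches exactly one character; the names escaped, trimmed and samples are illustrative, and the sample list is a spot check, not an exhaustive proof.

```javascript
// The original class and the simplified one, each anchored to a single character.
const escaped = /^[\w`~!$=;\-\+\.\^\(\)\|\{\}\[\]]$/;
const trimmed = /^[\w`~!$=;+.^()|{}\[\]-]$/;

// Characters that should be accepted, followed by some that should be rejected.
const samples = ['a', '_', '7', '`', '~', '!', '$', '=', ';', '-', '+', '.', '^',
                 '(', ')', '|', '{', '}', '[', ']',
                 '#', '?', '/', '%', ' ', ','];

// Collect every sample on which the two classes give different answers.
const disagreements = samples.filter(ch => escaped.test(ch) !== trimmed.test(ch));
console.log(disagreements.length); // 0
```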

nhahtdh answered Nov 14 '22 22:11