
How do I fix "invalid group" error when attempting to use Gruber's "improved" URL matching regexp pattern in JavaScript?

I'm attempting to integrate John Gruber's "An Improved Liberal, Accurate Regex Pattern for Matching URLs" into one of my JavaScript scripts, but WebKit's inspector (in Google Chrome 5.0.375.125 for Mac) gives an "Invalid group" regular expression syntax error.

Gruber's original regexp is as follows:

(?i)\b((?:[a-z][\w-]+:(?:/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))

The line from my JavaScript w/the regexp is as follows (w/forward slashes backslash-escaped):

tweet_text = tweet_text.replace(/(?i)\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))/gi, '<a href="$1">$1</a>');

And the Google Chrome (V8?) error is as follows:

Uncaught SyntaxError: Invalid regular expression: /(?i)\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))/: Invalid group

And the Safari error is as follows:

SyntaxError: Invalid regular expression: unrecognized character after (?

Gruber claims the pattern should work in modern JavaScript regexp interpreters, which I'd assume includes WebKit's and V8. Does JavaScript's regexp syntax not support the (?: grouping syntax (damn Google for not indexing punctuation!)? Or did I just miss escaping something?

asked Aug 24 '10 17:08 by morgant


1 Answer

Gah, it was the mode modifier (i.e. the (?i)) at the beginning of the regex!

I went through Regular-Expressions.info's details on "JavaScript's Regular Expression Flavor", specifically the list of what's not supported, and there was the mode modifier. I had already specified case-insensitivity with the i flag after the closing forward slash of the regex, so the (?i) wasn't needed anyway. Ripped it out and all seems well.

So, my JavaScript regex is now as follows:

/\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))/gi
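
For anyone who wants to check this in their own engine, here is a minimal sketch of the difference. The compiles helper and the shortened www-only pattern are my own illustrations, not part of Gruber's regex; the point is only that a leading (?i) is rejected as a syntax error, while the i flag after the closing slash does the same job and (?: groups are fine.

```javascript
// Hypothetical helper: reports whether a pattern source compiles in this engine.
function compiles(source, flags) {
  try {
    new RegExp(source, flags);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e; // anything other than a syntax error is unexpected here
  }
}

// Leading inline mode modifier, as in the original PCRE-style pattern.
const withInlineModifier = compiles("(?i)\\bwww[.]", "g");

// Same pattern with case-insensitivity moved to the flags argument.
const withFlag = compiles("\\bwww[.]", "gi");

// Non-capturing groups are supported, so (?: was never the problem.
const nonCapturing = compiles("(?:www)", "");

console.log(withInlineModifier, withFlag, nonCapturing);

// Simplified stand-in for the full URL pattern (an assumption, for brevity),
// used the same way as in the question's replace() call.
const re = /\b(www[.][a-z0-9.\-]+[.][a-z]{2,4})/gi;
const linked = "See WWW.Example.COM today".replace(re, '<a href="$1">$1</a>');
console.log(linked);
```

The i flag applies to the whole pattern, which is what the leading (?i) was meant to do, so nothing is lost by dropping it.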
answered Sep 21 '22 17:09 by morgant