Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does /^(.+)+Q$/.test("XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX") take so long?

When I run

/^(.+)+Q$/.test("XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX")

in Chrome or IE, it takes ~10 seconds to complete. (Firefox is able to evaluate it almost instantly.)

Why does it take so long? (And why/how is Firefox able to do it so quickly?)

(Of course, I'd never run this particular regex, but I'm hitting a similar issue with the URL regex at http://daringfireball.net/2010/07/improved_regex_for_matching_urls and it seems to boil down to this, i.e. there are certain URLs which will cause the browser to lock up)

For example:

var re = /\b((?:https?:\/\/|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))/i;
re.test("http://google.com/?q=(AAAAAAAAAAAAAAAAAAAAAAAAAAAAA")
like image 657
David Ingersol Avatar asked Apr 18 '12 21:04

David Ingersol


1 Answers

As indicated by thg435, it sounds like catastrophic back-tracking. There's an excellent article on this, Regular Expression Matching Can Be Simple And Fast.

It describes an efficient approach known as Thompson NFA. As noted, though, this does not support all features of modern regexes. For instance, it can't do backreferences. However, as suggested in the article:

"Even so, it would be reasonable to use Thompson's NFA simulation for most regular expressions, and only bring out backtracking when it is needed. A particularly clever implementation could combine the two, resorting to backtracking only to accommodate the backreferences."

I suspect Firefox may be doing this.

like image 111
Matthew Flaschen Avatar answered Feb 23 '23 11:02

Matthew Flaschen