
Which regular expression algorithm does PHP use?

I recently read this article about two different types of regular expression algorithms (Perl 5.8.7 and Thompson NFA). According to the article, the latter is ~1,000,000 times faster than the former. I use PHP daily, and use regex quite a lot, so I wanted to know which algorithm PHP uses.

I found this question, but it only covers JavaScript. One of the answers states that JavaScript uses the Thompson NFA algorithm, though that will of course vary from implementation to implementation. I think PHP may have switched to a faster algorithm when it moved to its PCRE set of functions and deprecated the ereg_* functions.

I've looked at the PHP PCRE documentation and, as far as I can see, it says nothing about which algorithm it uses. The acronym PCRE stands for Perl Compatible Regular Expressions, so I assume it uses the Perl-style algorithm.

Which regular expression algorithm does PHP use? Is it "Perl 5.8.7 style", or does it use the much faster Thompson NFA algorithm, or another one entirely? Could it even use a Perl backend to run its expressions?

If PHP does use a Perl style algorithm, what exactly is it? I'm looking for an abstract definition/explanation in relation to other algorithms.

Bojangles asked Apr 18 '12 22:04


1 Answer

From the manual:

http://www.php.net/pcre:

Regular Expressions (Perl-Compatible)

http://www.php.net/manual/en/intro.pcre.php:

The PCRE library is a set of functions that implement regular expression pattern matching using the same syntax and semantics as Perl 5, with just a few differences (see below). The current implementation corresponds to Perl 5.005.
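As far as I know, PCRE's standard matching function is a backtracking matcher, i.e. the same family as the Perl algorithm the question mentions, not a Thompson NFA simulation. To make the difference between the two families concrete, here is a minimal sketch. It is written in Python for brevity and is not PHP's or PCRE's actual code. It only supports patterns made of literal characters that are either required or optional (`x` or `x?`), and all function names (`pathological_tokens`, `backtrack_match`, `nfa_match`) are made up for this illustration. The test pattern is the classic worst case for backtracking: n optional `a`s followed by n required `a`s, matched against a string of n `a`s.

```python
# Toy comparison of two matching strategies for patterns built only from
# required or optional literal characters. Illustrative sketch, NOT PCRE code.

def pathological_tokens(n):
    """Pattern a?a?...a?aa...a (n optional 'a's, then n required 'a's)."""
    return [("a", True)] * n + [("a", False)] * n


def backtrack_match(tokens, text):
    """Greedy recursive backtracking. Returns (matched, number_of_calls)."""
    calls = 0

    def go(i, j):
        nonlocal calls
        calls += 1
        if i == len(tokens):
            return j == len(text)  # whole pattern used: must be at end of text
        ch, optional = tokens[i]
        # First choice: consume one character with this token.
        if j < len(text) and text[j] == ch and go(i + 1, j + 1):
            return True
        # Second choice (optional tokens only): skip the token.
        return optional and go(i + 1, j)

    return go(0, 0), calls


def _closure(tokens, positions):
    """Add every position reachable by skipping optional tokens."""
    seen, stack = set(), list(positions)
    while stack:
        i = stack.pop()
        if i in seen:
            continue
        seen.add(i)
        if i < len(tokens) and tokens[i][1]:
            stack.append(i + 1)
    return seen


def nfa_match(tokens, text):
    """Thompson-style simulation: track the SET of possible pattern positions.

    Returns (matched, number_of_state_steps).
    """
    steps = 0
    states = _closure(tokens, {0})
    for c in text:
        advanced = set()
        for i in states:
            steps += 1
            if i < len(tokens) and tokens[i][0] == c:
                advanced.add(i + 1)
        states = _closure(tokens, advanced)
    return len(tokens) in states, steps


if __name__ == "__main__":
    for n in (4, 8, 12, 16):
        tokens = pathological_tokens(n)
        _, calls = backtrack_match(tokens, "a" * n)
        _, steps = nfa_match(tokens, "a" * n)
        print(f"n={n}: backtracking calls={calls}, NFA steps={steps}")
```

The backtracking version tries each combination of "consume" and "skip" one at a time, so its work grows exponentially with n on this input. The set-based version visits each pattern position at most once per input character, so its work is bounded by (pattern length) x (text length). The trade-off is that features such as backreferences, which Perl-compatible syntax supports, do not fit the set-of-states model.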

Jonathan Kuhn answered Oct 07 '22 22:10