Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Are Perl regexes turing complete?

I've seen Ruby and Perl programmers do some complicated code challenges entirely with regexes. The lookahead and lookbehind capabilities in Perl regexes make them more powerful than the regex implementations in most other languages. I was wondering how powerful they really are.

Is there an easy way to either prove or disprove that Perl regexes are Turing complete?

like image 589
Peter Olson Avatar asked Nov 02 '11 15:11

Peter Olson


People also ask

Is Turing a regex?

Programming Languages are typically defined as languages that are Turing Complete. Such languages must be able to process any computable function. Regex does not fit into this category.

Why is Perl good for regex?

In general, Perl uses a backtrack regex engine. Such an engine is flexible, easy to implement and very fast on a subset of regex. However, for other types of regex, for example when there is the | operator, it may become very slow. In the extreme case, its match speed is exponential in the length of the pattern.


2 Answers

Excluding any kind of embedded code, such as ?{ }, they probably don't cover all of context-free, much less Turing Machines. They might, but to my knowledge, nobody has actually proven it one way or another. Given that people have been trying to solve certain context-free problems with Perl regexes for a while and haven't come up with a solution yet, it's likely that they are not context-free.

There is an interesting discussion to be had about what features are merely convenient, and which actually add power. For instance, matching 0n*1*0n (that's notation for "any number of zeros, followed by a one, followed by the same number of zeros as before") is not something that can be done with pure regexes. You can prove this can't be done with regexes using the Pumping Lemma, but the simple, informal proof is that the regex would have to count an arbitrary number of zeros, and regexes can't do counting.

However, backreferences can match that with:

/(0*) 1 \1/x; 

So that means backreferences give you more power, and are not a mere convenience. What else might give us more power, I wonder?

Also, Perl6 "patterns" (they're not even pretending they're regexes anymore) are designed to look kinda like Perl5 regexes (so you don't need to relearn much), but they have enough features added to be fully context-free. They're actually designed so you can use them to alter the way the language is parsed within a lexical scope.

like image 114
frezik Avatar answered Sep 20 '22 17:09

frezik


There are at least two discussions: Turing completeness and regular expressions and Are Perl patterns universal? with further references.

The consensus (to my untrained eye) seems to be that the answer is "no", but I am not sure if I understand everything correctly.

like image 40
Sinan Ünür Avatar answered Sep 20 '22 17:09

Sinan Ünür