Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Can Perl be "statically" parsed?

An article called "Perl cannot be parsed, a formal proof" is doing the rounds. So, does Perl decide the meaning of its parsed code at "run-time" or "compile-time"?

In some discussions I've read, I get the impression the arguments stem from imprecise terminology, so please try to define your technical terms in your answer. I have deliberately not defined "run-time", "statically" or "parsed" so that I can get perspectives from people who perhaps define those terms differently to me.

Edit:

This isn't about static analysis. Its a theoretical question about Perl's behaviour.

like image 522
Paul Biggar Avatar asked Aug 14 '09 23:08

Paul Biggar


3 Answers

Perl has a well-defined "compile time" phase, which is followed by a well-defined "runtime" phase. However, there are ways of transitioning from one to the other. Many dynamic languages have eval constructs that allow compilation of new code during the runtime phase; in Perl the inverse is possible as well -- and common. BEGIN blocks (and the implicit BEGIN block caused by use) invoke a temporary runtime phase during compile-time. A BEGIN block is executed as soon as it's compiled, instead of waiting for the rest of the compilation unit (i.e. current file or current eval) to compile. Since BEGINs run before the code that follows them is compiled, they can influence the compilation of the following code in practically any way (although in practice the main things they do are to import or define subroutines, or to enable strictness or warnings).

A use Foo; is basically equivalent to BEGIN { require foo; foo->import(); }, with require being (like eval STRING) one of the ways to invoke compile-time from runtime, meaning that we're now within compile-time within runtime within compile-time and the whole thing is recursive.

Anyway, what it boils down to for the decidability of parsing Perl is that since the compilation of one bit of code can be influenced by the execution of a preceding piece of code (which can in theory do anything), we've got ourselves a halting-problem type situation; the only way to correctly parse a given Perl file in general is by executing it.

like image 149
hobbs Avatar answered Nov 11 '22 20:11

hobbs


Perl has BEGIN blocks, which runs user Perl code at compile-time. This code can affect the meaning of other code to be compiled, thus making it "impossible" to parse Perl.

For example, the code:

sub foo { return "OH HAI" }

is "really":

BEGIN {
    *{"${package}::foo"} = sub { return "OH HAI" };
}

That means that someone could write Perl like:

BEGIN {
    print "Hi user, type the code for foo: ";
    my $code = <>;
    *{"${package}::foo"} = eval $code;
}

Obviously, no static analysis tool can guess what code the user is going to type in here. (And if the user says sub ($) {} instead of sub {}, it will even affect how calls to foo are interpreted throughout the rest of the program, potentially throwing off the parsing.)

The good news is that the impossible cases are very corner-casey; technically possible, but almost certainly useless in real code. So if you are writing a static analysis tool, this will probably cause you no trouble.

To be fair, every language worth its salt has this problem, or something similar. As an example, throw your favorite code walker at this Lisp code:

(iter (for i from 1 to 10) (collect i))

You probably can't predict that this is a loop that produces a list, because the iter macro is opaque and would require special knowledge to understand. The reality is that this is annoying in theory (I can't understand my code without running it, or at least running the iter macro, which may not ever stop running with this input), but very useful in practice (iteration is easy for the programmer to write and the future programmer to read).

Finally, a lot of people think that Perl lacks static analysis and refactoring tools, like Java has, because of the relative difficulty in parsing it. I doubt this is true, I just think the need is not there and nobody has bothered to write it. (People do need a "lint", so there is Perl::Critic, for example.)

Any static analysis I have needed to do of Perl to generate code (some emacs macros for maintaining test counters and Makefile.PL) has worked fine. Could weird corner cases throw off my code? Of course, but I don't go out of my way to write code that's impossible to maintain, even though I could.

like image 29
jrockway Avatar answered Nov 11 '22 21:11

jrockway


People have used a lot of words to explain various phases, but it's really a simple matter. While compiling Perl source, the perl intrepreter may end up running code that changes how the rest of the code will parse. Static analysis, which runs no code, will miss this.

In that Perlmonks post, Jeffrey talks about his articles in The Perl Review that go into much more detail, including a sample program that doesn't parse the same way every time you run it.

like image 5
brian d foy Avatar answered Nov 11 '22 21:11

brian d foy