
How does Perl avoid shebang loops?

Tags:

perl

perl interprets the shebang itself and mimics the behavior of exec*(2). I think it emulates the Linux behavior of splitting on all whitespace instead of BSD first-whitespace-only thing, but never mind that.

As a quick demonstration, here is really_python.pl:

#!/usr/bin/env python

# the following line is correct Python but not correct Perl
from collections import namedtuple
print "hi"

prints hi when invoked as perl really_python.pl.

Also, the following programs will do the right thing regardless of whether they are invoked as perl program or ./program.

#!/usr/bin/perl
print "hi\n";

and

#!/usr/bin/env perl
print "hi\n";

I don't understand why the program doesn't loop forever. In either of the above cases, the shebang line either is or resolves to an absolute path to the perl interpreter. It seems like the next thing that should happen is that perl parses the file, notices the shebang, and delegates to the shebang path (in this case, itself). Does perl compare the shebang path to its own argv[0]? Does perl look at the shebang string and see if it contains "perl" as a substring?

I tried to use a symlink to trigger the infinite loop behavior I was expecting.

$ ln -s /usr/bin/perl /tmp/p

#!/tmp/p
print "hi\n";

but that program printed "hi" regardless of how it was invoked.

On OS X, however, I was able to trick perl into an infinite shebang loop with a script.

Contents of /tmp/pscript

#!/bin/sh
perl "$@"

Contents of the perl script

#!/tmp/pscript
print "hi\n";

and this does loop forever (on OS X; I haven't tested it on Linux yet).

perl is clearly going to a lot of trouble to handle shebangs correctly in reasonable situations. It isn't confused by symlinks and isn't confused by normal env stuff. What exactly is it doing?

Gregory Nisbet asked Jun 27 '16 17:06


1 Answer

The relevant code is in toke.c, the Perl lexer. If:

  • line 1 begins with #! (optionally preceded by whitespace) AND

  • does not contain the literal string "perl -" (perl, a space, then a dash) AND

  • does not contain perl (unless followed by a 6, i.e. perl6) AND

  • (on "DOSish" platforms) does not contain a case-insensitive match of perl (e.g. Perl) AND

  • does not contain indir AND

  • the -c flag was not set on the command line AND

  • argv[0] contains perl

the program following the shebang is executed with execv. Otherwise, the lexer just keeps going; perl doesn't exec itself.
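The conditions above can be sketched as a small predicate. This is an illustrative Python model of the list in this answer, not a translation of toke.c; the function name perl_would_exec and its parameters are hypothetical, and the dosish flag stands in for the compile-time "DOSish" platform check.

```python
def perl_would_exec(first_line, argv0, minus_c=False, dosish=False):
    """Return True if, per the conditions listed above, perl would
    execv the program named on the shebang line instead of lexing on."""
    s = first_line.lstrip()              # leading whitespace is allowed
    if not s.startswith("#!"):
        return False

    words = s[2:].split()
    if not words:                        # "#!" with no interpreter path
        return False
    ipath = words[0]                     # the interpreter path

    if "perl -" in s:                    # literal "perl -" anywhere
        return False

    i = s.find("perl")                   # first "perl" on the line...
    if i != -1 and s[i + 4:i + 5] != "6":
        return False                     # ...counts unless it is "perl6"

    if dosish and "perl" in ipath.lower():
        return False                     # case-insensitive match in the path

    if "indir" in s:
        return False
    if minus_c:                          # -c was given on the command line
        return False

    return "perl" in argv0               # perl must see "perl" in its own argv[0]


if __name__ == "__main__":
    print(perl_would_exec("#!/usr/bin/env python", "perl"))  # True
    print(perl_would_exec("#!/usr/bin/perl", "perl"))        # False
    print(perl_would_exec("#!foo", "foo"))                   # False
```

Under this model, the only thing that changes between the first and second invocation in a symlink setup is argv0, which is why that last condition matters so much.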

As a result, you can do some pretty weird things with the shebang without perl trying to exec another interpreter:

    #!     perl
#!foo perl
#!fooperlbar -p
#!perl 6
#!PeRl          # on Windows

Your symlink example meets all of the conditions listed above, so why isn't there an infinite loop? You can see what's going on with strace:

$ ln -s /usr/bin/perl foo
$ echo '#!foo' > bar
$ strace perl bar 2>&1 | grep exec
execve("/bin/perl", ["perl", "bar"], [/* 27 vars */]) = 0
execve("foo", ["foo", "bar"], [/* 27 vars */]) = 0

Perl actually does exec the link, but the new process has argv[0] set to foo, which doesn't contain perl. The last condition is therefore no longer met the second time around, and the loop ends.
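The same reasoning suggests why the wrapper script from the question never terminates: /tmp/pscript runs perl "$@", so each new perl process starts with argv[0] equal to perl again. The Python sketch below models both cases under simplifying assumptions: should_exec is a reduced form of the conditions (it ignores -c, perl6, and the DOSish rule), and trace and next_argv0 are hypothetical helpers that only track what argv[0] the next perl process would see.

```python
def should_exec(shebang, argv0):
    """Reduced form of the conditions: ignores -c, perl6, and DOSish rules."""
    return (shebang.startswith("#!")
            and "perl" not in shebang
            and "indir" not in shebang
            and "perl" in argv0)


def trace(shebang, next_argv0, max_steps=5):
    """Follow the chain of execs that starts with `perl script`.

    next_argv0(ipath) models the argv[0] of the perl process that ends up
    running after ipath is exec'd. Returns (argv0_history, terminated).
    """
    argv0 = "perl"
    history = [argv0]
    for _ in range(max_steps):
        if not should_exec(shebang, argv0):
            return history, True         # perl lexes the script itself
        ipath = shebang[2:].split()[0]
        argv0 = next_argv0(ipath)
        history.append(argv0)
    return history, False                # still exec'ing after max_steps


if __name__ == "__main__":
    # Symlink: execv passes the link name through as argv[0].
    print(trace("#!foo", lambda ipath: ipath))
    # Wrapper script: the shell runs `perl "$@"`, so argv[0] is "perl" again.
    print(trace("#!/tmp/pscript", lambda ipath: "perl"))
```

In the symlink case the history stops at foo after one exec; in the wrapper case argv[0] is perl at every step, so the model never reaches a state where perl stops exec'ing.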

ThisSuitIsBlackNot answered Nov 16 '22 04:11