
Why does Perl's glob() function always return a file name when given a string with no globbing characters?

Tags: glob, perl

I gave a list of globs and one plain string to Perl's glob function. The globs were expanded as expected, but the plain string is always returned, as if a matching file had been found. For example:

$ ls
foo
$ perl -le '@files=glob("*bar"); print @files' ## prints nothing, as expected
$ perl -le '@files=glob("bar"); print @files'
bar

As you can see above, the second example prints bar even though no such file exists.

My first thought is that it behaves like the shell in that when no expansion is available, a glob (or something being treated as a glob) expands to itself. For example, in csh (awful as it is, this is what Perl's glob() function seems to be following, see the quote below):

% foreach n (*bar*)
foreach: No match.

% foreach n (bar)
foreach? echo $n
foreach? end
bar                     ## prints the string

However, according to the docs, glob should return filename expansions (emphasis mine):

In list context, returns a (possibly empty) list of filename expansions on the value of EXPR such as the standard Unix shell /bin/csh would do.

So why is it returning itself when there are no globbing characters in the argument passed to glob? Is this a bug or am I doing something wrong?

terdon asked Jul 06 '16 17:07


2 Answers

"I guess I expected Perl to be checking for file existence in the background."

Perl is checking for file existence:

$ strace perl -e'glob "foo"' 2>&1 | grep foo
execve("/home/mcarey/perl5/perlbrew/perls/5.24.0-debug/bin/perl", ["perl", "-eglob \"foo\""], [/* 39 vars */]) = 0
lstat("foo", {st_mode=S_IFREG|0664, st_size=0, ...}) = 0

"So why is it returning itself when there are no globbing characters in the argument passed to glob?"

Because that's what csh does. Perl's implementation of glob is based on glob(3) with the GLOB_NOMAGIC flag enabled:

GLOB_NOMAGIC

Is the same as GLOB_NOCHECK but it only appends the pattern if it does not contain any of the special characters *, ? or [. GLOB_NOMAGIC is provided to simplify implementing the historic csh(1) globbing behavior and should probably not be used anywhere else.

GLOB_NOCHECK

If pattern does not match any pathname, then glob() returns a list consisting of only pattern...

So, for a pattern like foo with no wildcards:

  • if a matching file exists, the filename expansion (foo) is returned
  • if no matching file exists, the pattern (foo) is returned

Since the filename expansion is the same as the pattern,

glob 'foo'

in list context will always return a list with the single element foo, whether the file foo exists or not.
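The difference between the two cases can be checked with a small script. This is only a sketch: it assumes a Perl whose built-in glob is backed by File::Glob (5.6 or later), and it works in a scratch directory (the names $orig and $dir are just for illustration) so that no file called bar can exist.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Work in an empty scratch directory so that "bar" cannot exist.
my $orig = getcwd();
my $dir  = tempdir(CLEANUP => 1);
chdir $dir or die "chdir $dir: $!";

my @wild    = glob("*bar");  # contains a wildcard, nothing matches
my @literal = glob("bar");   # no wildcard, and no such file

print "wildcard: (@wild)\n";     # wildcard: ()
print "literal:  (@literal)\n";  # literal:  (bar)

# Leave the scratch directory so it can be cleaned up at exit.
chdir $orig or die "chdir $orig: $!";
```

The wildcard pattern yields an empty list, while the literal pattern comes back unchanged, which is the GLOB_NOMAGIC behavior described above.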

ThisSuitIsBlackNot answered Sep 30 '22 09:09


When you use ? or * or [], only existing files or directories will be returned. When your pattern just has literal text or {}, all possible results will be returned. This exactly matches what csh does.

Often, people will do @results = grep -e, glob PATTERN because of this.
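As a rough illustration of that idiom, the sketch below creates a single file foo in a scratch directory and then filters the glob results with -e. The file names and the pattern are made up for the example; note that glob splits its argument on whitespace, so one string can carry several patterns.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

my $orig = getcwd();
my $dir  = tempdir(CLEANUP => 1);
chdir $dir or die "chdir $dir: $!";

# Create exactly one file, "foo".
open my $fh, '>', 'foo' or die "open foo: $!";
close $fh;

# Literal text and {} alternatives are returned whether or not they exist.
my @all = glob("foo bar {baz,qux}");

# Keep only the names that really exist on disk.
my @existing = grep { -e $_ } @all;

print "unfiltered: (@all)\n";       # unfiltered: (foo bar baz qux)
print "existing:   (@existing)\n";  # existing:   (foo)

chdir $orig or die "chdir $orig: $!";
```

Keep in mind that -e follows symbolic links, so a dangling symlink would be filtered out as well.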

Or you can use File::Glob::bsd_glob if you want more control over this. (Note that there is no additional overhead to doing this; since perl 5.6 when you use glob() perl quietly loads File::Glob and uses it.)
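A minimal sketch of the bsd_glob route follows. It assumes that passing an explicit flags argument replaces the csh-style defaults, so leaving GLOB_NOMAGIC out of the flags should stop non-matching literal patterns from being returned; the particular flag combination chosen here is just one plausible choice, not a recommendation from the answer.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);
use File::Glob qw(bsd_glob GLOB_BRACE GLOB_QUOTE GLOB_TILDE);

my $orig = getcwd();
my $dir  = tempdir(CLEANUP => 1);
chdir $dir or die "chdir $dir: $!";

# csh-like conveniences, but deliberately without GLOB_NOMAGIC.
my $flags = GLOB_BRACE() | GLOB_QUOTE() | GLOB_TILDE();

my @default = bsd_glob("bar");            # default flags include GLOB_NOMAGIC
my @strict  = bsd_glob("bar", $flags);    # explicit flags, no GLOB_NOMAGIC
my @braces  = bsd_glob("{a,b}", $flags);  # braces expand, but nothing exists

print "default: (@default)\n";  # default: (bar)
print "strict:  (@strict)\n";   # strict:  ()
print "braces:  (@braces)\n";   # braces:  ()

chdir $orig or die "chdir $orig: $!";
```

With the explicit flags, only names that exist are returned, so the extra grep -e pass is not needed for this case.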

ysth answered Sep 30 '22 07:09