
Finding files with Perl

Tags: find, perl

File::Find and the wanted subroutine

This question is much simpler than the original title ("prototypes and forward declaration of subroutines"!) lets on. I'm hoping the answer, however simple, will help me understand subroutines/functions, prototypes and scoping and the File::Find module.

With Perl, subroutines can appear pretty much anywhere and you normally don't need to make forward declarations (except if the sub declares a prototype, which I'm not sure how to do in a "standard" way in Perl). For what I usually do with Perl there's little difference between these different ways of running somefunction:

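sub somefunction;  # Forward-declares the function
&somefunction;     # Calls it, sharing the caller's current @_
somefunction();    # Calls it with an empty argument list
somefunction;      # Bareword call; only compiles under `strict subs` because of the forward declaration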
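The call forms are not quite interchangeable, though. As a minimal sketch (the body given to somefunction and the caller_sub wrapper are made up for illustration), this shows the one difference that tends to bite: &somefunction; without parentheses hands the caller's current @_ to the function, while somefunction(); passes an empty list.

```perl
use strict;
use warnings;

# Hypothetical body for illustration: report how many arguments arrived.
sub somefunction { return scalar @_ }

# Hypothetical wrapper, so that there is a caller with a non-empty @_.
sub caller_sub {
    my $with_amp    = &somefunction;     # no parens: sees caller_sub's @_
    my $with_parens = somefunction();    # explicit empty argument list
    return ( $with_amp, $with_parens );
}

my ( $with_amp, $with_parens ) = caller_sub( 'a', 'b', 'c' );
print "&somefunction;  saw $with_amp argument(s)\n";
print "somefunction(); saw $with_parens argument(s)\n";
```

Called this way, the ampersand form should see the three arguments given to caller_sub and the parenthesised form should see none.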

I often use find2perl to generate code which I crib/hack into parts of scripts. This could well be bad style and now my dirty laundry is public, but so be it :-) For File::Find the wanted function is a required subroutine - find2perl creates it and adds sub wanted; to the resulting script it creates. Sometimes, when I edit the script I'll remove the "sub" from sub wanted and it ends up as &wanted; or wanted();. But without the sub wanted; forward declaration form I get this warning:

Use of uninitialized value $_ in lstat at findscript.pl line 29

My question is: why does this happen and is it a real problem? It is "just a warning", but I want to understand it better.

  • The documentation and code say $_ is localized inside of sub wanted {}. Why would it be undefined if I use wanted(); instead of sub wanted;?
  • Is wanted using prototypes somewhere? Am I missing something obvious in File/Find.pm?
  • Is it because wanted returns a code reference? (???)

My guess is that the forward declaration form "initializes" wanted in some way so that the first use doesn't have an empty default variable. I guess this would be how prototypes - even Perl prototypes, such as they exist - would work as well. I tried grepping through the Perl source code to get a sense of what sub is doing when a function is called using sub function instead of function(), but that may be beyond me at this point.

Any help deepening (and speeding up) my understanding of this is much appreciated.

EDIT: Here's a recent example script here on Stack Overflow that I created using find2perl's output. If you remove the sub from sub wanted; you should get the same error.

EDIT: As I noted in a comment below (but I'll flag it here too): for several months I've been using Path::Iterator::Rule instead of File::Find. It requires perl >5.10, but I never have to deploy production code at sites with odd, "never upgrade", 5.8.*-only policies, so Path::Iterator::Rule has become one of those modules I never want to do without. Path::Class is also useful. Cheers.

G. Cito asked Jul 19 '13 20:07



2 Answers

I'm not a big fan of File::Find. It just doesn't work right. The find function doesn't return a list of files, so you either have to use a non-local array variable in your wanted subroutine to capture the list of files you've found (not good), or place your entire program in your wanted subroutine (even worse). Plus, the separate subroutine means that your logic is separated from your find call. It's just ugly.

What I do is inline my wanted subroutine inside my find call. The subroutine stays with the find. Plus, my non-local array variable is now just part of my find call and doesn't look so bad.

Here's how I handle File::Find, assuming I want files that have a .pl suffix:

use File::Find;

my @file_list;
find ( sub {
    return unless -f;       #Must be a file
    return unless /\.pl$/;  #Must end with `.pl` suffix
    push @file_list, $File::Find::name;
}, $directory );

# At this point, @file_list contains all of the files I found.

This is exactly the same as:

my @file_list;

find ( \&wanted, $directory );

sub wanted {
    return unless -f;
    return unless /\.pl$/;
    push @file_list, $File::Find::name;
}

# At this point, @file_list contains all of the files I found.

Inlining just looks nicer. It also keeps my code together. Plus, my non-local array variable doesn't look so freaky.
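If even that array bothers you, one option is to wrap the inlined find in a small function so the array becomes a lexical that only the closure can see, and the caller simply gets a list back. This is only a sketch: find_files is a made-up helper name, not part of File::Find, and the scratch directory is there purely so the example runs on its own.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical helper (not part of File::Find): return the plain files
# under $dir whose base name matches $pattern.
sub find_files {
    my ( $dir, $pattern ) = @_;
    my @found;    # lexical, captured by the anonymous sub below
    find( sub {
        return unless -f;            # must be a file
        return unless /$pattern/;    # base name in $_ must match
        push @found, $File::Find::name;
    }, $dir );
    return @found;
}

# Scratch directory so the sketch is self-contained.
my $directory = tempdir( CLEANUP => 1 );
for my $name (qw(one.pl two.pl notes.txt)) {
    open my $fh, '>', "$directory/$name" or die "Cannot create $name: $!";
    close $fh;
}

my @file_list = sort( find_files( $directory, qr/\.pl$/ ) );
print "$_\n" for @file_list;
```

The wanted logic still lives right next to the find call, but nothing outside find_files can touch the array.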

I also like taking advantage of the shorter syntax in this particular way. Normally, I don't like using the implicit $_, but in this case, it makes the code much easier to read. My original wanted is the same as this:

sub wanted {
    my $file_name = $_;
    if ( -f $file_name and $file_name =~ /\.pl$/ ) {
        push @file_list, $File::Find::name;
    }
}

File::Find isn't that tricky to use. You just have to remember:

  • When you find a file you don't want, you use return to go to the next file.
  • $_ contains the file name without the directory, and you can use that for testing the file.
  • The file's full name is $File::Find::name.
  • The file's directory is $File::Find::dir.

And, the easiest way is to push the files you want into an array, and then use that array later in your program.
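To see those variables side by side, here is a small sketch that builds a scratch tree and records $_, $File::Find::dir and $File::Find::name for each file found. The directory layout and file names are invented for the example.

```perl
use strict;
use warnings;
use File::Find;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Scratch tree: one file at the top level, one in a subdirectory.
my $root = tempdir( CLEANUP => 1 );
make_path("$root/lib");
for my $path ( "$root/script.pl", "$root/lib/notes.txt" ) {
    open my $fh, '>', $path or die "Cannot create $path: $!";
    close $fh;
}

my @records;
find( sub {
    return unless -f;    # skip directories, go on to the next entry
    push @records, {
        basename => $_,                   # file name without the directory
        dir      => $File::Find::dir,     # directory being visited
        name     => $File::Find::name,    # directory and file name together
    };
}, $root );

for my $r ( sort { $a->{name} cmp $b->{name} } @records ) {
    print "basename=$r->{basename} dir=$r->{dir} name=$r->{name}\n";
}
```

With the default options, each name should be the matching dir and basename joined by a slash.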

David W. answered Nov 16 '22 02:11


Removing the sub from sub wanted; just makes it a call to the wanted function, not a forward declaration.

However, the wanted function hasn't been designed to be called directly from your code; it's been designed to be called by File::Find. File::Find does useful stuff like populating $_ before calling it.

There's no need to forward-declare wanted here, but if you want to remove the forward declaration, remove the whole sub wanted; line - not just the word sub.
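Here is a rough sketch of the difference, using a scratch directory and a cut-down wanted (both invented for the example; a real find2perl script will look different). It collects warnings rather than printing them, calls wanted() directly once, and then lets find call it.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Scratch directory holding one file, so the sketch is self-contained.
my $dir = tempdir( CLEANUP => 1 );
open my $fh, '>', "$dir/example.pl" or die "Cannot create file: $!";
close $fh;

# Collect warnings instead of printing them, so they can be counted.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my @seen;

sub wanted;    # forward declaration: declares the name, runs nothing

{
    local $_;     # $_ is undefined, as it is at the top of a fresh script
    wanted();     # plain call: lstat sees an undefined $_ and warns
}
my $warnings_from_direct_call = @warnings;

find( \&wanted, $dir );    # File::Find sets $_ before every call
my $warnings_from_find = @warnings - $warnings_from_direct_call;

# Cut-down stand-in for the subroutine find2perl would generate.
sub wanted {
    my @stat = lstat $_;
    push @seen, $_ if defined $_;
}

print "direct call: $warnings_from_direct_call warning(s)\n";
print "via find():  $warnings_from_find warning(s)\n";
print "seen: @seen\n";
```

The direct call should account for the single "uninitialized value" warning, while the calls made by find should produce none.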

tobyink answered Nov 16 '22 00:11