Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Perl Challenge - Directory Iterator

You sometimes hear it said about Perl that there might be 6 different ways to approach the same problem. Good Perl developers usually have well-reasoned insights for making choices between the various possible methods of implementation.

So an example Perl problem:

A simple script which recursively iterates through a directory structure, looking for files which were modified recently (after a certain date, which would be variable). Save the results to a file.

The question, for Perl developers: What is your best way to accomplish this?

like image 422
keparo Avatar asked Oct 02 '08 10:10

keparo


3 Answers

This sounds like a job for File::Find::Rule:

#!/usr/bin/perl
use strict;
use warnings;
use autodie;  # Causes built-ins like open to succeed or die.
              # You can 'use Fatal qw(open)' if autodie is not installed.

use File::Find::Rule;
use Getopt::Std;

use constant SECONDS_IN_DAY => 24 * 60 * 60;

our %option = (
    m => 1,        # -m switch: days ago modified, defaults to 1
    o => undef,    # -o switch: output file, defaults to STDOUT
);

getopts('m:o:', \%option);

# If we haven't been given directories to search, default to the
# current working directory.

if (not @ARGV) {
    @ARGV = ( '.' );
}

print STDERR "Finding files changed in the last $option{m} day(s)\n";


# Convert our time in days into a timestamp in seconds from the epoch.
my $last_modified_timestamp = time() - SECONDS_IN_DAY * $option{m};

# Now find all the regular files, which have been modified in the last
# $option{m} days, looking in all the locations specified in
# @ARGV (our remaining command line arguments).

my @files = File::Find::Rule->file()
                            ->mtime(">= $last_modified_timestamp")
                            ->in(@ARGV);

# $out_fh will store the filehandle where we send the file list.
# It defaults to STDOUT.

my $out_fh = \*STDOUT;

if ($option{o}) {
    open($out_fh, '>', $option{o});
}

# Print our results.

print {$out_fh} join("\n", @files), "\n";
like image 142
pjf Avatar answered Sep 21 '22 09:09

pjf


Where the problem is solved mainly by standard libraries use them.

File::Find in this case works nicely.

There may be many ways to do things in perl, but where a very standard library exists to do something, it should be utilised unless it has problems of it's own.

#!/usr/bin/perl

use strict;
use File::Find();

File::Find::find( {wanted => \&wanted}, ".");

sub wanted {
  my (@stat);
  my ($time) = time();
  my ($days) = 5 * 60 * 60 * 24;

  @stat = stat($_);
  if (($time - $stat[9]) >= $days) {
    print "$_ \n";
  }
}
like image 36
Philip Reynolds Avatar answered Sep 22 '22 09:09

Philip Reynolds


There aren't six ways to do this, there's the old way, and the new way. The old way is with File::Find, and you already have a couple of examples of that. File::Find has a pretty awful callback interface, it was cool 20 years ago, but we've moved on since then.

Here's a real life (lightly amended) program I use to clear out the cruft on one of my production servers. It uses File::Find::Rule, rather than File::Find. File::Find::Rule has a nice declarative interface that reads easily.

Randal Schwartz also wrote File::Finder, as a wrapper over File::Find. It's quite nice but it hasn't really taken off.

#! /usr/bin/perl -w

# delete temp files on agr1

use strict;
use File::Find::Rule;
use File::Path 'rmtree';

for my $file (

    File::Find::Rule->new
        ->mtime( '<' . days_ago(2) )
        ->name( qr/^CGItemp\d+$/ )
        ->file()
        ->in('/tmp'),

    File::Find::Rule->new
        ->mtime( '<' . days_ago(20) )
        ->name( qr/^listener-\d{4}-\d{2}-\d{2}-\d{4}.log$/ )
        ->file()
        ->maxdepth(1)
        ->in('/usr/oracle/ora81/network/log'),

    File::Find::Rule->new
        ->mtime( '<' . days_ago(10) )
        ->name( qr/^batch[_-]\d{8}-\d{4}\.run\.txt$/ )
        ->file()
        ->maxdepth(1)
        ->in('/var/log/req'),

    File::Find::Rule->new
        ->mtime( '<' . days_ago(20) )
        ->or(
            File::Find::Rule->name( qr/^remove-\d{8}-\d{6}\.txt$/ ),
            File::Find::Rule->name( qr/^insert-tp-\d{8}-\d{4}\.log$/ ),
        )
        ->file()
        ->maxdepth(1)
        ->in('/home/agdata/import/logs'),

    File::Find::Rule->new
        ->mtime( '<' . days_ago(90) )
        ->or(
            File::Find::Rule->name( qr/^\d{8}-\d{6}\.txt$/ ),
            File::Find::Rule->name( qr/^\d{8}-\d{4}\.report\.txt$/ ),
        )
        ->file()
        ->maxdepth(1)
        ->in('/home/agdata/redo/log'),

) {
    if (unlink $file) {
        print "ok $file\n";
    }
    else {
        print "fail $file: $!\n";
    }
}

{
    my $now;
    sub days_ago {
        # days as number of seconds
        $now ||= time;
        return $now - (86400 * shift);
    }
}
like image 33
dland Avatar answered Sep 22 '22 09:09

dland