
Should I read from ARGV in my own Perl module?

My Perl script extracts and processes data from one or more log files, currently processing all files in @ARGV.

The most important part of this script is the log decoding itself, which incorporates a lot of knowledge about the log file format. This transformation from log lines into an array of hashes has proven to be subject to change (as the log format evolves) and to be the basis for further processing steps: there are often specific questions to answer from the decoded records, which is best done right in Perl. That's why I'm thinking of making it a module.

The core function uses nested (or, if you prefer, scoped) pattern matching sitting in a while (<>) loop:

while (<ARGV>) {
    $totalLines ++;
    if (m/^(\d{4}-\d\d-\d\d \d\d:\d\d:\d\d) L(\d) (.+)/) {
        my $time = $1;
        my $line = $2;
        my $event = $3;
        if ($event =~ m/^connect: (.+)$/) {
            $pendings{$line}{station} = $1;
            ...

...more than 200 lines follow before the closing braces.

I have the feeling that simply reading from ARGV would violate the "do one thing and do it well" rule. When I searched the web, I found nothing that speaks explicitly for or against reading from ARGV in a module, but maybe my search terms were just poor. [1][2]

(How) should I re-frame my decoding for placing it into a module?
...or should I change my feelings about this?


[1] perltrap - perldoc.perl.org
[2] perlmodstyle - perldoc.perl.org

Wolf asked Mar 10 '23 14:03


2 Answers

You can make your function unaware of the <ARGV> iteration logic by passing it an iterator:

sub foo {
    my ($iter) = @_;

    # Unlike `while (<ARGV>)`, this loop needs an explicit `defined()` test
    while (defined (my $line = $iter->())) {
        # if ..
    }
}

foo(sub{ scalar <ARGV> }); # force scalar context; one line/record per call
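
One benefit of the iterator approach is that the decoder can be fed from something other than files, for example in a test. The following is only a minimal sketch: the name decode_log, the record keys (time, line, event) and the sample lines are assumptions made for illustration, and only the header pattern is taken from the question.

```perl
use strict;
use warnings;

# Hypothetical decoder: pulls lines from an iterator and returns a
# reference to an array of hashes, one per recognised log line.
sub decode_log {
    my ($iter) = @_;
    my @records;
    while (defined(my $line = $iter->())) {
        # Same header pattern as in the question.
        if ($line =~ m/^(\d{4}-\d\d-\d\d \d\d:\d\d:\d\d) L(\d) (.+)/) {
            push @records, { time => $1, line => $2, event => $3 };
        }
    }
    return \@records;
}

# Array-backed iterator: shift returns undef once the array is empty,
# which ends the loop inside decode_log. The sample data is made up.
my @sample = (
    "2023-03-10 14:03:00 L1 connect: alpha\n",
    "not a log line\n",
    "2023-03-10 14:03:05 L2 connect: beta\n",
);
my $records = decode_log(sub { shift @sample });

printf "%s L%s %s\n", @{$_}{qw(time line event)} for @$records;
```

The script that processes real files would keep calling the decoder with sub { scalar <ARGV> }, so only the caller knows where the lines come from.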

mpapec answered Mar 15 '23 00:03


I would write it such that it accepts any file handle. Then, you could use \*ARGV as the argument.

Also, don't clobber your caller's $_. $_ is often aliased to other variables (which would have far-reaching consequences) and constants (which would cause your code to fail). Use your own lexically-scoped variable instead (or at least add local *_; first).
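
Both points can be combined in one function. This is a rough sketch, not the asker's actual decoder: the name decode_fh, the record keys and the in-memory sample text are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical decoder that reads from whatever file handle it is given.
sub decode_fh {
    my ($fh) = @_;
    my @records;
    # A lexical $line instead of $_, so the caller's $_ is left alone.
    while (defined(my $line = <$fh>)) {
        if ($line =~ m/^(\d{4}-\d\d-\d\d \d\d:\d\d:\d\d) L(\d) (.+)/) {
            push @records, { time => $1, line => $2, event => $3 };
        }
    }
    return \@records;
}

# In-memory file handle with made-up sample data.
my $text = "2023-03-10 14:03:00 L1 connect: alpha\nnoise\n";
open(my $fh, '<', \$text) or die "Cannot open in-memory handle: $!";

my ($result, $seen);
for ('caller value') {    # $_ is aliased to a constant here
    $result = decode_fh($fh);
    $seen   = $_;         # unchanged, because decode_fh never assigns to $_
}
close $fh;

print scalar(@$result), " record(s); caller's \$_ was '$seen'\n";

# A script reading the files named on the command line would call:
#   my $records = decode_fh(\*ARGV);
```

Had decode_fh used a bare while (<$fh>) loop, it would have tried to assign to the constant aliased by $_ in the caller's for loop.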

ikegami answered Mar 15 '23 01:03