I'm not thrilled with the argument-passing architecture I'm evolving for the (many) Perl scripts we've developed to launch various Hadoop MapReduce jobs.
There are currently 8 scripts (of the form run_something.pl) that are run from cron. (And more are on the way ... we expect anywhere from 1 to 3 more for every function we add to Hadoop.) Each of these has about 6 identical command-line parameters, plus a couple of command-line parameters that are similar, all specified with Getopt::Euclid.
The implementations are in a dozen .pm modules, some of which are common and others unique.
Currently I'm passing the args globally to each module ...
Inside run_something.pl I have:
    set_common_args(%ARGV);
    set_something_args(%ARGV);
And inside Something.pm I have:
    sub set_something_args { (%MYARGS) = @_; }
So then I can do:
    if ( $MYARGS{'--needs_more_beer'} ) {
        $beer++;
    }
I'm seeing that I'm probably going to have additional "common" files that I'll want to pass args to, so I'll have three or four set_xxx_args calls at the top of each run_something.pl, and it just doesn't seem too elegant.
On the other hand, it beats passing the whole stupid argument array down the call chain, and choosing and passing individual elements down the call chain is (a) too much work (b) error-prone (c) doesn't buy much.
In lots of ways what I'm doing is just object-oriented design without the object-oriented language trappings, and it looks uglier without said trappings, but nonetheless ...
Anyone have thoughts or ideas?
If you want to use the two arguments as input files, you can just pass them in and then use <> to read their contents. Alternatively, @ARGV is a special variable that contains all the command-line arguments: $ARGV[0] is the first argument (i.e. "string1" in your case) and $ARGV[1] is the second.
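As a minimal sketch of the second approach, the snippet below picks the two arguments out by index. The helper name `first_two_args` is hypothetical and exists only so the logic can be exercised without touching the real @ARGV; a script would normally just read $ARGV[0] and $ARGV[1] directly.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first two arguments from a list.
# Taking the list as a parameter keeps the real @ARGV untouched.
sub first_two_args {
    my @args = @_;
    die "usage: $0 string1 string2\n" if @args < 2;
    return @args[0, 1];    # same elements as $ARGV[0] and $ARGV[1]
}

# In a script, pass @ARGV itself.
if (@ARGV >= 2) {
    my ($first, $second) = first_two_args(@ARGV);
    print "first=$first second=$second\n";
}
```

Invoked as `perl script.pl string1 string2`, the two values arrive in that order; any further arguments are simply ignored by this sketch.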
Perl command-line arguments are stored in the special array called @ARGV, which contains the arguments intended for the script. $#ARGV is generally the number of arguments minus one, because $ARGV[0] is the first argument, not the program's command name itself.
$ARGV contains the name of the current file when reading from <>. @ARGV contains the command-line arguments intended for the script. Note that $#ARGV is generally the number of arguments minus one, since $ARGV[0] is the first argument, NOT the command name.
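To make the distinction between $#ARGV, the argument count, and $ARGV concrete, here is a small sketch. The sub names `arg_counts` and `lines_by_file` are made up for illustration, and the temporary file only stands in for a file name that would normally come from the command line.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# $#array is the last index; scalar(@array) is the element count.
sub arg_counts {
    my @args = @_;
    return ($#args, scalar @args);
}

# Read every named file through <>, grouping lines by $ARGV,
# which holds the name of the file currently being read.
sub lines_by_file {
    return {} unless @_;      # with an empty @ARGV, <> would read STDIN
    local @ARGV = @_;         # restore the caller's @ARGV afterwards
    my %lines;
    while (my $line = <>) {
        chomp $line;
        push @{ $lines{$ARGV} }, $line;
    }
    return \%lines;
}

# Demo with a throwaway file standing in for a command-line argument.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "alpha\nbeta\n";
close $fh;

my ($last_index, $count) = arg_counts($path);
my $seen = lines_by_file($path);
print "last index $last_index, count $count\n";
print "$_: @{ $seen->{$_} }\n" for keys %$seen;
```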
In the same vein as Pedro's answer, but upgraded to use Moose and MooseX::Getopt, I present the SO community with... a Moose modulino*: a Moose module that can be included and run normally as a module, or separately as a command-line utility:
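    # this is all in one file, MyApp/Module/Foo.pm:
    package MyApp::Module::Foo;
    use Moose;
    # MooseX::Getopt is a role: consuming it provides new_with_options()
    with 'MooseX::Getopt';

    # each attribute becomes a command-line option of the same name
    has [ qw(my config args here) ] => (
        is => 'ro', isa => 'Int',
    );

    sub run { ... }

    package main;
    use strict;
    use warnings;

    sub run
    {
        my $module = MyApp::Module::Foo->new_with_options();
        $module->run();
    }

    # only run when executed as a script, not when loaded as a module
    run() unless caller();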
The module can be invoked using:
    perl MyApp/Module/Foo.pm --my 0 --config 1 --args 2 --here 3
Using this pattern, you can collect command-line arguments using one module, which is used by all other modules and scripts that share the same options, and use standard Moose accessor methods for retrieving those options.
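For readers who want to see the "parse once, read through accessors" idea without installing Moose, here is a rough sketch that swaps in the core Getopt::Long module instead of MooseX::Getopt. Everything named here is an assumption made for illustration: the package `MyApp::Options`, the constructor `new_from_args`, and the flags `--input_dir` and `--verbose` (only `--needs_more_beer` comes from the question). It is not the Moose approach itself, just the same shape in plain Perl.

```perl
package MyApp::Options;    # hypothetical shared options class
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Option specs shared by every run_something.pl script (assumed names).
my @COMMON_SPEC = ('input_dir=s', 'verbose!', 'needs_more_beer!');

# Parse an argument list once; a script may append its own specs.
sub new_from_args {
    my ($class, $args, @extra_spec) = @_;
    my %opt;
    my @copy = @$args;     # leave the caller's array untouched
    GetOptionsFromArray(\@copy, \%opt, @COMMON_SPEC, @extra_spec)
        or die "bad command-line options\n";
    return bless {%opt}, $class;
}

# Generate read-only accessors for the common options, so callers
# use methods rather than a global hash.
for my $name (qw(input_dir verbose needs_more_beer)) {
    no strict 'refs';
    *{ __PACKAGE__ . "::$name" } = sub { $_[0]{$name} };
}

package main;

# Each script builds one options object and hands it to the modules it calls.
my $opts = MyApp::Options->new_from_args(\@ARGV);
my $beer = 0;
$beer++ if $opts->needs_more_beer;
```

The design point is the same as in the Moose version: modules receive a single object and ask it for what they need, rather than each one keeping its own copy of the argument hash.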
*modulinos are modules that can also be run as stand-alone scripts -- a Perl design pattern by SO's own brian d foy.