
Alter how arguments are processed before they're passed to sub MAIN

Given the documentation and the comments on an earlier question, by request I've made a minimal reproducible example that demonstrates a difference between these two statements:

my %*SUB-MAIN-OPTS = :named-anywhere;
PROCESS::<%SUB-MAIN-OPTS><named-anywhere> = True;

Given a script file with only this:

#!/usr/bin/env raku
use MyApp::Tools::CLI;

and a module file in MyApp/Tools called CLI.pm6:

#PROCESS::<%SUB-MAIN-OPTS><named-anywhere> = True;
my %*SUB-MAIN-OPTS = :named-anywhere;

proto MAIN(|) is export {*}

multi MAIN( 'add', :h( :$hostnames ) ) {
    for @$hostnames -> $host {
        say $host;
    }
}

multi MAIN( 'remove', *@hostnames ) {
    for @hostnames -> $host {
        say $host;
    }
}

The following invocation from the command line does not match any MAIN candidate and prints the usage message instead:

mre.raku add -h=localhost -h=test1

Replacing my %*SUB-MAIN-OPTS = :named-anywhere; with PROCESS::<%SUB-MAIN-OPTS><named-anywhere> = True; prints two lines with the two hostnames provided, as expected.

If, however, everything is in a single file as below, both statements work identically:

#!/usr/bin/env raku

#PROCESS::<%SUB-MAIN-OPTS><named-anywhere> = True;
my %*SUB-MAIN-OPTS = :named-anywhere;

proto MAIN(|) is export {*}

multi MAIN( 'add', :h( :$hostnames )) {
    for @$hostnames -> $host {
        say $host;
    }
}

multi MAIN( 'remove', *@hostnames ) {
    for @hostnames -> $host {
        say $host;
    }
}

I find this hard to understand. When reproducing this, pay attention to how each command must be called:

mre.raku remove localhost test1
mre.raku add -h=localhost -h=test1

So a named array argument is not recognized when my %*SUB-MAIN-OPTS = :named-anywhere; is used in a separate module file, whereas PROCESS::<%SUB-MAIN-OPTS><named-anywhere> = True; always works. For a slurpy array, both statements work identically in both cases.

asked Jun 06 '20 23:06 by acw


2 Answers

The problem is that it isn't the same variable in both the script and in the module.

Sure they have the same name, but that doesn't mean much.

my \A = anon class Foo {}
my \B = anon class Foo {}

A ~~ B; # False
B ~~ A; # False
A === B; # False

Those two classes have the same name, but are separate entities.


If you look at the code for other built-in dynamic variables, you see something like:

Rakudo::Internals.REGISTER-DYNAMIC: '$*EXECUTABLE-NAME', {
    PROCESS::<$EXECUTABLE-NAME> := $*EXECUTABLE.basename;
}

This makes sure that the variable is installed into the right place so that it works for every compilation unit.

If you look for %*SUB-MAIN-OPTS, the only thing you find is this line:

    my %sub-main-opts   := %*SUB-MAIN-OPTS // {};

That looks for the variable in the main compilation unit. If it isn't found it creates and uses an empty Hash.

So when you try to do it in a scope other than the main compilation unit, the variable isn't in a place where that line could find it.
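As a minimal sketch of the lookup behaviour described above (the variable name $*DEMO-OPT is hypothetical and exists only for this illustration), a dynamic variable declared with my is visible only to code called from within its scope, while one installed in PROCESS should be found from anywhere as a fallback:

```raku
# Hypothetical variable name, used only for this illustration.
sub lookup() { $*DEMO-OPT // 'not found' }

say lookup();    # nothing declared yet, so the fallback value is used

{
    my $*DEMO-OPT = 'declared with my';
    say lookup();    # found through the caller chain of this block
}

say lookup();    # the block has been left, so that declaration is out of reach

PROCESS::<$DEMO-OPT> := 'installed in PROCESS';
say lookup();    # found through the process-wide fallback
```

This is the same pattern as the `%*SUB-MAIN-OPTS // {}` line: the lookup either finds something reachable from where it runs, or it falls back to a default.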


To test if adding that fixes the issue, you can add this to the top of the main compilation unit. (The script that loads the module.)

BEGIN Rakudo::Internals.REGISTER-DYNAMIC: '%*SUB-MAIN-OPTS', {
    PROCESS::<%SUB-MAIN-OPTS> := {}
}

Then in the module, write this:

%*SUB-MAIN-OPTS = :named-anywhere;

Or better yet this:

%*SUB-MAIN-OPTS<named-anywhere> = True;

After trying this, it seems to work just fine.


The thing is, something like that used to be there.

It was removed on the grounds that it slowed down every Raku program.

I think, though, that whatever slowdown it caused is still present, because the line that remains still has to check whether a dynamic variable of that name exists.
(There are more reasons given, and I frankly disagree with all of them.)

answered Nov 01 '22 05:11 by Brad Gilbert


May a cuppa bring enlightenment to future SO readers pondering the meaning of things.[1]

Related answers by Liz

I think Liz's answer to an SO asking a similar question may be a good read for a basic explanation of why a my (which is like a lesser our) in the mainline of a module doesn't work, or at least confirmation that core devs know about it.

Her later answer to another SO explains how one can use my by putting it inside a RUN-MAIN.

Why does a slurpy array work by default but not named anywhere?

One rich resource on why things are the way they are is the section Declaring a MAIN subroutine of S06 (Synopsis on Subroutines)[2].

A key excerpt:

As usual, switches are assumed to be first, and everything after the first non-switch, or any switches after a --, are treated as positionals or go into the slurpy array (even if they look like switches).

So it looks like this is where the default behavior, in which nameds can't go anywhere, comes from; it seems that @Larry[3] was claiming that the "usual" shell convention was as described, and implicitly arguing that this should dictate that the default behavior was as it is.

Since Raku was officially released, RFC: Allow subcommands in MAIN put us on the path to today's :named-anywhere option. The RFC presented a very powerful 1-2 punch: an unimpeachable two-line hacker's prose/data argument that quickly led to rough consensus, together with a working code PR with this commit message:

Allow --named-switches anywhere in command line.

Raku was GNU-like in that it has '--double-dashes' and that it stops interpreting named parameters when it encounters '--', but unlike GNU-like parsing, it also stopped interpreting named parameters when encountering any positional argument. This patch makes it a bit more GNU-like by allowing named arguments after a positional, to prepare for allowing subcommands.

> Alter how arguments are processed before they're passed to sub MAIN

In the above linked section of S06 @Larry also wrote:

Ordinarily a top-level Raku "script" just evaluates its anonymous mainline code and exits. During the mainline code, the program's arguments are available in raw form from the @*ARGS array.

The point here being that you can preprocess @*ARGS before they're passed to MAIN.
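A rough sketch of such preprocessing, assuming a deliberately naive rule that is not part of Raku itself: every argument starting with a dash is moved in front of the positionals before MAIN is dispatched. It ignores the -- convention and would also misplace negative numbers, so treat it as an illustration rather than a replacement for :named-anywhere.

```raku
#!/usr/bin/env raku

# Naive, hypothetical preprocessing: put everything that looks like a
# switch first, so the default "switches before positionals" rule is met.
my @switches    = @*ARGS.grep(*.starts-with('-'));
my @positionals = @*ARGS.grep({ !.starts-with('-') });
@*ARGS = |@switches, |@positionals;

# MAIN is called after the mainline with whatever is left in @*ARGS.
multi MAIN( 'add', :h( :$hostnames ) ) {
    for @$hostnames -> $host {
        say $host;
    }
}
```

Called as mre.raku add -h=localhost -h=test1, MAIN would then see the arguments in the order -h=localhost -h=test1 add.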

Continuing:

At the end of the mainline code, however, a MAIN subroutine will be called with whatever command-line arguments remain in @*ARGS.

Note that, as explained by Liz, Raku now has a RUN-MAIN routine that's called prior to calling MAIN.

Then comes the standard argument processing (alterable by using standard options, of which there's currently only the :named-anywhere one, or userland modules such as SuperMAIN which add in various other features).

And finally @Larry notes that:

Other [command line parsing] policies may easily be introduced by calling MAIN explicitly. For instance, you can parse your arguments with a grammar and pass the resulting Match object as a Capture to MAIN.

A doc fix?

Yesterday you wrote a comment suggesting a doc fix.

I now see that we (collectively) know about the coding issue. So why is the doc as it is? I think the combination of your SO and the prior ones provide enough anecdata to support at least considering filing a doc issue to the contrary. Then again Liz hints in one of the SO's that a fix might be coming, at least for ours. And SO is itself arguably doc. So maybe it's better to wait? I'll punt and let you decide. At least you now have several SOs to quote if you decide to file a doc issue.

Footnotes

[1] I want to be clear that if anyone perceives any fault associated with posting this SO then they're right, and the fault is entirely mine. I mentioned to @acw that I'd already done a search so they could quite reasonably have concluded there was no point in them doing one as well. So, mea culpa, bad coffee inspired puns included. (Bad puns, not bad coffee.)

[2] Imo these old historical speculative design docs are worth reading and rereading as you get to know Raku, despite them being obsolete in parts.

[3] @Larry emerged in Raku culture as a fun and convenient shorthand for Larry Wall et al, the Raku language team led by Larry.

answered Nov 01 '22 06:11 by raiph