
Behaviour of require (static + dynamic) [ RAKU ]

My question is related to the behaviour of require when used with static or dynamic resolution of the desired namespace.

I'll try to present my understanding of things:

[ 1 ] Use "require" with a literal

    { require MODULE; }

In this case the compiler checks to see if MODULE has already been declared as a symbol. If it hasn't, the compiler declares it, and binds it to an empty placeholder package it's just created for this "require"

{
    my $response = ::('MODULE');  # this happens at runtime
    say $response.^name;          # MODULE doesn't exist so the lookup results in the compilation-phase placeholder package: MODULE

    try require MODULE;           # although the execution order of require comes after the lookup, 
                                  # the placeholder package creation was done during compilation and the package is present in the current scope during run-time
}

[ 2 ] Use "require" with a String

    { try require 'FILE_PATH'; }

In this case "require" is trying to find (at run-time) a file that is defined by the filename declared in the string. If found (with appropriate content: modules, packages etc.) then it creates a namespace(s) in the current scope and loads it with the content(s) of the file.

[ 3 ] Use "require" with a dynamic lookup

    { try require ::('MODULE'); }

It seems to me that in that case "require" behaves NOT as a "normal" subroutine.

When we use "require" with "dynamic lookup" then the core functionality of the dynamic lookup is "melted" in a new routine that behaves differently than we would expect.

The fact is that the result of the "dynamic lookup" routine is either a symbol or a Failure.

If "require" behaves like a "normal" subroutine, then the only input it could use, would be the result of the dynamic lookup that followed it (Namespace or Failure).

But it is also a fact that in the case of a Failure (as the result of dynamic lookup), "require" continues searching the repositories for a proper package (as is normally the case, using nevertheless the argument we gave to dynamic lookup: 'MODULE').

So obviously "require" isn't behaving like a "normal" subroutine in that sense.
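
The observation above can be checked with a small sketch. It assumes the core Test module is available (it ships with Rakudo) and that nothing has loaded it earlier in the program; the variable names $before and $after are only illustrative.

```raku
# Sketch: compare a plain dynamic lookup with the same string handed to require.
# Assumption: Test is installed and has not been loaded yet in this program.

my $before = ::('Test');          # plain dynamic lookup, nothing loaded yet
say 'before: ', $before ~~ Failure ?? 'no such symbol' !! $before.^name;

require ::('Test');               # same string, but now given to require

my $after = ::('Test');           # dynamic lookup again, after the load
say 'after:  ', $after ~~ Failure ?? 'no such symbol' !! $after.^name;
```

If the reasoning in this section holds, the first lookup should report a missing symbol while the second should resolve to the loaded package, which suggests require uses the string as a module name rather than consuming the result of the lookup.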

As the result of my line of thought the combination of require + dynamic lookup resembles the following construct:

{ modified_dynamic_lookup('MODULE') :if_symbol_not_found_search_repositories_and_if_appropriate_package_found_create_namespace_and_load_package_contents; }

My concern is my understanding of case [3].

How does require + dynamic lookup work? (analytically speaking - what are the steps followed by the compiler at first and then by the runtime?)

[ Post Scriptum ]

I agree with @raiph that "require" is NOT a subroutine and that it is deeply integrated in the language.

In that sense the "dynamic lookup construct" that follows the require "instruction" is used for 2 things:

  1. To notify the compiler that the construct is "dynamic" (so don't bother fixing anything at compile time)

  2. To provide the string that will be used to search for symbols, namespaces, files or repository content

@raiph states that he thinks that "require" does a lookup after a successful load.

My only objection about that is that when we load the same library "require" doesn't throw any exception.

Is it silently ignoring the loaded library? Why bother doing so much work when it can check first that the same namespace is already in use?

By contrast, when we pretend to load a different library, it throws an exception: "duplicate definition" of the symbol in use.

To demonstrate that, I conducted the following experiment:

In ./lib directory I place two libraries, the "foo.pm6" which is a unit definition of "foo" with a class A defined in it:

file "foo.pm6" contents:
-----------------------------------
unit module foo;

class A is export {}

and another library "other.pm6" that has inside a definition of "foo" this time with a different class B defined in it.

file "other.pm6" contents:
-----------------------------------
module foo {
    class B is export {}
}

The raku program file contains the following:

use lib <lib>;

my $name = 'other';           # select one of {'other', 'foo'}

require ::('foo') <A>;        ########> Initial package loading

my $a = try ::('foo::A').new;
say '(1) ' ~ $a.^name;        # (1) foo::A

$a = ::('A').new;
say '(2) ' ~ $a.^name;        # (2) foo::A

try require ::($name);        # if $name eq 'other' => throws exception, if $name eq 'foo' => does nothing
with $! {.say};               # P6M Merging GLOBAL symbols failed: duplicate definition of symbol foo ...

$a = try ::('foo::A').new;
say '(3) ' ~ $a.^name;        # (3) foo::A

$a = ::('A').new;
say '(4) ' ~ $a.^name;        # (4) foo::A

From the example above we see that when we try to reload the foo namespace, hidden in a file with a different name (just to trick raku) it throws an exception.

Therefore I conclude that maybe "require" checks first for a namespace that has the same name as the provided string.

By the way, checking about this, I stumbled upon a strange behaviour. It's the following:

If we use "use foo;" in line: "Initial package loading" instead of "require ::('foo') ;", we get the following results:

(1) foo::A
(2) foo::A
No such symbol 'other' ...

(3) Any
(4) foo::A

The lookup of 'foo::A' in (3) doesn't find anything !!!

Furthermore if I change the library file: "other.pm6" with the following (class A instead of B - as in the foo.pm6)

file "other.pm6" contents:
-----------------------------------
module foo {
    class A is export {}
}

the results seem to revert to what is expected:

(1) foo::A
(2) foo::A
No such symbol 'other' ...

(3) foo::A
(4) foo::A

Is it a bug or something else that I'm missing?

asked Jun 01 '20 11:06 by jakar


2 Answers

Rewritten to correspond to the third version of your question.

[ 1 ] Use "require" with a literal

In this case the compiler checks to see if MODULE has already been declared as a symbol. If it hasn't, the compiler declares it, and binds it to an empty placeholder package it's just created for this "require"

To be a bit more specific, the require keyword, and the code generated by it (see footnote 4), does the work.

And the only reason it's created the symbol is so that one can write that identifier and the code will compile. If require didn't do that then code that uses the identifier would fail to compile even if the require FOO would have succeeded:

require FOO;
my FOO $bar; # Type 'FOO' is not declared

# MODULE doesn't exist so the lookup results in the compilation-phase placeholder package: MODULE

MODULE does exist. And the lookup succeeds. It returns the value bound to the MODULE symbol, which is the placeholder package that require put there during the compilation phase.

# although the execution order of require comes after the lookup

The execution of require's compilation-phase actions came before the lookup which happens during the run phase.

[ 2 ] Use "require" with a String

If found (with appropriate content: modules, packages etc.) then it creates a namespace(s) in the current scope and loads it with the content(s) of the file.

I think the only declaration of symbols require does is the ones the code writer has explicitly written as static identifiers as part of the require statement. Examples:

  • require MODULE <A>; --> MODULE and A.

  • require 'MODULE.pm6' <A>; --> A.

  • require ::('MODULE') <A>; --> A.
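
As a rough sketch of the last case in that list, the following uses the core Test module purely as a convenient stand-in (an assumption for illustration; any module exporting the named subs would do). The module name is dynamic, but the import list is written statically, so the names in it are the ones declared at compile time.

```raku
# Sketch: dynamic module name, static import list.
# Assumption: Test is installed and exports &plan and &ok by default.
require ::('Test') <&plan &ok>;

plan(1);                               # compiles because &plan was named above
ok(True, 'sub imported at run time');  # bound once the load has succeeded
```

Note that no symbol Test is declared at compile time here; only the two sub names from the import list are.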

As I understand it, the Module Loading Subsystem (MLS; see footnote 1), as part of symbol merging (P6M), declares further symbols as necessary. But this work isn't being done by require. It's done by the MLS on its behalf. And it isn't peculiar to require. It's the same (sort of) work that happens during the compile phase as a result of a use statement.

[ 3 ] Use "require" with a dynamic lookup

{ try require ::('MODULE'); }

I have code that attempts to demonstrate that this does not do a lookup before attempting to load the module (see footnote 2).

It seems to me that in that case "require" behaves NOT as a "normal" subroutine.

require is not a routine, normal or otherwise.

say require MODULE;   # Undeclared name:
                      #     MODULE used at line 1
                      # Undeclared routine:
                      #     require used at line 1

If you search for require in the official doc you'll see it's not listed in the Routine Reference section but rather the Modules part of the Language Reference. It's a keyword, a statement, a special part of the language that the compiler understands.

If "require" behaves like a "normal" subroutine, then the only input it could use, would be the result of the dynamic lookup that followed it (Namespace or Failure).

The result of a dynamic lookup is the value bound to a Symbol, if it's declared, or Failure otherwise:

my $variable = 42;
say ::('$variable');           # 42
say ::('nonsense') ~~ Failure; # True

$variable is not a Namespace.

But it is also a fact that in the case of a Failure (as the result of dynamic lookup), "require" continues searching the repositories for a proper package (as is normally the case, using nevertheless the argument we gave to dynamic lookup: 'MODULE').

Given the code I wrote tracking dynamic lookup of the value of ::('MODULE') (see footnote 2), it looks likely to me that there is no dynamic lookup of it by any code, whether require or the MLS, if the module loading fails.

That in turn implies that it only happens, if at all, during or after (successful) loading of a module. So either the MLS is doing it (seems most likely), or, perhaps, require is doing it after the module has been successfully loaded (seems unlikely but I'm not yet ready to 100% eliminate it).

{ modified_dynamic_lookup('MODULE') :if_symbol_not_found_search_repositories_and_if_appropriate_package_found_create_namespace_and_load_package_contents; }

I think I've proven that there is either no lookup at all by require or the MLS, or, if it does it, it's only after a module has been successfully loaded.

what are the steps followed by the compiler at first and then by the runtime?

This answer is of course an attempt to answer that, but my brief compiler code analysis may be of some help (see footnote 3). (Though clicking the link to see the actual code in Actions.nqp is not for the faint of heart!)

[ Post Scriptum ]

In that sense the "dynamic lookup construct" that follows the require "instruction" is used for 2 things:

  1. To notify the compiler that the construct is "dynamic" (so don't bother fixing anything at compile time)

  2. To provide the string that will be used to search for symbols, namespaces, files or repository content

I think it only does 2, just a package name that's passed to the MLS.

when we load the same library "require" doesn't throw any exception. Is it silently ignoring the loaded library?

I don't think require knows anything about it. It hands it off to the MLS and then picks up after the MLS has done its thing. I don't think require can tell the difference between when MLS does a successful fresh load and when it just skips the load. All it knows is whether MLS says all is good or there's an exception.
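
A minimal sketch of the "same library twice" case, again assuming the core Test module as a stand-in for the asker's foo:

```raku
# Sketch: request the same module twice and inspect what the second request raises.
# Assumption: Test is installed; it stands in for any loadable module.
require ::('Test');                 # first request: the module is loaded
try require ::('Test');             # second request for the same module

say $!.defined
    ?? "second require raised: {$!.message}"
    !! 'second require raised nothing';
```

Whatever decides that the second request needs no fresh load, the statement itself only sees success or an exception, which is all this sketch inspects.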

Why bother doing so much work when it can check first that the same namespace is already in use?

Why bother doing any work when the MLS already does it, and require is going to invoke the MLS anyway? Doing anything is wasted effort.

All require has to do is deal with the compile-phase symbols the user has explicitly typed in the require statement. It can't ask the MLS to deal with those because it's got nothing to do with a successful module load, and that's the only scenario in which the MLS goes fiddling with symbols.

By contrast, when we pretend to load a different library, it throws an exception: "duplicate definition" of the symbol in use.

Try this:

require ::('foo');
require ::('other');

Now try it again after renaming the module from foo to bar in both files (unit module bar; in foo.pm6 and module bar { ... } in other.pm6). You'll still get the same exception, but the symbol will be bar. How can require know about bar? It can't. The exception is coming from the MLS and the symbol is only known about by the MLS.

Therefore I conclude that maybe "require" checks first for a namespace that has the same name as the provided string.

Unless you count the MLS as part of require, I trust you can now see that your "maybe" qualification was wise. :)

I stumbled upon a strange behaviour ... The lookup of 'foo::A' in (3) doesn't find anything !!!

I've got an explanation for that. I'm not saying it's right, but it doesn't seem too strange to me as I write this:

The use statement loads the foo.pm6 package. It defines a package foo, which contains a class A, and exports A. That results in a symbol foo in the importing lexical scope, bound to a package that contains a symbol A. It also results in another symbol in the importing lexical scope, A.

The require statement loads the other.pm6 package. It defines a package foo, which contains a class B, and exports B. That results in rebinding the foo symbol in the importing lexical scope to a different package, namely the new package containing the symbol B. It also results in another symbol in the importing lexical scope, B.

The earlier A hangs around. (In other words the P6M symbol merging process doesn't include removing symbols.) But foo::A, which is looked up in the package bound to the foo symbol, no longer exists, because the package bound to the foo symbol is now the one from the other.pm6 package, having overwritten the one from the foo.pm6 package.

In the meantime there's another oddity:

try require ::($name);
with $! {.say};             # No such symbol 'other' ...

I think this reflects require doing a (failed) lookup after a successful module load.

Note that this message does not appear if the module fails to load. This again seems to confirm my thinking (and the code in footnote 2) that require does not do any lookup until after a successful load (if that; I still don't have a strong sense of whether it's the MLS or require that's doing this, and the code discussed in footnote 4 is too complex for me at the moment).

Responses to your comments

From your comments on this answer:

Its like we get as the result of the amalgamation of require + 'dynamic lookup formulation' an enhanced dynamic lookup like this { ::('something') :if_not_found_as_namespace_check_repositories_and_load }

That doesn't ring true for me for various reasons.

For example, presume there's a package foo declared as module foo { our sub bar is export { say 99 } } that will successfully load if required. Now consider this code:

my \foo = 42;
say ::('foo');             # 42
require ::('foo') <&bar>;
say foo;                   # 42
bar;                       # 99

This makes sense to me. It won't have loaded a package whose name is 42. It won't have looked up the symbol foo. It will have loaded the package whose name is foo. And while it presumably will have looked up symbol foo after loading the package, it won't have installed a symbol foo because there's already one.

Footnotes

1 By Module Loading Subsystem I mean the various parts that, given a module name, do things like searching the local file system, or a database, checking precompilation directories, invoking compilation, and merging symbols if a module successfully loads. I don't know where the boundaries are between the parts, and the parts and the compiler. But I'm confident they are not part of require but merely invoked by it.


2 Run this code:

my \MODULE =
  { my $v;
    Proxy.new:
      FETCH => method { say "get name: $v"; $v },
      STORE => method ($n) { say "set name: $n"; $v = $n }}();

MODULE = 'unseen by `require`';
say ::('MODULE');

use lib '.';
say 'about to `require`';
require ::('MODULE');

3 We start with the relevant match in Raku's Grammar.nqp file:

  rule statement_control:sym<require> {
        <sym>
        [
        | <module_name>
        | <file=.variable>
        | <!sigil> <file=.term>
        ]
        <EXPR>?
    }

The code seems to follow what we expect -- a require keyword followed by either:

  • a package identifier (<module_name>); or

  • a <variable> (eg $foo); or

  • a <term> that doesn't start with a <sigil>.

We're interested in the <module_name> branch. It calls token module_name which calls token longname which calls token name:

token name {
        [
        | <identifier> <morename>*
        | <morename>+
        ]
}

Clearly ::('foo') doesn't begin with an <identifier>. So it's token morename. I'll cut out a few uninteresting lines to leave:

    token morename {
        '::'
        [
        ||  <?before '(' | <.alpha> >
            [
            | <identifier>
            | :dba('indirect name') '(' ~ ')' [ <.ws> <EXPR> ]
            ]
        ]?
    }

Bingo. That'll match ::(, in particular the :dba('indirect name') '(' ~ ')' [ <.ws> <EXPR> ] bit.

So at this point we'll have captured:

statement_control:sym<require><module_name><longname><name><morename><EXPR>

A short while later the statement_control:sym<require> token will be about to succeed. So at that point it will call the corresponding action method in Actions.nqp...


4 In Actions.nqp we find the action corresponding to token statement_control:sym<require>, namely method statement_control:sym<require>. The opening if $<module_name> { conditional will be True, leading to running this code:

$longname := $*W.dissect_longname($<module_name><longname>);
$target_package := $longname.name_past;

It looks to me like this code is dissecting the result of parsing ::('foo'), and binding AST corresponding to that dissection to $target_package, rather than bothering to do a lookup or prepare a run-time lookup.

If I'm right, the ::('foo') does not need to be anything more than 9 characters that require gets to interpret however it fancies interpreting them. There's no necessary implication here it does any particular thing, such as a lookup, as it constructs the package loading code.


The latter half of the action does do lookups. There are lines like this:

?? self.make_indirect_lookup($longname.components())

and given the routine name I presume that that is doing a lookup, perhaps as part of where require attempts to add a package symbol if the package load succeeds.

answered Sep 19 '22 23:09 by raiph


require does some things during compilation if it can.

require Module;
say Module;

It assumes that loading that module will give you something with the name of Module.

So it installs a temporary symbol with that name at compile time.

That is the only thing it does at compile time.
(So I fibbed when I said “some things”.)

if Bool.pick {
    require module-which-does-not-exist;

    module-which-does-not-exist.method-call()
}

About half of the time the above does nothing.
The other half of the time it complains at run-time that it can't find the module.

(I chose Bool.pick instead of False so the compile-time optimizer definitely can't optimize it away.)
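
Because the failure only happens at run time, it can be handled like any other exception. The sketch below assumes a made-up module name, Module::Assumed::Missing, that is not installed; the $caught flag is only there to make the outcome visible.

```raku
# Sketch: the name compiles thanks to the temporary symbol,
# and the missing module only surfaces as a run-time exception.
my $caught = False;
{
    # Hypothetical module name, assumed not to be installed anywhere.
    require Module::Assumed::Missing;

    CATCH {
        default {
            $caught = True;
            say 'load failed at run time: ', .message;
        }
    }
}
say 'program continues, caught = ', $caught;
```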


When you call it with something other than an identifier, it doesn't know at compile-time what the module will be. So it can't create a temporary namespace.

require 'Module';
say Module; # COMPILE ERROR: undeclared name

require Module; # RUNTIME ERROR: can't find 'Module'
say Module;

require 'Module'; # RUNTIME ERROR: can't find 'Module'
say ::('Module');

if False {
    require Module;
    say Module;
}
# no error at all

if False {
    require 'Module';
    say ::('Module');
}
# no error at all
answered Sep 20 '22 23:09 by Brad Gilbert