In Leroy's paper on how recursive modules are typed in OCaml, it is written that modules are checked in an environment made of approximations of module types:
module rec A = ... and B = ... and C = ...
An environment {A -> approx(A); B -> approx(B); C -> approx(C) } is first built, and then used to compute the types of A, B and C.
I noticed that, in some cases, approximations are not good enough, and typechecking fails. In particular, when putting compilation units sources in a recursive module definition, typechecking can fail whereas the compiler was able to compile the compilation units separately.
Coming back to my first example, I found that a solution would be to type A in the initial approximated environment, but then to type B in that initial environment extended with the new computed type of A, and to type C in the previous env with the new computed type of B, and so on.
Before investigating more, I would like to know if there is any prior work on this subject, showing that such a compilation scheme for recursive modules is either safe or unsafe ? Is there a counter-example showing an unsafe program correctly typed with this scheme ?
First, note that Leroy (or Ocaml) does not allow module rec
without explicit signature annotations. So it's rather
module rec A : SA = ... and B : SB = ... and C : SC = ...
and the approximative environment is {A : approx(SA), B : approx(SB), C : approx(SC)}.
It is not surprising that some modules type-check when defined separately, but not when defined recursively. After all, the same is already true for core-language declarations: in a 'let rec', recursive occurrences of the bound variables are monomorphic, while with separated 'let' declarations, you can use previous variables polymorphically. Intuitively, the reason is that you cannot have all the knowledge that you'd need to infer the more liberal types before you have actually checked the definitions.
Regarding your suggestion, the problem with it is that it makes the module rec
construct unsymmetric, i.e. order would matter in potentially subtle ways. That violates the spirit of recursive definitions (at least in ML), which should always be indifferent to ordering.
In general, the issue with recursive typing is not so much soundness, but rather completeness. You don't want a type system that is either undecidable in general, or whose specification is dependent on algorithmic artefacts (like checking order).
On a more general note, it is well-known that Ocaml's treatment of recursive modules is rather restrictive. There has been work, e.g. by Nakata & Garrigue, that pushes its limits further. However, I am convinced that ultimately, you won't be able to get as liberal as you'd like (and that applies to other aspects of its type module system as well, e.g. functors) without abandoning Ocaml's purely syntactic approach to module typing. But then, I'm biased. :)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With