
How can an OCaml module export a field defined in a dependent module?

Tags: module, ocaml

I have a decomposition where module A defines a record type and exports a value of this type, which is itself defined in module B:

a.ml:

type t = {
  x : int
}

let b = B.a

b.ml:

open A (* to avoid fully qualifying fields of a *)
let a : t = {
  x = 1;
}

Circular dependence is avoided, since B only depends on type declarations (not values) in A.

a.mli:

type t = {
  x : int
}

val b : t

As far as I know, this should be kosher. But the compiler errors out with this:

File "a.ml", line 1, characters 0-1:
Error: The implementation a.ml does not match the interface a.cmi:
       Values do not match: val b : A.t is not included in val b : t

This is all particularly obtuse, of course, because it is unclear which val b is interpreted as having type t and which has type A.t (and to which A--the interface definition or the module definition--this refers).

I'm assuming there is some arcane rule (along the lines of the "structure fields must be referenced by fully module-qualified name when the module is not opened" semantics which bite every OCaml neophyte at some point), but I am so far at a loss.

asked Dec 05 '11 07:12 by jrk


2 Answers

Modules under the microscope are more subtle than they appear

(If your eyes glaze over at some point, skip to the second section.)

Let's see what would happen if you put everything in the same file. This should be possible since separate compilation units do not increase the power of the type system. (Note: use separate directories for this and for any test with files a.* and b.*, otherwise the compiler will see the compilation units A and B, which may be confusing.)

module A = (struct
    type t = { x : int }
    let b = B.a
  end : sig
    type t = { x : int }
    val b : t
  end)
module B = (struct
    let a : A.t = { A.x = 1 }
  end : sig
    val a : A.t
  end)

Oh, well, this can't work. It's obvious that B is not defined here. We need to be more precise about the dependency chain: define the interface of A first, then the interface of B, then the implementations of B and A.

module type Asig = sig
    type t = { x : int }
    type u = int
    val b : t
  end
module B = (struct
    let a : Asig.t = { Asig.x = 1 }
  end : sig
    val a : Asig.t
  end)
module A = (struct
    type t = { x : int }
    let b = B.a
  end : Asig)

Well, no.

File "d.ml", line 7, characters 12-18:
Error: Unbound type constructor Asig.t

You see, Asig is a signature. A signature is a specification of a module, and no more; there is no calculus of signatures in OCaml. You cannot refer to fields of a signature. You can only refer to fields of a module. When you write A.t, this refers to the type field named t of the module A.

In OCaml, it is fairly rare for this subtlety to arise. But you tried poking at a corner of the language, and this is what's lurking there.
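
To make the distinction concrete, here is a minimal sketch (the names S, M and ok are made up for illustration): the type component can be reached through a module that implements the signature, but not through the signature itself.

```ocaml
(* A signature only specifies a module; it has no components to project. *)
module type S = sig
  type t = { x : int }
  val b : t
end

(* Rejected if uncommented, with: Unbound type constructor S.t
   let bad : S.t = M.b *)

(* A module implementing S does have a type component named t. *)
module M : S = struct
  type t = { x : int }
  let b = { x = 1 }
end

(* M.t is a legal type expression because M is a module. *)
let ok : M.t = M.b

let () = Printf.printf "ok.x = %d\n" ok.M.x
```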

So what's going on then when there are two compilation units? A closer model is to see A as a functor which takes a module B as an argument. The required signature for B is the one described in the interface file b.mli. Similarly, B is a functor which takes as an argument a module A whose signature is given in a.mli. Oh, wait, it's a bit more involved: A appears in the signature of B, so the interface of B is really defining a functor that takes an A and produces a B, so to speak.

module type Asig = sig
    type t = { x : int }
    type u = int
    val b : t
  end
module type Bsig = functor(A : Asig) -> sig
    val a : A.t
  end
module B = (functor(A : Asig) -> (struct
    let a : A.t = { A.x = 1 }
  end) : Bsig)
module A = functor(B : Bsig) -> (struct
    type t = { x : int }
    let b = B.a
  end : Asig)

And here, when defining A, we run into a problem: we don't have an A yet, to pass as an argument to B. (Barring recursive modules, of course, but here we're trying to see why we can't get by without them.)
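
For reference, the recursive-module route set aside above might look like the following single-file sketch. It is an illustration under assumptions, not a drop-in fix for the two-file layout: the values b and a are turned into functions of unit (a change not present in the question), because OCaml requires every cycle of recursive modules to pass through a module whose exported values all have function types.

```ocaml
(* Sketch only: A and B defined together as recursive modules.
   The unit arguments are an assumption added for this example. *)
module rec A : sig
  type t = { x : int }
  val b : unit -> t
end = struct
  type t = { x : int }
  (* Inside A, the local t and A.t are treated as the same type. *)
  let b () = B.a ()
end

and B : sig
  val a : unit -> A.t
end = struct
  let a () : A.t = { A.x = 1 }
end

let () = Printf.printf "x = %d\n" (A.b ()).A.x
```

Both modules are written explicitly with their signatures, which recursive modules require; nothing is evaluated across the cycle until A.b is called after both definitions are complete.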

Defining a generative type is a side effect

The fundamental sticking point is that type t = {x : int} is a generative type definition. If this fragment appears twice in a program, two different types are defined. (OCaml forbids defining two types with the same name in the same module, except at the toplevel.)

In fact, as we've seen above, type t = {x : int} in a module implementation is a generative type definition. It means “define a new type, called t, which is a record type with the fields …”. That same syntax can appear in a module interface, but there it has a different meaning: there, it means “the module defines a type t which is a record type …”.

Since defining a generative type twice creates two distinct types, the particular generative type that is defined by A cannot be fully described by the specification of the module A (its signature). Hence any part of the program that uses this generative type is really using the implementation of A and not just its specification.
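
A small sketch of generativity, using hypothetical modules M1, M2 and M3: two textually identical record definitions yield incompatible types, unless the second one is explicitly declared equal to the first.

```ocaml
(* Two identical definitions create two distinct types. *)
module M1 = struct
  type t = { x : int }
  let v = { x = 1 }
end

module M2 = struct
  type t = { x : int }
  let v = { x = 2 }
end

(* Rejected if uncommented: M2.v has type M2.t, not M1.t.
   let bad : M1.t = M2.v *)

(* Re-exporting with a manifest equation does not create a fresh type:
   M3.t is declared to be the very same type as M1.t. *)
module M3 = struct
  type t = M1.t = { x : int }
  let v = { x = 3 }
end

let ok : M1.t = M3.v

let () = Printf.printf "%d %d %d\n" M1.v.M1.x M2.v.M2.x ok.M1.x
```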

When you get down to it, defining a generative type is a form of side effect. This side effect happens at compile time or at program initialization time (the distinction between these two only appears when you start looking at functors, which I shall not do here). So it is important to keep track of when this side effect happens: it happens when the module A is defined (compiled or loaded).

So, to express this more concretely: the type definition type t = {x : int} in the module A is compiled into “let t be type #1729, a fresh type which is a record type with a field …”. (A fresh type means one that is different from any type that has ever been defined before.) The definition of B defines a to have the type #1729.

Since the module B depends on the module A, A must be loaded before B. But the implementation of A clearly uses the implementation of B. The two are mutually recursive. OCaml's error message is a little confusing, but you are indeed overstepping the bounds of the language.

answered Nov 03 '22 01:11 by Gilles 'SO- stop being evil'


(and to which A--the interface definition or the module definition--this refers).

A refers to the whole module A. With the normal build procedure it would refer to the implementation in a.ml constrained by the signature in a.mli. But if you are playing tricks such as moving .cmi files around, you are on your own :)

As far as I know, this should be kosher.

I personally classify this as a circular dependency and would strongly advise against structuring the code in such a way. IMHO it causes more problems and head-scratching than it solves. E.g. moving the shared type definitions to type.ml and being done with it is what first comes to mind. What is your original problem that leads to such structuring?
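
As a rough single-file sketch of that restructuring (the module name Types stands in for a separate file such as types.ml and is an assumption, as is the choice to re-export the type from A):

```ocaml
(* Stand-in for a separate file holding only the shared type. *)
module Types = struct
  type t = { x : int }
end

(* B depends only on Types, no longer on A. *)
module B = struct
  open Types
  let a : t = { x = 1 }
end

(* A depends on Types and B; it re-exports the type so that
   existing users of A.t keep working. *)
module A : sig
  type t = Types.t = { x : int }
  val b : t
end = struct
  type t = Types.t = { x : int }
  let b = B.a
end

let () = Printf.printf "A.b.x = %d\n" A.b.A.x
```

The dependency graph becomes Types, then B, then A, with no cycle left for the compiler to reject.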

answered Nov 02 '22 23:11 by ygrek