Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

In OCaml, what is the difference between `'a.` and `type a.` and when to use each?

OCaml has several different syntaxes for a polymorphic type annotation :

let f :         'a -> 'a = … (* Isn’t this one already polymorphic? (answer: NO) *)
let f : 'a.     'a -> 'a = …
let f : type a.  a ->  a = …

We often see them when using fancy algebraic datatypes (typically, GADTs), where they seem to be necessary.

What is the difference between these syntaxes? When and why each one must be used?

like image 620
Maëlan Avatar asked Sep 11 '21 15:09

Maëlan


People also ask

What does type A mean in OCaml?

The type 'a is a type variable, and stands for any given type. The reason why sort can apply to lists of any type is that the comparisons (=, <=, etc.) are polymorphic in OCaml: they operate between any two values of the same type. This makes sort itself polymorphic over all list types.

What is type unit in OCaml?

unit — The Unit Value The built-in printing functions return type unit, for example; the value of a while or for loop (both of which are expressions, not "statements") is also of type unit. OCaml uses this convention to help catch more type errors.

What is type check OCaml?

OCaml has structural typing for objects rather than nominative typing as in Java. So the type of an object is basically determined (and only determined) by its methods. Objects in OCaml can be created directly, without going through something like a class.

What does well typed mean in OCaml?

The OCaml type system is also substantially more precise that than the type systems of C or Java. In order for an expression to be well-typed, the types expected by functions and operators must coincide with the types of their arguments.

How does the OCaml system compute the value and type?

The OCaml system computes both the value and the type for each phrase. Even function parameters need no explicit type declaration: the system infers their types from their usage in the function.

How do you write a function with multiple arguments in OCaml?

The OCaml notation for the type of a function with multiple arguments is arg1_type -> arg2_type ->... -> return_type. For example, the type inferred for insert, 'a -> 'a list -> 'a list, means that insert takes two arguments, an element of any type 'a and a list with elements of the same type 'a and returns a list of the same type.

Is OCaml a strongly typed language?

OCaml is a strongly statically typed language (in other words, there's nothing dynamic going on between int, float and string, as in Perl). OCaml uses type inference to work out the types, so you don't have to. If you use the OCaml interactive toplevel as above, then OCaml will tell you its inferred type for your function.

How does the comparison operator work in OCaml?

The comparison operators are polymorphicmeaning they work on most built-in data types (except where it doesn't make sense; e.g. you can't (usefully) compare two functions for equality). Note however that due to the strongly-typed nature of OCaml, you can't compare two values of different types; e.g. 1 = "foo"is not falsebut rather a type error.


Video Answer


3 Answers

Below are alternative explanations with a varying amount of detail, depending on how much of a hurry you’re in. ;-)

I will use the following code (drawn from that other question) as a running example. Here, the type annotation on the definition of reduce is actually required to make it typecheck.

(* The type [('a, 'c) fun_chain] represents a chainable list of functions, i.e.
 * such that the output type of one function is the input type of the next one;
 * ['a] and ['c] are the input and output types of the whole chain.
 * This is implemented as a recursive GADT (generalized algebraic data type). *)
type (_, _) fun_chain =
  | Nil  : ('a, 'a) fun_chain
  | Cons : ('a -> 'b) * ('b, 'c) fun_chain -> ('a, 'c) fun_chain

(* [reduce] reduces a chain to just one function by composing all
 * functions of the chain. *)
let rec reduce : type a c. (a, c) fun_chain -> a -> c =
  fun chain x ->
    begin match chain with
    | Nil              -> x
    | Cons (f, chain') -> reduce chain' (f x)
    end

The short story

On let-definitions, an annotation like : 'a -> 'a does not force polymorphism: the type-checker may refine the unification variable 'a to something. This bit of syntax is misleading indeed, because the same annotation on a val-declaration i.e. in a module signature does enforce polymorphism.

: type a. … is a type annotation with explicit (forced) polymorphism. You can think of this as the universal quantifier (∀ a, “for all a“). For instance,

let some : type a. a -> a option =
  fun x -> Some x

means that “for all” type a, you can give an a to some and then it will return an a option.

The code at the beginning of this answer makes use of advanced features of the type system, namely, polymorphic recursion and branches with different types, and that leaves type inference at a loss. In order to have a program typecheck in such a situation, we need to force polymorphism like this. Beware that in this syntax, a is a type name (no leading quote) rather than a type unification variable.

: 'a. … is another syntax that forces polymorphism, but it is practically subsumed by : type a. … so you will hardly need it at all.

The pragmatic story

: type a. … is a short-hand syntax that combines two features:

  1. an explicitly polymorphic annotation : 'a. …
    • useful for ensuring a definition is as general as intended
    • required when recursion is done with type parameters different from those of the initial call (“polymorphic recursion” i.e. recursion on “non-regular” ADTs)
  2. a locally abstract type (type a) …
    • required when different branches have different types (i.e. when pattern-matching on “generalized” ADTs)
    • allows you to refer to type a from inside the definition, typically when building a first-class module (I won’t say more about this)

Here we use the combined syntax because our definition of reduce falls under both situations in bold.

  1. We have polymorphic recursion because Cons builds a (a, c) fun_chain from a (b, c) fun_chain: the first type parameter differs (we say that fun_chain is a “non-regular” ADT).

  2. We have branches with different types because Nil builds a (a, a) fun_chain whereas Cons builds a (a, c) fun_chain (we say that fun_chain is a “generalized” ADT, or GADT for short).

Just to be clear: : 'a. … and : type a. … produce the same signature for the definition. Choosing one syntax or the other only has an influence on how its body is typechecked. For most intents and purposes, you can forget about : 'a. … and just remember the combined form : type a. …. Alas, the latter does not completely subsume the former, there are rare situations where writing : type a. … wouldn’t work and you would need : 'a. … (see @octachron’s answer) but, hopefully, you won’t stumble upon them often.

The long story

Explicit polymorphism

OCaml type annotations have a dirty little secret: writing let f : 'a -> 'a = … doesn’t force f to be polymorphic in 'a. The compiler unifies the provided annotation with the inferred type and is free to instantiate the type variable 'a while doing so, leading to a less general type than intended. For instance let f : 'a -> 'a = fun x -> x+1 is an accepted program and leads to val f : int -> int. To ensure the function is indeed polymorphic (i.e. to have the compiler reject the definition if it is not general enough), you have to make the polymorphism explicit, with the following syntax:

let f : 'a. 'a -> 'a = …

For a non-recursive definition, this is merely the human programmer adding a constraint which makes more programs be rejected.

In the case of a recursive definition however, this has another implication. When typechecking the body, the compiler will unify the provided type with the types of all occurrences of the function being defined. Type variables which are not marked as polymorphic will be made equal in all recursive calls. But polymorphic recursion is precisely when we recurse with differing type parameters; without explicit polymorphism, that would either fail or infer a less general type than intended. To make it work, we explicitly mark which type variables should be polymorphic.

Note that there is a good reason why OCaml cannot typecheck polymorphic recursion on its own: there is undecidability around the corner (see Wikipedia for references).

As an example, let’s do the job of the typechecker on this faulty definition, where polymorphism is not made explicit:

(* does not typecheck! *)
let rec reduce : ('a, 'c) fun_chain -> 'a -> 'c =
  fun chain x ->
    begin match chain with
    | Nil              -> x
    | Cons (f, chain') -> reduce chain' (f x)
    end

We start with reduce : ('a, 'c) fun_chain -> 'a -> 'c and chain : ('a, 'c) fun_chain for some type variables 'a and 'c.

  • In the first branch, chain = Nil, so we learn that in fact chain : ('c, 'c) fun_chain and 'a == 'c. We unify both type variables. (That doesn’t matter right now, though.)

  • In the second branch, chain = Cons (f, chain') so there exists an arbitrary type b such that f : 'a -> b and chain' : (b, 'c) fun_chain. Then we must typecheck the recursive call reduce chain', so the expected argument type ('a, 'c) fun_chain must unify with the provided argument type (b, 'c) fun_chain; but nothing tells us that b == 'a. So we reject this definition, preferably (as is tradition) with a cryptic error message:

Error: This expression has type ($Cons_'b, 'c) fun_chain
       but an expression was expected of type ('c, 'c) fun_chain
       The type constructor $Cons_'b would escape its scope

If now we make polymorphism explicit:

(* still does not typecheck! *)
let rec reduce : 'a 'c. ('a, 'c) fun_chain -> 'a -> 'c =
  …

Then typechecking the recursive call is not a problem anymore, because we now know that reduce is polymorphic with two “type parameters” (non-standard terminology), and these type parameters are instantiated independently at each occurrence of reduce; the recursive call uses b and 'c even though the enclosing call uses 'a and 'c.

Locally abstract types

But we have a second problem: the other branch, for constructor Nil, has made 'a be unified with 'c. Hence we end up inferring a less general type than what the annotation mandated, and we report an error:

Error: This definition has type 'c. ('c, 'c) fun_chain -> 'c -> 'c
       which is less general than 'a 'c. ('a, 'c) fun_chain -> 'a -> 'c

The solution is to turn the type variables into locally abstract types, which cannot be unified (but we can still have type equations about them). That way, type equations are derived locally to each branch and they do not transpire outside of the match with construct.

like image 194
Maëlan Avatar answered Oct 23 '22 02:10

Maëlan


The practical answer when hesitating between 'a . ... and type a. ... is to always use the latter form:

  • type a. ... works with:
    • polymorphic recursion
    • GADTs
    • raise type errors early

whereas:

  • 'a. ... works with
    • polymorphic recursion
    • polymorphic quantification over row type variables

Thus type a. ... is generally a strictly superior version of 'a . ... .

Except for the last strange point. For the sake exhaustiveness, let me give an example of quantification over a row type variable:

let f: 'a. ([> `X ] as 'a) -> unit = function
  | `X -> ()
  | _ -> ()

Here the universal quantification allows us to control precisely the row variable type. For instance,

let f: 'a. ([> `X ] as 'a) -> unit = function
  | `X | `Y -> ()
  | _ -> ()

yields the following error

Error: This pattern matches values of type [? `Y ]
      but a pattern was expected which matches values of type [> `X ]
      The second variant type is bound to the universal type variable 'a,
      it may not allow the tag(s) `Y

This use case is not supported by the form type a. ... mostly because the interaction of locally abstract type, GADTs type refinement and type constraints has not been formalized. Thus this second exotic use case is not supported.

like image 32
octachron Avatar answered Oct 23 '22 03:10

octachron


TL;DR; In your question, only the last two forms are polymorphic type annotations. The latter of these two forms, in addition to annotating a type as polymorphic, introduces a locally abstract type1. This is the only difference.

The longer story

Now let's speak a little bit about the terminology. The following is not a type annotation (or, more properly, doesn't contain any type annotations),

let f :         'a -> 'a = …

It is called a type constraint. A type constraint requires the type of the defined value to be compatible with the specified type schemata.

In this definition,

let f : 'a.     'a -> 'a = …

we have a type constraint that includes a type annotation. The phrase "type annotation" in OCaml parlance means: annotating a type with some information, i.e., attaching some attribute or a property to a type. In this case, we annotate type 'a as polymorphic. We're not annotating the value f as polymorphic neither are we annotating the value f with type 'a -> 'a or 'a. 'a -> 'a. We are constraining the value of f to be compatible with type 'a -> 'a and annotate 'a as a polymorphic type variable.

For a long time, syntax 'a. was the only way to annotate type as polymorphic, but later OCaml introduced locally abstract types. They have the following syntax, which you could also add to your collection.

let f (type t) : t -> t = ...

Which creates a fresh abstract type constructor that you can use in the scope of the definition. It doesn't annotate t as polymorphic though, so if you want it to be explicitly annotated as polymorphic you could write,

let f : 'a. 'a -> 'a = fun (type t) (x : t) : t -> ...

which includes both an explicit type annotation of 'a as polymorphic and the introduction of a locally abstract type. Needless to say, it is cumbersome to write such constructions, so a little bit later (OCaml 4.00) they introduced syntactic sugar for that so that the above expression could be written as simple as,

let f : type t. t -> t = ...

Therefore, this syntax is just an amalgamation of two rather orthogonal features: locally abstract types and explicitly polymorphic types.

It is not however that the result of this amalgamation is stronger than its parts. It is more like an intersection. Whilst the generated type is both locally abstract and polymorphic, it is constrained to be a ground type. In other words, it constrains the kind of the type, but this is a completely different problem of higher-kinded polymorphism.

And to conclude the story, despite the syntax similarities, the following is not a type annotation,

val f : 'a -> 'a

It is called a value specification, which is a part of a signature, and it denotes that the value f has type 'a -> 'a.


1)) Locally abstract types have two main use cases. First, you can use them inside your expression in places where type variables are not permitted, e.g., in modules and exceptions definitions. Second, the scope of the locally abstract type exceeds the scope of the function, which you can employ by unifying types that are local to your expression with the abstract type to extend their scopes. The underlying idea is that the expression can not outlive its type and since in OCaml types could be created in runtime we have to be careful with the extent of the type as well. Unifying a locally created type with a locally abstract type via a function parameter guarantees that this type will be unified with some existing type in the place of the function application. Intuitively, it is like passing a reference for a type, so that the type could be returned from the function.

like image 23
ivg Avatar answered Oct 23 '22 01:10

ivg