Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Open and closed union types in Ocaml

I'm looking into OCaml for the first time, having a bit of background with F# and Haskell. As such, a lot is familiar-looking, but one thing that isn't is the concept of "open" and "closed" unions (with the backtick and [< syntax).

What are these useful for and how often are they used?

like image 984
J Cooper Avatar asked Feb 27 '11 07:02

J Cooper


1 Answers

gasche's answer has good advice. I'm going to explain open and closed unions a bit more.

First, you need to distinguish the two kinds of unions: basic variants (no backtick) and polymorphic variants (with backtick).

  • Basic variants are generative: if you define two types with the same constructor names in different modules M1 and M2, you have different types. M1.Foo and M2.Foo are different constructors. `Foo is always the same constructor no matter where you use it.
  • Apart from this, polymorphic variants can do everything basic variants can do and more. But with great power comes great complexity, so you should use them only when necessary and carefully.

A polymorphic variant type describes what constructors the type may have. But many polymorphic variant types are not fully known — they contain (row) type variables. Consider the empty list []: its type is 'a list, and it can be used in many contexts that assign more specific types to 'a. For example:

# let empty_list = [];;
val empty_list : 'a list = []
# let list_of_lists = [] :: empty_list;;
val list_of_lists : 'a list list = [[]]
# let list_of_integers = 3 :: empty_list;;
val list_of_integers : int list = [3]

The same holds for the row type variables. An open type, written [> … ], has a row variable that can be instantiated to cover more constructors each time the value is used.

# let foo = `Foo;;
val foo : [> `Foo ] = `Foo
# let bar = `Bar;;
val bar : [> `Bar ] = `Bar
# let foobar = [foo; bar];;
val foobar : [> `Bar | `Foo ] list = [`Foo; `Bar]

Just because a constructor appears in a type doesn't mean every use of that type has to allow all constructors. [> …] says that a type must have at least these constructors, and dually [< …] says that a type must have at most these constructors. Consider this function:

# let h = function `Foo -> `Bar | `Bar -> `Foo;;
val h : [< `Bar | `Foo ] -> [> `Bar | `Foo ] = <fun>

h is only capable of handling Foo and Bar, so the input type may not allow other constructors; but it's ok to call h on a type that only allows Foo. Conversely, h may return Foo or Bar, and any context where h is used must allow both Foo and Bar (and may allow others).

Closed types arise when there are matching minimum and maximum constructor requirements on a type. For example, let's add the constraint that h must have the same input and output type:

# let hh : 'a -> 'a = function `Foo -> `Bar | `Bar -> `Foo;;
val hh : [ `Bar | `Foo ] -> [ `Bar | `Foo ] = <fun>

Closed types rarely arise naturally from type inference. Most of the time, like here, they are a consequence of a user annotation. When you use polymorphic annotations, it's a good idea to define named types and use them at least on every toplevel function. Otherwise the inferred type is likely to be a little more general than you thought. While this rarely opens the way to bugs, it often means that any type error will be diagnosed later than it could have been, and will generate a very long error message where it will be hard to find the helpful bits.

I recommend reading and working through (i.e. retype the examples in the toplevel, play around a bit to make sure you understand each step) the polymorphic variant tutorial in the Ocaml manual.

like image 97
Gilles 'SO- stop being evil' Avatar answered Oct 11 '22 15:10

Gilles 'SO- stop being evil'