Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How are variable names chosen in type signatures inferred by GHC?

Tags:

haskell

ghc

When I play with checking types of functions in Haskell with :t, for example like those in my previous question, I tend to get results such as:

Eq a => a -> [a] -> Bool
(Ord a, Num a, Ord a1, Num a1) => a -> a1 -> a
(Num t2, Num t1, Num t, Enum t2, Enum t1, Enum t) =>  [(t, t1, t2)]

It seems that this is not such a trivial question - how does the Haskell interpreter pick literals to symbolize typeclasses? When would it choose a rather than t? When would it choose a1 rather than b? Is it important from the programmer's point of view?

like image 231
TheMP Avatar asked Jul 02 '14 16:07

TheMP


1 Answers

The names of the type variables aren't significant. The type:

Eq element => element -> [element] -> Bool

Is exactly the same as:

Eq a => a -> [a] -> Bool

Some names are simply easier to read/remember.

Now, how can an inferencer choose the best names for types?

Disclaimer: I'm absolutely not a GHC developer. However I'm working on a type-inferencer for Haskell in my bachelor thesis.

During inferencing the names chosen for the variables aren't probably that readable. In fact they are almost surely something along the lines of _N with N a number or aN with N a number.

This is due to the fact that you often have to "refresh" type variables in order to complete inferencing, so you need a fast way to create new names. And using numbered variables is pretty straightforward for this purpose.

The names displayed when inference is completed can be "pretty printed". The inferencer can rename the variables to use a, b, c and so on instead of _1, _2 etc.

The trick is that most operations have explicit type signatures. Some definitions require to quantify some type variables (class, data and instance for example). All these names that the user explicitly provides can be used to display the type in a better way.

When inferencing you can somehow keep track of where the fresh type variables came from, in order to be able to rename them with something more sensible when displaying them to the user. An other option is to refresh variables by adding a number to them. For example a fresh type of return could be Monad m0 => a0 -> m0 a0 (Here we know to use m and a simply because the class definition for Monad uses those names). When inferencing is finished you can get rid of the numbers and obtain the pretty names.

In general the inferencer will try to use names that were explicitly provided through signatures. If such a name was already used it might decide to add a number instead of using a different name (e.g. use b1 instead of c if b was already bound).

There are probably some other ad hoc rules. For example the fact that tuple elements have like t, t1, t2, t3 etc. is probably something done with a custom rule. In fact t doesn't appear in the signature for (,,) for example.

like image 180
Bakuriu Avatar answered Sep 22 '22 12:09

Bakuriu