
Why aren't Haskell variables polymorphic when bound by pattern matching?

Consider a variable introduced in a pattern, such as f in this Haskell example:

case (\x -> x) of f -> (f True, f 'c')

This code results in a type error ("Couldn't match expected type ‘Bool’ with actual type ‘Char’"), because of the two different uses of f. It shows that the inferred type of f is not polymorphic in Haskell.
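
For comparison (this is easy to check in GHC), binding the same function with let instead of case is accepted, because let-bound variables are generalized:

    -- Accepted: the let-bound f is generalized to forall a. a -> a,
    -- so both uses typecheck.
    pairLet :: (Bool, Char)
    pairLet = let f = \x -> x in (f True, f 'c')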

But why shouldn't f be polymorphic?

I have two points of comparison: OCaml and "textbook" Hindley-Milner. Both suggest that f ought to be polymorphic.

  1. In OCaml, the analogous code is not an error:

    match (fun x -> x) with f -> (f true, f 'c')
    

    This evaluates to (true, 'c') with type bool * char. So it looks like OCaml gets along fine with assigning f a polymorphic type.

  2. We can gain clarity by stripping things down to the fundamentals of Hindley-Milner (lambda calculus with "let"), which both Haskell and OCaml are based on. When we reduce to this core system there is, of course, no such thing as pattern matching, but we can still draw parallels. Between "let" and "lambda", case expr1 of f -> expr2 is much closer to let f = expr1 in expr2 than to (lambda f. expr2) expr1. "Case", like "let", syntactically restricts f to be bound to expr1, whereas a function lambda f. expr2 doesn't know what f will be bound to, since there is no restriction on where in the program it will be called. This is precisely the reason why let-bound variables are generalized in Hindley-Milner while lambda-bound variables are not. The same reasoning that allows let-bound variables to be generalized suggests that variables introduced by pattern matching could be generalized too (see the sketch after this list).
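
To make that parallel concrete in Haskell itself, here is a small sketch with the rejected form left as a comment; it fails with the same Bool/Char mismatch as the case expression, while the let form shown above is accepted.

    -- Rejected, with the same Bool/Char mismatch as the case expression:
    -- f is lambda-bound here, so it gets a single monotype inside the body.
    -- viaLambda :: (Bool, Char)
    -- viaLambda = (\f -> (f True, f 'c')) (\x -> x)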

The examples above are minimal for clarity, so they only show a trivial pattern f in the pattern matching, but all the same logic extends to arbitrarily complex patterns like Just (a:b:(x,y):_), which can introduce multiple variables that would all be generalized.

Is my analysis correct? In Haskell specifically - recognizing that it's not just plain Hindley-Milner and not OCaml - why don't we generalize the type of f in the first example?

Was this an explicit language design decision, and if so, what were the reasons? (I note that some in the community argue that not even "let" should be generalized, as in the "Let Should Not Be Generalised" paper, but I would imagine the design decision pre-dates that work.)

If variables introduced in a pattern were made polymorphic similar to "let", would that break compatibility with other aspects of Haskell in a significant way?

asked Jan 05 '19 by echinodermata

2 Answers

If we assign a polymorphic type (forall x. t) to a case scrutinee, then it matches no non-trivial pattern, so there's no point in having the case expression at all.

Could we generalize in some other useful way? Not really, because of GHC's lack of support for "impredicative" instantiation. In your example of Just (a:b:(x,y):_), none of the bound variables can have a polymorphic type, since Maybe, (,), and [] cannot be instantiated with such types.

One thing works, as mentioned in the comments: data types with polymorphic fields, such as data Endo = Endo (forall a. a -> a). However, type checking for polymorphic fields doesn't technically involve a generalization step, nor does it behave like let-generalization.
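
A minimal sketch of that workaround (assuming RankNTypes), contrasted with the impredicative instantiation that GHC rejects:

    {-# LANGUAGE RankNTypes #-}

    -- Rejected without ImpredicativeTypes: Maybe's type argument cannot be
    -- instantiated with the polymorphic type forall a. a -> a.
    -- ids :: Maybe (forall a. a -> a)
    -- ids = Just (\x -> x)

    -- Accepted: the polymorphism is wrapped in a constructor field.
    data Endo = Endo (forall a. a -> a)

    -- Matching on Endo binds f at the polymorphic field type,
    -- so both uses typecheck.
    both :: (Bool, Char)
    both = case Endo (\x -> x) of
      Endo f -> (f True, f 'c')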

In principle, generalization could be performed at many points, for example even at arbitrary function arguments (e.g. in f (\x -> x)). However, too much generalization clogs up type inference by introducing intractable higher-rank types; it can also be understood as eliminating useful type dependencies between different parts of the program by removing unsolved metavariables. Although there are systems which can handle higher-rank inference much better than GHC, most notably MLF, they are also much more complicated and haven't seen much practical use. I personally prefer not to have silent let-generalization at all.
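
As a small illustration (again assuming RankNTypes), the argument in f (\x -> x) stays polymorphic only if the higher-rank type is written by hand; GHC does not infer such types:

    {-# LANGUAGE RankNTypes #-}

    -- The explicit rank-2 signature is doing the work here; without it,
    -- inference would assign the argument a single monotype.
    applyBoth :: (forall a. a -> a) -> (Bool, Char)
    applyBoth f = (f True, f 'c')

    usage :: (Bool, Char)
    usage = applyBoth (\x -> x)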

answered by András Kovács

A first issue is that with type classes, generalization is not always free. Consider show :: forall a. Show a => a -> String and this expression:

case show of
  f -> ...

If you generalize f to f :: forall a. Show a => a -> String, then GHC will pass a Show dictionary at every call of f, instead of once at the single occurrence of show. If there are multiple calls, all at the same type, this duplicates work compared to not generalizing.
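
A rough sketch of the difference, modeling the Show dictionary as an ordinary record (the names below are made up for illustration, they are not GHC's own):

    -- A hand-rolled stand-in for the Show class dictionary (hypothetical).
    newtype ShowDict a = ShowDict { showD :: a -> String }

    showIntDict :: ShowDict Int
    showIntDict = ShowDict show

    -- Monomorphic f: the dictionary is applied once, at the binding site.
    monoUse :: String
    monoUse =
      let f = showD showIntDict        -- f :: Int -> String
      in f 1 ++ f 2

    -- Generalized f: the dictionary has to be passed at every call site,
    -- even though both calls are at the same type.
    polyUse :: String
    polyUse =
      let f :: ShowDict a -> a -> String
          f = showD
      in f showIntDict 1 ++ f showIntDict 2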

Generalizing pattern-bound variables is also not a conservative extension of the current type inference algorithm once type classes are involved: it can cause existing programs to no longer typecheck. For example,

    case show of
      f -> f () ++ f mempty

By not generalizing f, we can infer that mempty has type (). On the other hand, generalizing f to forall a. Show a => a -> String would lose that connection, and the type of mempty in that expression would become ambiguous.
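
A sketch of both sides, with the generalized version left as a comment since GHC rejects it as ambiguous:

    -- Accepted today: the case-bound f has a single monotype, so the call
    -- f () fixes the type of mempty to ().
    ok :: String
    ok = case show of
      f -> f () ++ f mempty

    -- If f were generalized, each call would be instantiated independently,
    -- and nothing would determine the type of mempty; GHC reports an
    -- ambiguous type:
    --
    -- bad :: String
    -- bad = let f :: Show a => a -> String
    --           f = show
    --       in f () ++ f mempty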

It is true though that these are minor issues, and maybe things would be mostly fine with some monomorphism restrictions, even if not entirely backwards compatible.

answered by Li-yao Xia