Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pattern matching on rank-2 type

Tags:

haskell

ghc

I'm trying to understand why one version of this code compiles, and one version does not.

{-# LANGUAGE RankNTypes, FlexibleContexts #-}

module Foo where

import Data.Vector.Generic.Mutable as M
import Data.Vector.Generic as V
import Control.Monad.ST
import Control.Monad.Primitive

data DimFun v m r = 
  DimFun {dim::Int, func :: v (PrimState m) r -> m ()}

runFun1 :: (Vector v r, MVector (Mutable v) r) => 
  (forall m . (PrimMonad m) => DimFun (Mutable v) m r) -> v r -> v r
runFun1 (DimFun dim t) x | V.length x == dim = runST $ do
  y <- thaw x
  t y
  unsafeFreeze y

runFun2 :: (Vector v r, MVector (Mutable v) r) => 
  (forall m . (PrimMonad m) => DimFun (Mutable v) m r) -> v r -> v r
runFun2 t x = runST $ do
  y <- thaw x
  evalFun t y
  unsafeFreeze y

evalFun :: (PrimMonad m, MVector v r) => DimFun v m r -> v (PrimState m) r -> m ()
evalFun (DimFun dim f) y | dim == M.length y = f y

runFun2 compiles fine (GHC-7.8.2), but runFun1 results in errors:

Could not deduce (PrimMonad m0) arising from a pattern
from the context (Vector v r, MVector (Mutable v) r)
  bound by the type signature for
             tfb :: (Vector v r, MVector (Mutable v) r) =>
                    (forall (m :: * -> *). PrimMonad m => TensorFunc m r) -> v r -> v r
  at Testing/Foo.hs:(26,8)-(28,15)
The type variable ‘m0’ is ambiguous
Note: there are several potential instances:
  instance PrimMonad IO -- Defined in ‘Control.Monad.Primitive’
  instance PrimMonad (ST s) -- Defined in ‘Control.Monad.Primitive’
In the pattern: TensorFunc _ f
In an equation for ‘tfb’:
    tfb (TensorFunc _ f) x
      = runST
        $ do { y <- thaw x;
               f y;
               unsafeFreeze y }

Couldn't match type ‘m0’ with ‘ST s’
  because type variable ‘s’ would escape its scope
This (rigid, skolem) type variable is bound by
  a type expected by the context: ST s (v r)
  at Testing/Foo.hs:(29,26)-(32,18)
Expected type: ST s ()
  Actual type: m0 ()
Relevant bindings include
  y :: Mutable v s r (bound at Testing/Foo.hs:30:3)
  f :: forall (v :: * -> * -> *).
       MVector v r =>
       v (PrimState m0) r -> m0 ()
    (bound at Testing/Foo.hs:29:19)
In a stmt of a 'do' block: f y
In the second argument of ‘($)’, namely
  ‘do { y <- thaw x;
        f y;
        unsafeFreeze y }’

Could not deduce (s ~ PrimState m0)
from the context (Vector v r, MVector (Mutable v) r)
  bound by the type signature for
             tfb :: (Vector v r, MVector (Mutable v) r) =>
                    (forall (m :: * -> *). PrimMonad m => TensorFunc m r) -> v r -> v r
  at Testing/Foo.hs:(26,8)-(28,15)
  ‘s’ is a rigid type variable bound by
      a type expected by the context: ST s (v r) at Testing/Foo.hs:29:26
Expected type: Mutable v (PrimState m0) r
  Actual type: Mutable v s r
Relevant bindings include
  y :: Mutable v s r (bound at Testing/Foo.hs:30:3)
  f :: forall (v :: * -> * -> *).
       MVector v r =>
       v (PrimState m0) r -> m0 ()
    (bound at Testing/Foo.hs:29:19)
In the first argument of ‘f’, namely ‘y’
In a stmt of a 'do' block: f y

I'm pretty sure the rank-2 type is to blame, possibly caused by a monomorphism restriction. However, as suggested in a previous question of mine, I enabled -XNoMonomorphismRestriction, but got the same error.

What is the difference between these seemingly identical code snippets?

like image 684
crockeea Avatar asked Jul 14 '14 19:07

crockeea


2 Answers

I think that having a rough mental model of the type-level plumbing involved here is essential, so I'm going go talk about "implicit things" in a bit more detail, and scrutinize your problem only after that. Readers only interested in the direct solution to the question may skip to the "Pattern matching on polymorhpic values" subsection and the end.

1. Implicit function arguments

Type arguments

GHC compiles Haskell to a small intermediate language called Core, which is essentially a rank-n polymorphic typed lambda calculus called System F (plus some extensions). Below I am going use Haskell alongside a notation somewhat resembling Core; I hope it's not overly confusing.

In Core, polymorphic functions are functions which take types as additional arguments, and arguments further down the line can refer to those types or have those types:

-- in Haskell
const :: forall (a :: *) (b :: *). a -> b -> a
const x y = x

-- in pseudo-Core
const' :: (a :: *) -> (b :: *) -> a -> b -> a
const' a b x y = x 

This means that we must also supply type arguments to these functions whenever we want to use them. In Haskell type inference usually figures out the type arguments and supplies them automatically, but if we look at the Core output (for example, see this introduction for how to do that), type arguments and applications are visible everywhere. Building a mental model of this makes figuring out higher-rank code a whole lot easier:

-- Haskell
poly :: (forall a. a -> a) -> b -> (Int, b)
poly f x = (f 0, f x)

-- pseudo-Core
poly' :: (b :: *) -> ((a :: *) -> a -> a) -> b -> (Int, b)
poly' b f x = (f Int 0, f b x)

And it makes clear why some things don't typecheck:

wrong :: (a -> a) -> (Int, Bool)
wrong f = (f 0, f True)

wrong' :: (a :: *) -> (a -> a) -> (Int, Bool)
wrong' a f = (f ?, f ?) -- f takes an "a", not Int or Bool. 

Class constraint arguments

-- Haskell
show :: forall a. Show a => a -> String
show x = show x

-- pseudo-Core
show' :: (a :: *) -> Show a -> a -> String
show' a (ShowDict showa) x = showa x 

What is ShowDict and Show a here? ShowDict is just a Haskell record containing a show instance, and GHC generates such records for each instance of a class. Show a is just the type of this instance record:

-- We translate classes to a record type:
class Show a where show :: a -> string

data Show a = ShowDict (show :: a -> String)

-- And translate instances to concrete records of the class type:
instance Show () where show () = "()"

showUnit :: Show ()
showUnit = ShowDict (\() -> "()")

For example, whenever we want to apply show, the compiler has to search the scope in order to find a suitable type argument and an instance dictionary for that type. Note that while instances are always top level, quite often in polymorphic functions the instances are passed in as arguments:

data Foo = Foo

-- instance Show Foo where show _ = "Foo"
showFoo :: Show Foo
showFoo = ShowDict (\_ -> "Foo")

-- The compiler fills in an instance from top level
fooStr :: String
fooStr = show' Foo showFoo Foo 

polyShow :: (Show a, Show b) => a -> b -> String
polyShow a b = show a ++ show b

-- Here we get the instances as arguments (also, note how (++) also takes an extra
-- type argument, since (++) :: forall a. [a] -> [a] -> [a])
polyShow' :: (a :: *) -> (b :: *) -> Show a -> Show b -> a -> b -> String
polyShow' a b (ShowDict showa) (ShowDict showb) a b -> (++) Char (showa a) (showb b) 

Pattern matching on polymorphic values

In Haskell, pattern matching on functions doesn't make sense. Polymorphic values can be also viewed as functions, but we can pattern match on them, just like in OP's erroneous runfun1 example. However, all the implicit arguments must be inferable in the scope, or else the mere act of pattern matching is a type error:

import Data.Monoid

-- it's a type error even if we don't use "a" or "n".
-- foo :: (forall a. Monoid a => (a, Int)) -> Int
-- foo (a, n) = 0 

foo :: ((a :: *) -> Monoid a -> (a, Int)) -> Int
foo f = ? -- What are we going to apply f to?

In other words, by pattern matching on a polymorphic value, we assert that all implicit arguments have been already applied. In the case of foo here, although there isn't a syntax for type application in Haskell, we can sprinkle around type annotations:

{-# LANGUAGE ScopedTypeVariables, RankNTypes #-}

foo :: (forall a. Monoid a => (a, Int)) -> Int
foo x = case (x :: (String, Int)) of (_, n) -> n

-- or alternatively
foo ((_ :: String), n) = n 

Again, pseudo-Core makes the situation clearer:

foo :: ((a :: *) -> Monoid a -> (a, Int)) -> Int
foo f = case f String monoidString of (_ , n) -> n 

Here monoidString is some available Monoid instance of String.

2. Implicit data fields

Implicit data fields usually correspond to the notion of "existential types" in Haskell. In a sense, they are dual to implicit function arguments with respect to term obligations:

  • When we construct functions, the implicit arguments are available in the function body.
  • When we apply functions, we have extra obligations to fulfill.
  • When we construct data with implicit fields, we must supply those extra fields.
  • When we pattern match on data, the implicit fields also come into scope.

Standard example:

{-# LANGUAGE GADTs #-}

data Showy where
    Showy :: forall a. Show a => a -> Showy

-- pseudo-Core
data Showy where
    Showy :: (a :: *) -> Show a -> a -> Showy

-- when constructing "Showy", "Show a" must be also available:
someShowy :: Showy
someShowy = Showy (300 :: Int)

-- in pseudo-Core
someShowy' = Showy Int showInt 300 

-- When pattern matching on "Showy", we get an instance in scope too
showShowy :: Showy -> String
showShowy (Showy x) = show x 

showShowy' :: Showy -> String
showShowy' (Showy a showa x) = showa x

3. Taking a look at OP's example

We have the function

runFun1 :: (Vector v r, MVector (Mutable v) r) => 
  (forall m . (PrimMonad m) => DimFun (Mutable v) m r) -> v r -> v r
runFun1 dfun@(DimFun dim t) x | V.length x == dim = runST $ do
    y <- thaw x
    t y
    unsafeFreeze y

Remember that pattern matching on polymorphic values asserts that all implicit arguments are available in the scope. Except that here, at the point of pattern matching there is no m at all in scope, let alone a PrimMonad instance for it.

With GHC 7.8.x it's is good practice to use type holes liberally:

runFun1 :: (Vector v r, MVector (Mutable v) r) => 
  (forall m . (PrimMonad m) => DimFun (Mutable v) m r) -> v r -> v r
runFun1 (DimFun dim t) x | V.length x == dim = _

Now GHC will duly display the type of the hole, and also the types of the variables in the context. We can see that t has type Mutable v (PrimState m0) r -> m0 (), and we also see that m0 is not listed as bound anywhere. Indeed, it is a notorious "ambiguous" type variable conjured up by GHC as a placeholder.

So, why don't we try manually supplying the arguments, just as in the prior example with the Monoid instance? We know that we will use t inside an ST action, so we can try fixing m as ST s and GHC automatically applies the PrimMonad instance for us:

runFun1 :: forall v r. (Vector v r, MVector (Mutable v) r) => 
  (forall m . (PrimMonad m) => DimFun (Mutable v) m r) -> v r -> v r
runFun1 (DimFun dim (t :: Mutable v s r -> ST s ())) x 
    | V.length x == dim = runST $ do 
        y <- thaw x
        t y
        unsafeFreeze y

... except it doesn't work and we get the error "Couldn't match type ‘s’ with ‘s1’ because type variable ‘s1’ would escape its scope".

It turns out - comes as no surprise - that we've forgotten about yet another implicit argument. Recall the type of runST:

runST :: (forall s. ST s a) -> a

We can imagine that runST takes a function of type ((s :: PrimState ST) -> ST s a), and then our code looks like this:

runST $ \s -> do
    y <- thaw x   -- y :: Mutable v s r
    t y           -- error: "t" takes a "Mutable v s r" with a different "s". 
    unsafeFreeze y 

The s in t's argument type is silently introduced at the outermost scope:

runFun1 :: forall v s r. ...

And thus the two s-es are distinct.

A possible solution is to pattern match on the DimFun argument inside the ST action. There, the correct s is in scope, and GHC can supply ST s as m:

runFun1 :: forall v r. (Vector v r, MVector (Mutable v) r) => 
    (forall m . PrimMonad m => DimFun (Mutable v) m r) -> v r -> v r
runFun1 dimfun x = runST $ do
    y <- thaw x
    case dimfun of
        DimFun dim t | dim == M.length y -> t y
    unsafeFreeze y

With some parameters made explicit:

runST $ \s -> do
    y <- thaw x
    case dimfun (ST s) primMonadST of
         DimFun dim t | dim == M.length y -> t y
    unsafeFreeze y 

As an exercise, let's convert all of the function to pseudo-Core (but let's not desugar the do syntax, because that would be way too ugly):

-- the full types of the functions involved, for reference
thaw :: forall m v a. (PrimMonad m, V.Vector v a) => v a -> m (V.Mutable v (PrimState m) a)
runST :: forall a. (forall s. ST s a) -> a
unsafeFreeze :: forall m v a. (PrimMonad m, Vector v a) => Mutable v (PrimState m) a -> v a 
M.length :: forall v s a. MVector v s a -> Int
(==) :: forall a. Eq a => a -> a -> Bool

runFun1 :: 
    (v :: * -> *) -> (r :: *) 
    -> Vector v r -> MVector (Mutable v) r
    -> ((m :: (* -> *)) -> PrimMonad m -> DimFun (Mutable v) m r)
    -> v r -> v r
runFun1 v r vecInstance mvecInstance dimfun x = runST r $ \s -> do
    y <- thaw (ST s) v r primMonadST vecInstance x
    case dimFun (ST s) primMonadST of
        DimFun dim t | (==) Int eqInt dim (M.length v s r y) -> t y
    unsafeFreeze (ST s) v r primMonadST vecInstance y

That was a mouthful.

Now we are well-equipped to explain why runFun2 worked:

runFun2 :: (Vector v r, MVector (Mutable v) r) => 
  (forall m . (PrimMonad m) => DimFun (Mutable v) m r) -> v r -> v r
runFun2 t x = runST $ do
  y <- thaw x
  evalFun t y
  unsafeFreeze y

evalFun :: (PrimMonad m, MVector v r) => DimFun v m r -> v (PrimState m) r -> m ()
evalFun (DimFun dim f) y | dim == M.length y = f y

evalFun is just a polymorphic function that gets called in the right place (we ultimately pattern match on t in the right place), where the correct ST s is available as the m argument.

As a type system gets more sophisticated, pattern matching becomes a progressively more serious affair, with far-reaching consequences and non-trivial requirements. At the end of the spectrum you find full-dependent languages and proof assistants such as Agda, Idris or Coq, where pattern matching on a piece of data can mean accepting an arbitrary logical proposition as true in a certain branch of your program.

like image 155
András Kovács Avatar answered Oct 05 '22 14:10

András Kovács


Though @AndrasKovacs gave a great answer, I think it is worth pointing out how to avoid this nastiness altogether. This answer to a related question by me shows how the "correct" definition for DimFun makes all of the rank-2 stuff go away.

By defining DimFun as

data DimFun v r = 
  DimFun {dim::Int, func :: forall s . (PrimMonad s) => v (PrimState s) r -> s ()}

runFun1 becomes:

runFun1 :: (Vector v r)
        => DimFun (Mutable v) r -> v r -> v r
runFun1 (DimFun dim t) x | dim == V.length x = runST $ do
  y <- thaw x
  t y
  unsafeFreeze y

and compiles without issue.

like image 22
crockeea Avatar answered Oct 05 '22 15:10

crockeea