Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does eqT returning Maybe (a :~: b) work better than it returning Bool?

Tags:

haskell

ghc

gadt

I made a variant of eqT that would allow me work with the result like any other Bool to write something like eqT' @a @T1 || eqT' @a @T2. However, while that worked well in some cases, I found that I couldn't replace every use of eqT with it. For example, I wanted to use it to write a variant of readMaybe that would just be Just when it was supposed to return a String. While using eqT allowed me to keep the type as String -> Maybe a, using eqT' requires the type to be String -> Maybe String. Why is that? I know that I can do this via other means, but I want to know why this doesn't work. I suppose this has something to do with special treatment in case expressions for GADTs (a :~: b being a GADT), but what is that special treatment?

Here is some example code of what I'm talking about:

{-# LANGUAGE GADTs #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE AllowAmbiguousTypes #-}

import Data.Typeable
import Text.Read

eqT' :: forall a b. (Typeable a, Typeable b) => Bool
eqT' = case eqT @a @b of
    Just Refl -> True
    _ -> False

readMaybeWithBadType1 :: forall a. (Typeable a, Read a) => String -> Maybe String
readMaybeWithBadType1 = if eqT' @a @String
    then Just
    else readMaybe

readMaybeWithBadType2 :: forall a. (Typeable a, Read a) => String -> Maybe String
readMaybeWithBadType2 = case eqT' @a @String of
    True -> Just
    False -> readMaybe

readMaybeWithGoodType :: forall a. (Typeable a, Read a) => String -> Maybe a
readMaybeWithGoodType = case eqT @a @String of
    Just Refl -> Just
    _ -> readMaybe

main :: IO ()
main = return ()

Changing the type of either readMaybeWithBadType to return Maybe a results in ghc complaining:

u.hs:16:14: error:
    • Couldn't match type ‘a’ with ‘String’
      ‘a’ is a rigid type variable bound by
        the type signature for:
          readMaybeWithBadType1 :: forall a.
                                   (Typeable a, Read a) =>
                                   String -> Maybe a
        at u.hs:14:5-80
      Expected type: String -> Maybe a
        Actual type: a -> Maybe a
    • In the expression: Just
      In the expression: if eqT' @a @String then Just else readMaybe
      In an equation for ‘readMaybeWithBadType1’:
          readMaybeWithBadType1 = if eqT' @a @String then Just else readMaybe
    • Relevant bindings include
        readMaybeWithBadType1 :: String -> Maybe a (bound at u.hs:15:5)
   |
16 |         then Just
   |              ^^^^

u.hs:21:17: error:
    • Couldn't match type ‘a’ with ‘String’
      ‘a’ is a rigid type variable bound by
        the type signature for:
          readMaybeWithBadType2 :: forall a.
                                   (Typeable a, Read a) =>
                                   String -> Maybe a
        at u.hs:19:5-80
      Expected type: String -> Maybe a
        Actual type: a -> Maybe a
    • In the expression: Just
      In a case alternative: True -> Just
      In the expression:
        case eqT' @a @String of
          True -> Just
          False -> readMaybe
    • Relevant bindings include
        readMaybeWithBadType2 :: String -> Maybe a (bound at u.hs:20:5)
   |
21 |         True -> Just
   |                 ^^^^

I kind of get why it's complaining, but I don't see why it isn't a problem in readMaybeWithGoodType.

like image 896
JoL Avatar asked Oct 25 '18 19:10

JoL


3 Answers

Essentially, this is a case of GADT vs non-GADT elimination.

When we want to use a value x :: T where T is some algebraic data type, we resort to pattern matching (AKA "elimination")

case x of
  K1 ... -> e1
  K2 ... -> e2
  ...

where the Ki cover all the possible constructors.

Sometimes, instead of using case we use other forms of pattern matching (e.g. defining equations) but that's irrelevant. Also, if then else is completely equivalent to case of True -> .. ; False -> ..., so there's no need to discuss this.

Now, the crucial point is that the type T we are eliminating could be a GADT or not.

If it is not a GADT, then all the branches e1,e2,... are type checked, and they are required to have the same type. This is done without exploiting any additional type information.

If we write case eqT' @a @b of ... or if eqT' @a @b then ... we are eliminating type Bool which is not a GADT. No information is obtained about a,b by the type checker, and the two branches are checked to have the same type (which may fail).

Instead, if T is a GADT, the type checker exploits further type information. In particular, if we have case x :: a :~: b of Refl -> e the type checker learns a~b, and exploits that when type checking e.

If we have multiple branches like

case x :: Maybe (a :~: b) of
   Just Refl -> e1
   Nothing   -> e2

then a~b is only used for e1, as intuition suggests.

If you want a custom eqT', I suggest you try this:

data Eq a b where
   Equal   :: Eq a a
   Unknown :: Eq a b

eqT' :: forall a b. (Typeable a, Typeable b) => Eq a b
eqT' = case eqT @a @b of
   Just Refl -> Equal
   Nothing   -> Unknown

readMaybe3 :: forall a. (Typeable a, Read a) => String -> Maybe String
readMaybe3 = case eqT' @a @String of
    Equal -> Just
    Unknown -> readMaybe

The trick is eliminating a GADT which provides the right information about the type variables at hand, like in this case.

If you want to go deeper, you can check out languages with full dependent types (Coq, Idris, Agda, ...) where we find a similar behavior in dependent vs non-dependent elimination. These languages are a bit harder than Haskell+GADTs -- I have to warn you. I'll only add that dependent elimination was mysterious at first to me. After I understood the general form of pattern matching in Coq, everything started to make a lot of sense.

like image 125
chi Avatar answered Sep 29 '22 13:09

chi


Thanks to bergey and chi, I think I understand the series of steps that caused GHC to return that error to me. They're both good answers, but I think a good deal of my misunderstanding was not understanding the concrete steps Haskell takes to type check and how it relates, in this case, to GADT pattern matching. I'm going to write an answer that describes this to the best of my understanding.

So, to start, one of the things that makes GADTs GADTs is that you can define a sum-type where each option can be of a different type that's more specific than the type given in the head of the data declaration. That makes the following possible:

data a :~: b where
  Refl :: a :~: a

So here we only have one constructor, Refl, which is an a :~: b, but more specifically, this particular constructor (albeit the only one) results in an a :~: a. If we compose that with Maybe to get the type Maybe (a :~: b), then we have 2 possible values: Just Refl :: Maybe (a :~: a) and Nothing :: Maybe (a :~: b). That's how the type carries the information on type equality by pattern matching.

Now, to make GADTs work with pattern matching, something cool must be done. That's that the expressions matched with each pattern may be more specialized than the whole of the pattern matching expression (e.g. case expressions). Using the added type refinement included in a GADT constructor to further specialize the type required of the matching expression is the special treatment Haskell does for GADTs in pattern matching. So when we do:

readMaybeWithGoodType :: forall a. (Typeable a, Read a) => String -> Maybe a
readMaybeWithGoodType = case eqT @a @String of
    Just Refl -> Just
    _ -> readMaybe

eqT is Maybe (a :~: b), eqT @a @String and the matched _ is (Typeable a, Read a) => Maybe (a :~: String), but Just Refl is Maybe (String :~: String).

Haskell will require the whole case expression to be a type superset of (Typeable a, Read a) => String -> Maybe a. The _ match which is just readMaybe is type Read a => String -> Maybe a which is a superset. However, Just is type a -> Maybe a, which is not a superset, because the case expression should include things like String -> Maybe Int but Just can't return that because it would need for String ~ Int. This is what was happening when matching with Bool. GHC told that it couldn't match the Maybe String Just would return with the more general Read a => Maybe a that was required of it.

This is where pattern matching on a constructor that includes this type equality information is important. By matching on Just Refl :: Maybe (String :~: String), Haskell won't need that matching expression to be of a type superset of (Typeable a, Read a) => String -> Maybe a, it just needs it to be a superset of String -> Maybe String (a subset type of the original required type), which it is by being a -> Maybe a.

like image 22
JoL Avatar answered Sep 29 '22 12:09

JoL


You have discovered what the documentation describes as

To use this equality in practice, pattern-match on the a :~: b to get out the Refl constructor; in the body of the pattern-match, the compiler knows that a ~ b.

In most case matches on Maybe a, in the Just branch we have an additional value if type a that we can use. Here too, in the Just branch of readMaybeWithGoodType, there is an additional value. Refl isn't very interesting at the term level, but at the type level it is. Here it conveys to GHC a fact that we know by inspection - that this branch is reachable if and only if a is String.

You are right that other GADT constructors can also bring type information (typically type class constraints on their arguments) into scope.

like image 38
bergey Avatar answered Sep 29 '22 13:09

bergey