Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Haskell: Why does pattern matching work for types without being instances of equality?

I was wondering how pattern matching in Haskell works. I am aware of this thread, but could not quite understand the answers therein.

  • The answers state that types are matched by Boolean expressions, but how is this possible?
  • The other thing I got was pattern matching is more general than (==) and Eq is implemented by use of pattern matching.

Can you tell me why match and match3 are working even if I omit deriving (Eq) in the following code snippet, (it is clear why match2 is failing)?

data MyType = TypeA | TypeB
            deriving (Eq)

match :: MyType -> String
match TypeA = "this is type A"
match TypeB = "this is type B"

match2 :: MyType -> String
match2 a | a == TypeA = "this is type A matched by equality"
         | a == TypeB = "this is type B matched by equality"
         | otherwise = "this is neither type A nor type B"

match3 :: MyType -> String
match3 a = case a of TypeA -> "this is type A matched by case expression"
                     TypeB -> "this is type B matched by case expression"

main :: IO ()
main = do
    (print . match) TypeA
    (print . match) TypeB
    (print . match2) TypeA
    (print . match2) TypeB
    (print . match3) TypeA
    (print . match3) TypeB
like image 344
epsilonhalbe Avatar asked Jun 18 '12 19:06

epsilonhalbe


People also ask

How does Haskell pattern matching work?

Pattern matching consists of specifying patterns to which some data should conform and then checking to see if it does and deconstructing the data according to those patterns. When defining functions, you can define separate function bodies for different patterns.

Does Haskell have pattern matching?

We use pattern matching in Haskell to simplify our codes by identifying specific types of expression. We can also use if-else as an alternative to pattern matching. Pattern matching can also be seen as a kind of dynamic polymorphism where, based on the parameter list, different methods can be executed.

What is the use of pattern matching?

Pattern matching is used to determine whether source files of high-level languages are syntactically correct. It is also used to find and replace a matching pattern in a text or code with another text/code. Any application that supports search functionality uses pattern matching in one way or another.

Why is Xs used in Haskell?

x:xs is simply another notation for lists in Haskell, where x represents the head of the list and xs (pronounced exes) represents the tail. So, again for the list [1, 2, 3], x would be 1, whilst xs would be [2, 3].


2 Answers

I just want to point out that data types and pattern matching (to a first approximation) are merely useful but redundant language features, that can be implemented purely using lambda calculus. If you understand how to implement them in lambda calculus, you can understand why they don't need Eq to implement pattern matching.

Implementing data types in lambda calculus is known as "Church-encoding" them (after Alonzo Church, who demonstrated how to do this). Church-encoded functions are also known as "Continuation-passing style".

It's called "continuation-passing style" because instead of providing the value, you provide a function that will process the value. So for example, instead of giving a user an Int, I could instead give them a value of the following type:

type IndirectInt = forall x . (Int -> x) -> x

The above type is "isomorphic" to an Int. "Isomorphic" is just a fancy way of saying that we can convert any IndirectInt into an Int:

fw :: IndirectInt -> Int
fw indirect = indirect id

... and we can convert any Int into an IndirectInt:

bw :: Int -> IndirectInt
bw int = \f -> f int

... such that:

fw . bw = id -- Exercise: Prove this
bw . fw = id -- Exercise: Prove this

Using continuation-passing style, we can convert any data type into a lambda-calculus term. Let's start with a simple type like:

data Either a b = Left a | Right b

In continuation-passing style, this would become:

type IndirectEither a b = forall x . (Either a b -> x) -> x

But Alonzo Church was smart and noticed that for any type with multiple constructors, we can just provide a separate function for each constructor. So in the case of the above type, instead of providing a function of type (Either a b -> x), we can instead provide two separate functions, one for the a, and one for the b, and that would be just as good:

type IndirectEither a b = forall x . (a -> x) -> (b -> x) -> x
-- Exercise: Prove that this definition is isomorphic to the previous one

What about a type like Bool where the constructors have no arguments? Well, Bool is isomorphic to Either () () (Exercise: Prove this), so we can just encode it as:

type IndirectBool = forall x . (() -> x) -> (() -> x) -> x

And () -> x is just isomorphic to x (Exercise: Prove this), so we can further rewrite it as:

type IndirectBool = forall x . x -> x -> x

There are only two functions that can have the above type:

true :: a -> a -> a
true a _ = a

false :: a -> a -> a
false _ a = a

Because of the isomorphism, we can guarantee that all Church encodings will have as many implementations as there were possible values of the original data type. So it's no coincidence that there are exactly two functions that inhabit IndirectBool, just like there are exactly two constructors for Bool.

But how do we pattern-match on our IndirectBool? For example, with an ordinary Bool, we could just write:

expression1 :: a
expression2 :: a

case someBool of
    True  -> expression1
    False -> expression2

Well, with our IndirectBool it already comes with the tools to deconstruct itself. We would just write:

myIndirectBool expression1 expression2

Notice that if myIndirectBool is true, it will pick the first expression, and if it is false it will pick the second expression, just as if we had somehow pattern-matched on its value.

Let's try to do the same thing with an IndirectEither. Using an ordinary Either, we'd write:

f :: a -> c
g :: b -> c

case someEither of
    Left  a -> f a
    Right b -> g b

With the IndirectEither, we'd just write:

someIndirectEither f g

In short, when we write types in continuation-passing style, the continuations are like the case statements of a case construct, so all we are doing is passing each different case statement as arguments to the function.

This is the reason you don't need any sense of Eq to pattern-match on a type. In lambda calculus, the type decides what it is "equal" to, simply by defining which argument it selects out of the ones provided to it.

So if I'm a true, I prove I am "equal" to true by always selecting my first argument. If I'm a false, I prove I am "equal" to false by always selecting my second argument. In short, constructor "equality" boils down to "positional equality", which is always defined in a lambda calculus, and if we can distinguish one parameter as the "first" and another as the "second", that's all we need to have the ability to "compare" constructors.

like image 145
Gabriella Gonzalez Avatar answered Sep 23 '22 08:09

Gabriella Gonzalez


First of all, match and match3 are actually exactly the same (ignoring the different strings): pattern matching in functions is desugared to a case statement.

Next, pattern matching works on constructors rather than arbitrary values. Pattern matching is built into the language and does not depend on boolean expressions--it just acts on the constructors directly. This is most evident with more complex matches that include some matchable terms:

f :: MyType -> Int
f (A a) = a + 1
f (B a b) = a + b

How would you rewrite these patterns into boolean expressions? You can't (without knowing anything else about MyType).

Instead, the pattern matching just goes by constructor. Each pattern has to be headed by a constructor--you can't write a pattern like f (a b c) in Haskell. Then, when the function gets a value, it can look at the value's constructor and jump to the appropriate cases immediately. This is the way it has to work for more complicated patterns (like A a), and is also the way it works for the simple patterns you used.

Since pattern matching just works by going to the appropriate constructor, it does not depend on Eq at all. Not only do you not have to have an Eq instance to pattern match, but having one also does not change how pattern matching behaves. For example, try this:

data MyType = TypeA | TypeB | TypeC

instance Eq MyType where 
  TypeA == TypeA = True
  TypeB == TypeC = True
  TypeC == TypeB = True
  _ == _         = False

match :: MyType → String
match TypeA = "this is type A"
match TypeB = "this is type B"
match TypeC = "this is type C"

match2 :: MyType → String
match2 a | a == TypeA = "this is type A matched by equality"
         | a == TypeC = "this is type B matched by equality"
         | a == TypeB = "this is type C matched by equality"
         | otherwise = "this is neither type A nor type B"

Now you've defined equality such that TypeB is equal to TypeC but not to itself. (In real life, you should ensure that equality behaves reasonably and follows the reflexive property, but this is a toy example.) Now, if you use pattern matching, TypeB still matches TypeB and TypeC matches TypeC. But if you use your guard expressions, TypeB actually matches TypeC and TypeC matches TypeB. TypeA is unchanged between the two.

Moreover, note how the Eq instance was defined using pattern matching. When you use a deriving clause, it gets defined in a similar way with code generated at compile time. So pattern matching is more fundamental than Eq--it is part of the language where Eq is just part of the standard library.

In summary: pattern matching is built into the language and works by comparing the constructor and then recursively matching on the rest of the value. Equality is usually implemented in terms of pattern matching and compares the whole value rather than just the constructor.

like image 43
Tikhon Jelvis Avatar answered Sep 21 '22 08:09

Tikhon Jelvis