I was wondering how pattern matching in Haskell works. I am aware of this thread, but could not quite understand the answers therein. <ul> <li>The answers state that types are matched by Boolean expressions, but how is this possible?</li> <li>The other thing I got was pattern matching is more general than <code>(==)</code> and <code>Eq</code> is implemented by use of pattern matching.</li> </ul> Can you tell me why <code>match</code> and <code>match3</code> are working even if I omit <code>deriving (Eq)</code> in the following code snippet, (it is clear why <code>match2</code> is failing)? <pre class="prettyprint"><code>data MyType = TypeA | TypeB deriving (Eq) match :: MyType -> String match TypeA = "this is type A" match TypeB = "this is type B" match2 :: MyType -> String match2 a | a == TypeA = "this is type A matched by equality" | a == TypeB = "this is type B matched by equality" | otherwise = "this is neither type A nor type B" match3 :: MyType -> String match3 a = case a of TypeA -> "this is type A matched by case expression" TypeB -> "this is type B matched by case expression" main :: IO () main = do (print . match) TypeA (print . match) TypeB (print . match2) TypeA (print . match2) TypeB (print . match3) TypeA (print . match3) TypeB </code></pre>

I just want to point out that data types and pattern matching (to a first approximation) are merely useful but redundant language features, that can be implemented purely using lambda calculus. If you understand how to implement them in lambda calculus, you can understand why they don't need <code>Eq</code> to implement pattern matching. Implementing data types in lambda calculus is known as "Church-encoding" them (after Alonzo Church, who demonstrated how to do this). Church-encoded functions are also known as "Continuation-passing style". It's called "continuation-passing style" because instead of providing the value, you provide a function that will process the value. So for example, instead of giving a user an <code>Int</code>, I could instead give them a value of the following type: <pre class="prettyprint"><code>type IndirectInt = forall x . (Int -> x) -> x </code></pre> The above type is "isomorphic" to an <code>Int</code>. "Isomorphic" is just a fancy way of saying that we can convert any <code>IndirectInt</code> into an <code>Int</code>: <pre class="prettyprint"><code>fw :: IndirectInt -> Int fw indirect = indirect id </code></pre> ... and we can convert any <code>Int</code> into an <code>IndirectInt</code>: <pre class="prettyprint"><code>bw :: Int -> IndirectInt bw int = \f -> f int </code></pre> ... such that: <pre class="prettyprint"><code>fw . bw = id -- Exercise: Prove this bw . fw = id -- Exercise: Prove this </code></pre> Using continuation-passing style, we can convert any data type into a lambda-calculus term. Let's start with a simple type like: <pre class="prettyprint"><code>data Either a b = Left a | Right b </code></pre> In continuation-passing style, this would become: <pre class="prettyprint"><code>type IndirectEither a b = forall x . (Either a b -> x) -> x </code></pre> But Alonzo Church was smart and noticed that for any type with multiple constructors, we can just provide a separate function for each constructor. So in the case of the above type, instead of providing a function of type <code>(Either a b -> x)</code>, we can instead provide two separate functions, one for the <code>a</code>, and one for the <code>b</code>, and that would be just as good: <pre class="prettyprint"><code>type IndirectEither a b = forall x . (a -> x) -> (b -> x) -> x -- Exercise: Prove that this definition is isomorphic to the previous one </code></pre> What about a type like <code>Bool</code> where the constructors have no arguments? Well, <code>Bool</code> is isomorphic to <code>Either () ()</code> (Exercise: Prove this), so we can just encode it as: <pre class="prettyprint"><code>type IndirectBool = forall x . (() -> x) -> (() -> x) -> x </code></pre> And <code>() -> x</code> is just isomorphic to <code>x</code> (Exercise: Prove this), so we can further rewrite it as: <pre class="prettyprint"><code>type IndirectBool = forall x . x -> x -> x </code></pre> There are only two functions that can have the above type: <pre class="prettyprint"><code>true :: a -> a -> a true a _ = a false :: a -> a -> a false _ a = a </code></pre> Because of the isomorphism, we can guarantee that all Church encodings will have as many implementations as there were possible values of the original data type. So it's no coincidence that there are exactly two functions that inhabit <code>IndirectBool</code>, just like there are exactly two constructors for <code>Bool</code>. But how do we pattern-match on our <code>IndirectBool</code>? For example, with an ordinary <code>Bool</code>, we could just write: <pre class="prettyprint"><code>expression1 :: a expression2 :: a case someBool of True -> expression1 False -> expression2 </code></pre> Well, with our <code>IndirectBool</code> it already comes with the tools to deconstruct itself. We would just write: <pre class="prettyprint"><code>myIndirectBool expression1 expression2 </code></pre> Notice that if <code>myIndirectBool</code> is <code>true</code>, it will pick the first expression, and if it is <code>false</code> it will pick the second expression, just as if we had somehow pattern-matched on its value. Let's try to do the same thing with an <code>IndirectEither</code>. Using an ordinary <code>Either</code>, we'd write: <pre class="prettyprint"><code>f :: a -> c g :: b -> c case someEither of Left a -> f a Right b -> g b </code></pre> With the <code>IndirectEither</code>, we'd just write: <pre class="prettyprint"><code>someIndirectEither f g </code></pre> In short, when we write types in continuation-passing style, the continuations are like the case statements of a case construct, so all we are doing is passing each different case statement as arguments to the function. This is the reason you don't need any sense of <code>Eq</code> to pattern-match on a type. In lambda calculus, the type decides what it is "equal" to, simply by defining which argument it selects out of the ones provided to it. So if I'm a <code>true</code>, I prove I am "equal" to <code>true</code> by always selecting my first argument. If I'm a <code>false</code>, I prove I am "equal" to <code>false</code> by always selecting my second argument. In short, constructor "equality" boils down to "positional equality", which is always defined in a lambda calculus, and if we can distinguish one parameter as the "first" and another as the "second", that's all we need to have the ability to "compare" constructors.

Haskell: Why does pattern matching work for types without being instances of equality?

Tags:

types

pattern-matching

haskell

I was wondering how pattern matching in Haskell works. I am aware of this thread, but could not quite understand the answers therein.

The answers state that types are matched by Boolean expressions, but how is this possible?
The other thing I got was pattern matching is more general than (==) and Eq is implemented by use of pattern matching.

Can you tell me why match and match3 are working even if I omit deriving (Eq) in the following code snippet, (it is clear why match2 is failing)?

data MyType = TypeA | TypeB
            deriving (Eq)

match :: MyType -> String
match TypeA = "this is type A"
match TypeB = "this is type B"

match2 :: MyType -> String
match2 a | a == TypeA = "this is type A matched by equality"
         | a == TypeB = "this is type B matched by equality"
         | otherwise = "this is neither type A nor type B"

match3 :: MyType -> String
match3 a = case a of TypeA -> "this is type A matched by case expression"
                     TypeB -> "this is type B matched by case expression"

main :: IO ()
main = do
    (print . match) TypeA
    (print . match) TypeB
    (print . match2) TypeA
    (print . match2) TypeB
    (print . match3) TypeA
    (print . match3) TypeB

344

asked Jun 18 '12 19:06

epsilonhalbe

2 Answers

I just want to point out that data types and pattern matching (to a first approximation) are merely useful but redundant language features, that can be implemented purely using lambda calculus. If you understand how to implement them in lambda calculus, you can understand why they don't need Eq to implement pattern matching.

Implementing data types in lambda calculus is known as "Church-encoding" them (after Alonzo Church, who demonstrated how to do this). Church-encoded functions are also known as "Continuation-passing style".

It's called "continuation-passing style" because instead of providing the value, you provide a function that will process the value. So for example, instead of giving a user an Int, I could instead give them a value of the following type:

type IndirectInt = forall x . (Int -> x) -> x

The above type is "isomorphic" to an Int. "Isomorphic" is just a fancy way of saying that we can convert any IndirectInt into an Int:

fw :: IndirectInt -> Int
fw indirect = indirect id

... and we can convert any Int into an IndirectInt:

bw :: Int -> IndirectInt
bw int = \f -> f int

... such that:

fw . bw = id -- Exercise: Prove this
bw . fw = id -- Exercise: Prove this

Using continuation-passing style, we can convert any data type into a lambda-calculus term. Let's start with a simple type like:

data Either a b = Left a | Right b

In continuation-passing style, this would become:

type IndirectEither a b = forall x . (Either a b -> x) -> x

But Alonzo Church was smart and noticed that for any type with multiple constructors, we can just provide a separate function for each constructor. So in the case of the above type, instead of providing a function of type (Either a b -> x), we can instead provide two separate functions, one for the a, and one for the b, and that would be just as good:

type IndirectEither a b = forall x . (a -> x) -> (b -> x) -> x
-- Exercise: Prove that this definition is isomorphic to the previous one

What about a type like Bool where the constructors have no arguments? Well, Bool is isomorphic to Either () () (Exercise: Prove this), so we can just encode it as:

type IndirectBool = forall x . (() -> x) -> (() -> x) -> x

And () -> x is just isomorphic to x (Exercise: Prove this), so we can further rewrite it as:

type IndirectBool = forall x . x -> x -> x

There are only two functions that can have the above type:

true :: a -> a -> a
true a _ = a

false :: a -> a -> a
false _ a = a

Because of the isomorphism, we can guarantee that all Church encodings will have as many implementations as there were possible values of the original data type. So it's no coincidence that there are exactly two functions that inhabit IndirectBool, just like there are exactly two constructors for Bool.

But how do we pattern-match on our IndirectBool? For example, with an ordinary Bool, we could just write:

expression1 :: a
expression2 :: a

case someBool of
    True  -> expression1
    False -> expression2

Well, with our IndirectBool it already comes with the tools to deconstruct itself. We would just write:

myIndirectBool expression1 expression2

Notice that if myIndirectBool is true, it will pick the first expression, and if it is false it will pick the second expression, just as if we had somehow pattern-matched on its value.

Let's try to do the same thing with an IndirectEither. Using an ordinary Either, we'd write:

f :: a -> c
g :: b -> c

case someEither of
    Left  a -> f a
    Right b -> g b

With the IndirectEither, we'd just write:

someIndirectEither f g

In short, when we write types in continuation-passing style, the continuations are like the case statements of a case construct, so all we are doing is passing each different case statement as arguments to the function.

This is the reason you don't need any sense of Eq to pattern-match on a type. In lambda calculus, the type decides what it is "equal" to, simply by defining which argument it selects out of the ones provided to it.

So if I'm a true, I prove I am "equal" to true by always selecting my first argument. If I'm a false, I prove I am "equal" to false by always selecting my second argument. In short, constructor "equality" boils down to "positional equality", which is always defined in a lambda calculus, and if we can distinguish one parameter as the "first" and another as the "second", that's all we need to have the ability to "compare" constructors.

145

answered Sep 23 '22 08:09

Gabriella Gonzalez

First of all, match and match3 are actually exactly the same (ignoring the different strings): pattern matching in functions is desugared to a case statement.

Next, pattern matching works on constructors rather than arbitrary values. Pattern matching is built into the language and does not depend on boolean expressions--it just acts on the constructors directly. This is most evident with more complex matches that include some matchable terms:

f :: MyType -> Int
f (A a) = a + 1
f (B a b) = a + b

How would you rewrite these patterns into boolean expressions? You can't (without knowing anything else about MyType).

Instead, the pattern matching just goes by constructor. Each pattern has to be headed by a constructor--you can't write a pattern like f (a b c) in Haskell. Then, when the function gets a value, it can look at the value's constructor and jump to the appropriate cases immediately. This is the way it has to work for more complicated patterns (like A a), and is also the way it works for the simple patterns you used.

Since pattern matching just works by going to the appropriate constructor, it does not depend on Eq at all. Not only do you not have to have an Eq instance to pattern match, but having one also does not change how pattern matching behaves. For example, try this:

data MyType = TypeA | TypeB | TypeC

instance Eq MyType where 
  TypeA == TypeA = True
  TypeB == TypeC = True
  TypeC == TypeB = True
  _ == _         = False

match :: MyType → String
match TypeA = "this is type A"
match TypeB = "this is type B"
match TypeC = "this is type C"

match2 :: MyType → String
match2 a | a == TypeA = "this is type A matched by equality"
         | a == TypeC = "this is type B matched by equality"
         | a == TypeB = "this is type C matched by equality"
         | otherwise = "this is neither type A nor type B"

Now you've defined equality such that TypeB is equal to TypeC but not to itself. (In real life, you should ensure that equality behaves reasonably and follows the reflexive property, but this is a toy example.) Now, if you use pattern matching, TypeB still matches TypeB and TypeC matches TypeC. But if you use your guard expressions, TypeB actually matches TypeC and TypeC matches TypeB. TypeA is unchanged between the two.

Moreover, note how the Eq instance was defined using pattern matching. When you use a deriving clause, it gets defined in a similar way with code generated at compile time. So pattern matching is more fundamental than Eq--it is part of the language where Eq is just part of the standard library.

In summary: pattern matching is built into the language and works by comparing the constructor and then recursively matching on the rest of the value. Equality is usually implemented in terms of pattern matching and compares the whole value rather than just the constructor.

answered Sep 21 '22 08:09

Tikhon Jelvis

Related questions
                            
                                Data type for Tree in Haskell
                            
                                Haskell: -fglasgow-exts should one avoid code that requires this?
                            
                                How do I write a constant-space length function in Haskell?
                            
                                haskell foldl with (++)
                            
                                Haskell string to list [duplicate]
                            
                                Problem with "Looping" IO in Haskell
                            
                                avoid explicit passing of lookup table
                            
                                Haskell: Type safety with logically different Boolean values
                            
                                Haskell type error: Could not deduce (Show a) arising from a use of `show' from the context (Num a)
                            
                                Haskell: function composition has just damaged my brain
                            
                                Haskell: Purpose of the flip function?
                            
                                Finding combinations
                            
                                Haskell Hello world, eclipse IDE
                            
                                Performance of "all" in haskell
                            
                                In Haskell, why do i have to use the dollar sign in this code?
                            
                                What happens if you compile a program that takes no input? (Haskell IO purity issues (again))
                            
                                How to map a list of functions over multiple arguments in Haskell?
                            
                                Haskell -- How to split a number into a list for further processing?
                            
                                haskell "invalid type signature error"
                            
                                Haskell - What makes 'main' unique?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With