Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is pattern matching preferred in function definitions?

I am reading the "learnyouahaskell" tutorial from learnyouahaskell. There it reads:

Pattern matching can also be used on tuples. What if we wanted to make a function that takes two vectors in a 2D space (that are in the form of pairs) and adds them together? To add together two vectors, we add their x components separately and then their y components separately. Here's how we would have done it if we didn't know about pattern matching:

addVectors :: (Num a) => (a, a) -> (a, a) -> (a, a)  
addVectors a b = (fst a + fst b, snd a + snd b)  

Well, that works, but there's a better way to do it. Let's modify the function so that it uses pattern matching.

addVectors :: (Num a) => (a, a) -> (a, a) -> (a, a)  
addVectors (x1, y1) (x2, y2) = (x1 + x2, y1 + y2)  

There we go! Much better. Note that this is already a catch-all pattern. The type of addVectors (in both cases) is addVectors :: (Num a) => (a, a) -> (a, a) - > (a, a), so we are guaranteed to get two pairs as parameters.

My question is: Why is the pattern matching preferred the preferred way, if both definitions result in the same signature?

like image 328
Zelphir Kaltstahl Avatar asked Feb 05 '16 19:02

Zelphir Kaltstahl


3 Answers

I think in this case the pattern matching expresses more directly what you mean.

In the function application case, one needs to know what fst and snd do, and from it deduce that a and b are tuples whose elements get added.

addVectors a b = (fst a + fst b, snd a + snd b)

The fact that we have snd and fst functions to decompose tuples is distracting here.

In the pattern matching case it is immediately clear what the input is (a tuple whose elements we call x1 and y1 and a tuple... etc) and how it is deconstructed. And it is also immediately clear what is happening, how their elements are added.

addVectors (x1, y1) (x2, y2) = (x1 + x2, y1 + y2)

This is almost like the mathematical definition:

(x1, y1) + (x2, y2) := (x1 + x2, y1 + y2)

Straight to the point, no distractions :-)

You could literally write this in Haskell:

(x₁, y₁) `addVector` (x₂, y₂) = (x₁ + x₂, y₁ + y₂)
like image 172
wires Avatar answered Oct 16 '22 16:10

wires


In a nutshell, one needs to construct and destruct values.

Values are constructed by taking a data constructor which is a (possibly null-ary) function, and applying the required arguments. So far, so good.

Random example (abusing GADTSyntax)

data T where
  A :: Int -> T
  B :: T
  C :: String -> Bool -> T

Destruction is more complex, since one needs to take a value of type T and obtain information about 1) which constructor was used to craft such value, and 2) what are the arguments to said constructor.

Part 1) could be done through a function:

whichConsT :: T -> Int -- returns 0,1,2 for A,B,C

Part 2) is more tricky. A possible option is to use projections

projA :: T -> Int
-- projB not needed
projC1 :: T -> String
projC2 :: T -> Bool

so that e.g. they satisfy

projA (A n) = n
projC1 (C x y) = x
projC2 (C x y) = y

But wait! The types of the projections are of the form T -> ..., which promises that such functions work on all values of type T. So we can have

projA B = ??
projA (C x y) = ??
projC1 (A n) = ??

How to implement the above? There's no way to produce sensible results, so the best option is to trigger a runtime error.

projA B = error "not an A!"
projA (C x y) = error "not an A!"
projC1 (A n) = error "not a C!"

However, this puts a burden on the programmer! Now it is the programmer's responsibility to check that values which are passed to the projections have the right constructor. This can be done using whichConsT. Many imperative programmers are used to this kind of interface (test & access, e.g. Java's hasNext(), next() in iterators), but this is because most imperative languages have no really better option.

FP languages (and, nowadays, some imperative languages as well) also allow pattern matching. Using it has the following advantages over projections:

  • no need to split the information: we get 1) and 2) at the same time
  • no way to crash the program: we never use partial projection functions which can crash
  • no burden on the programmer: corollary of the above
  • if the exhaustiveness-checker is on, we are sure to handle all the possible cases

Now, on types having exactly one constructor (tuples, (), newtypes), one can define total projections, which are perfectly fine (e.g. fst,snd). Still, many prefer to stick with pattern matching, which can also handle the general case as well.

like image 39
chi Avatar answered Oct 16 '22 17:10

chi


As Carsten mentioned in comments, this is an opinion-based question, but let me elaborate anyway.

Using pattern matching against 2-tuples isn't that much of an advantage, but let's consider some bigger data structure, for example 4-tuples.

addVectors :: (Num a) => (a, a, a, a) -> (a, a, a, a) -> (a, a, a, a)  
addVectors a b = -- some code that adds vectors

addVectors :: (Num a) => (a, a, a, a) -> (a, a, a, a) -> (a, a, a, a)  
addVectors (w1, x1, y1, z1) (w2, x2, y2, z2) = (w1 + w2, x1 + x2, y1 + y2, z1 + z2)

Without pattern matching you'd have to write functions that extract the first, second, third and fourth element from a 4-tuple an use it inside addVectors. With pattern matching, writing the implementation of addVectors is very easy.

I believe using such an example in the book could get the message across more effectively.

like image 35
Kapol Avatar answered Oct 16 '22 17:10

Kapol