Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Haskell length + map explanation?

Tags:

haskell

I'm playing around with Haskell since I'm learning the language, and I just found something I don't understand and I can't find an explanation. If I try to run this code:

map (`div` 0) [1,2,3,4]

I get a divide by 0 exception, which is expected. But if I run this code:

length (map (`div` 0) [1,2,3,4])

I get 4!

I'd like to know why I don't get the divide by 0 exception when I do the mapping inside the length function!

like image 810
DemianArdus Avatar asked May 06 '14 00:05

DemianArdus


People also ask

What does length do in Haskell?

length returns the length of a finite list as an Int. It is an instance of the more general genericLength, the result type of which may be any kind of number. O(1) length returns the length of a ByteString as an Int.

How does map work in Haskell?

How Map works in Haskell? Working of map in Haskell is as follows: Whenever we want to apply a function on each element of a given list and produce a new list consisting of the updated elements, then we make use of a function called map() function in Haskell.


2 Answers

The map and length functions can be defined this way:

map :: (a -> b) -> [a] -> [b]
map _ [] = []
map f (x:xs) = f x : map f xs

length :: [a] -> Int
length [] = 0
length (_:xs) = 1 + length xs

Now, let's work out by hand why your second example works the way it does. It goes like this:

length (map (`div` 0) (1:2:3:4:[]))
    = length (1 `div` 0 : map (`div` 0) (2:3:4:[]))     -- second map equation
    = 1 + (length (map (`div` 0) (2:3:4:[])))           -- second length equation
    = 1 + (length (2 `div` 0 : map (`div` 0) (3:4:[]))) -- second map equation
    .
    .
    .
    = 1 + (1 + (1 + (1 + length (map (`div` 0) [])))))  -- second length equation
    = 1 + (1 + (1 + (1 + length []))))                  -- first map equation
    = 1 + (1 + (1 + (1 + 0))))                          -- first length equation
    = 4                                                 -- arithmetic

What's the trick here? In Haskell the process of evaluating an expression is called forcing it. Forcing an expression performs the minimum work necessary in order to figure out the outermost data constructor of the result. As part of this, subexpressions will be forced only as necessary to achieve the goal.

So in this example, the outermost expression we're forcing is an application of the length function. The definition of length has two cases, one which uses the [] list constructor and the other which uses the (:) constructor, so to apply length we need to figure out which of these two cases to apply to the argument. Since the argument doesn't have either constructor in its outermost position, we have to force it to find out. That's what happens in the step between the first and the second line of the derivation above; we force the map subexpression by looking at its arguments and choosing the second map equation.

But after that point, we have all the information we need to determine which of the two equations for length applies, so we go by the "outermost first" rule and apply the appropriate length equation. In this case, this discards the subexpression that contains the division by zero, which means that subexpression will never be forced, and the error will never be triggered.

In your first example, however, when you type the expression into GHCI, you're implicitly asking the interpreter to print its result. This requires it to force the spine of the list to access its elements, and force the elements themselves to print them. So the division by zero error happens when you force the first element of the list.


EDIT: Let me point out one nuance you may have not noticed. When we try your first example in GHCI, this is the result of the session:

*Main> map (`div` 0) [1,2,3,4]
[*** Exception: divide by zero

See that lonely opening square bracket at the beginning of the second line? That's the opening bracket for the list that was being printed, before the divide by zero error happened. Likewise, notice what happens in this example:

*Main> map (20 `div`) [1,2,0,4]
[20,10,*** Exception: divide by zero

The first two elements of the result list, and even the comma separating the second element from the third, print successfully because Haskell doesn't attempt to compute the third list element until it needs to be printed.

like image 139
Luis Casillas Avatar answered Oct 31 '22 15:10

Luis Casillas


If you type the map expression into the interpreter, it will evaluate it and then print the resulting list. In order to do that all elements of the resulting list need to be evaluated because they will be part of the string that is displayed.

However when the interpreter evaluates the length expression, it only needs to look at the structure of the resulting list. It does not have to look at the actual elements inside the list. So since Haskell, being a lazy language, only evaluates what it has to, that means that the elements will not be evaluated and thus no exception is thrown.

like image 32
sepp2k Avatar answered Oct 31 '22 17:10

sepp2k