Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does Haskell's laziness work?

Tags:

haskell

Consider this function that doubles all the elements in a list:

doubleMe [] = [] doubleMe (x:xs) = (2*x):(doubleMe xs) 

Then consider the expression

doubleMe (doubleMe [a,b,c]) 

It seems obvious that, at runtime, this first expands to:

doubleMe ( (2*a):(doubleMe [b,c]) ) 

(It's obvious because no other possibilities exist as far as I can see).

But my question is this: Why exactly does this now expand to

2*(2*a) : doubleMe( doubleMe [b,c] ) 

instead of

doubleMe( (2*a):( (2*b) : doubleMe [c] ) ) 

?

Intuitively, I know the answer: Because Haskell is lazy. But can someone give me a more precise answer?

Is there something special about lists that causes this, or is the idea more general than that just lists?

like image 809
stackoverflowuser Avatar asked Feb 03 '15 15:02

stackoverflowuser


People also ask

How does lazy evaluation work?

In programming language theory, lazy evaluation, or call-by-need, is an evaluation strategy which delays the evaluation of an expression until its value is needed (non-strict evaluation) and which also avoids repeated evaluations (sharing).

Why is Haskell lazy?

Haskell is a lazy language. It does not evaluate expressions until it absolutely must. This frequently allows our programs to save time by avoiding unnecessary computation, but they are at more of a risk to leak memory. There are ways of introducing strictness into our programs when we don't want lazy evaluation.

Why is lazy evaluation good?

"Lazy" evaluation is performing operations when and as they are needed. It is useful when it is a feature of a programming language or library because it is generally harder to implement lazy evaluation on your own than simply to precalculate everything up front.

Why lazy evaluation is important in spark?

Lazy evaluation means that Spark does not evaluate each transformation as they arrive, but instead queues them together and evaluate all at once, as an Action is called. The benefit of this approach is that Spark can make optimization decisions after it had a chance to look at the DAG in entirety.


2 Answers

doubleMe (doubleMe [a,b,c]) does not expand to doubleMe ( (2*a):(doubleMe [b,c]) ). It expands to:

case doubleMe [a,b,c] of   [] -> []   (x:xs) -> (2*x):(doubleMe xs) 

That is the outer function call is expanded first. That's the main difference between a lazy language and a strict one: When expanding a function call, you don't first evaluate the argument - instead you replace the function call with its body and leave the argument as-is for now.

Now the doubleMe needs to be expanded because the pattern matching needs to know the structure of its operand before it can be evaluated, so we get:

case (2*a):(doubleMe [b,c]) of   [] -> []   (x:xs) -> (2*x):(doubleMe xs) 

Now the pattern matching can be replaced with the body of the second branch because we now know that the second branch is the one that matches. So we substitute (2*a) for x and doubleMe [b, c] for xs, giving us:

(2*(2*a)):(doubleMe (doubleMe [b,c])) 

So that's how we arrive at that result.

like image 174
sepp2k Avatar answered Oct 03 '22 01:10

sepp2k


Your “obvious” first step isn't actually quite so obvious. In fact what happens is rather like this:

doubleMe (...) doubleMe ( { [] | (_:_) }? ) doubleMe ( doubleMe (...)! ) 

and only at that point does it actually “enter” the inner function. So it proceeds

doubleMe ( doubleMe (...) ) doubleMe ( doubleMe( { [] | (_:_) }? ) ) doubleMe ( doubleMe( a:_ ! ) ) doubleMe ( (2*a) : doubleMe(_) ) doubleMe ( (2*a):_ ! ) 

now here the outer doubleMe function has the “answer” to it's [] | (_:_) question, which was the only reason anything in the inner function was evaluated at all.

Actually, the next step is also not necessarily what you though: it depends on how you evaluate the outer result! For instance, if the whole expression was tail $ doubleMe ( doubleMe [a,b,c] ), then it would actually expand more like

tail( { [] | (_:_) }? ) tail( doubleMe(...)! ) tail( doubleMe ( { [] | (_:_) }? ) ) ... tail( doubleMe ( doubleMe( a:_ ! ) ) ) tail( doubleMe ( _:_ ) ) tail( _ : doubleMe ( _ ) ) doubleMe ( ... ) 

i.e. it would in fact never really get to 2*a!

like image 20
leftaroundabout Avatar answered Oct 03 '22 00:10

leftaroundabout