 

Lazy evaluation for list generating functions?

I'm currently reading Programming in Haskell, by Graham Hutton.

In p.40, a toy primality test is presented:

factors :: Int -> [Int]
factors n = [x | x <- [1..n], n `mod` x == 0]

prime :: Int -> Bool
prime n = factors n == [1,n]
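
For example, in GHCi these give:

ghci> factors 15
[1,3,5,15]
ghci> prime 15
False
ghci> prime 7
True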

The author then goes on to explain how

"deciding that a number is not prime does not require the function prime to produce all of its factors, because under lazy evaluation the result False is returned as soon as any factor other than one or the number itself is produced"

As someone coming from C and Java I find this shocking. I'd have expected the factors call to complete first, save its result on the stack, and then pass control back to the calling function. But apparently a very different program is being executed here: there must be a loop over the list comprehension in factors, and the equality check in prime must be performed for each new element added to the list of factors.

How is this possible? Doesn't this make it way more difficult to reason about the order of execution of a program?

Asked by Mister Smith on Apr 21 '16




3 Answers

You find it "shocking" because you're not expecting it. Once you get used to it... OK, actually it still trips people over. But after a while, you eventually wrap your mind around it.

How Haskell works is this: When you call a function, nothing happens! The call is noted down somewhere, and that is all. This takes approximately no time at all. Your "result" is actually just an "I owe you" telling the computer what code to run to get to the result. Not the entire result, mind you; just the first step of it. For something like an integer, there is only one step. But for a list, each element is a separate step.
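
As a tiny illustration of that (a minimal sketch, not from the book): forcing only the part of a structure you actually need never touches the rest.

-- head only demands the first cons cell, so the undefined tail is never forced
example :: Int
example = head (1 : undefined)   -- evaluates to 1 instead of crashing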

Let me show you a simpler example:

print (take 10 ([1..] ++ [0]))

I spoke with a C++ programmer who was "shocked" that this works. Surely the "++[0]" part has to "find the end of the list" before it can append zero to it? How can this code complete in finite time?!

It looks like this builds [1..] (an infinite list), and then ++ [0] scans to the end of this list and appends a zero, and then take 10 trims off just the first 10 elements, and then it prints. That would, of course, take infinite time.

So here's what actually happens. The outer-most function is take, so that's where we start. (Weren't expecting that, eh?) The definition of take is something like this:

take 0 (   _) = []
take n (  []) = []
take n (x:xs) = x : (take (n-1) xs)

So clearly 10 != 0, so the first line does not apply. So either the second or third line applies. So now take looks at [1..] ++ [0] to see if it's an empty list, or a non-empty list.

The outer-most function here is (++). Its definition looks similar to this:

(  []) ++ ys = ys
(x:xs) ++ ys = x : (xs ++ ys)

So we need to figure out which equation applies. Either the left argument is an empty list (line one applies), or it isn't (line two applies). Well, since [1..] is an infinite list, line two always applies. So the "result" of [1..] ++ [0] is 1 : ([2..] ++ [0]). As you can see, this isn't completely executed; but it's executed far enough to tell this is a non-empty list. Which is all that take cares about.

take 10 ([1..] ++ [0])
take 10 (1 : ([2..] ++ [0]))
1 : take 9 ([2..] ++ [0])
1 : take 9 (2 : ([3..] ++ [0]))
1 : 2 : take 8 ([3..] ++ [0])
1 : 2 : take 8 (3 : ([4..] ++ [0]))
1 : 2 : 3 : take 7 ([4..] ++ [0])
...
1 : 2 : 3 : 4 : 5 : 6 : 7 : 8 : 9 : 10 : take 0 ([11..] ++ [0])
1 : 2 : 3 : 4 : 5 : 6 : 7 : 8 : 9 : 10 : []

Do you see how this unwinds?
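
If you want to check this for yourself, the whole thing runs as an ordinary program:

main :: IO ()
main = print (take 10 ([1..] ++ [0]))
-- prints [1,2,3,4,5,6,7,8,9,10] and terminates, despite the infinite list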


Now, returning to your specific question: the (==) operator takes a pair of lists and walks over both of them, comparing them element by element to check that they are equal. As soon as a difference is found, it immediately aborts and returns False:

(  []) == (  []) = True
(x:xs) == (y:ys) = (x == y) && (xs == ys)
(   _) == (   _) = False

If we now try, say, prime 6:

prime 6
factors 6 == [1,6]
??? == [1,6]
1 : ??? == [1,6]
??? == [6]
2 : ??? == [6]
False
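
One way to actually watch this happening (a small experiment of my own, using Debug.Trace) is to log each factor as it is produced:

import Debug.Trace (trace)

factors :: Int -> [Int]
factors n = [trace ("factor: " ++ show x) x | x <- [1..n], n `mod` x == 0]

prime :: Int -> Bool
prime n = factors n == [1, n]

main :: IO ()
main = print (prime 6)
-- only "factor: 1" and "factor: 2" are logged before (==) returns False;
-- the factors 3 and 6 are never produced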
Answered by MathematicalOrchid


I'll focus on this point:

Doesn't this make it way more difficult to reason about the order of execution of a program?

Yes, but the order of evaluation does not matter so much in pure functional programming. For instance:

(1 * 3) + (4 * 5)

Q: which multiplication is performed first? A: we don't care, the result is the same. Even C compilers can choose any order here.

(f 1) + (f 2)

Q: which function call is performed first? A: we don't care, the result is the same. Here, C compilers can choose any order as well. However, in C, function f may have side effects, making the result of the sum above depend on the order of evaluation. In pure functional programming, there are no side effects, so we really don't care.

Also, laziness allows semantics-preserving expansion of any function definition. Suppose we define

f x = e -- e is an expression which can use x

and we call f 2. The result should be the same as e{2/x}, i.e. as e where every (free) occurrence of x has been replaced by 2. This is just "unfolding the definition", as in maths. For instance,

f x = x + 4

-- f 2 ==> 2 + 4 ==> 6

However, suppose we call f (g 2) instead. Laziness makes this equivalent to e{g 2/x}. Again, as in maths. For example:

f x = 42
g n = g (n + 1)  -- infinite recursion

then we still have f (g 2) = 42 {g 2/x} = 42, since x is not used. We do not have to worry about whether g 2 is defined or not (here it would loop forever). Definition unfolding always works.
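
A runnable version of that (just a minimal sketch of the definitions above):

f :: Int -> Int
f _ = 42

g :: Int -> Int
g n = g (n + 1)   -- infinite recursion

main :: IO ()
main = print (f (g 2))   -- prints 42: the argument g 2 is never demanded, so it never loops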

This actually makes it simpler to reason about the program behaviour.

There are some downsides to laziness, though. The main one is that, while the semantics of the program is (arguably) simpler, estimating its performance is harder. To assess performance, you need to understand more than just the final result: you need a model of all the intermediate steps leading to that result. Especially in high-level code, or when some clever optimization kicks in, this requires some expertise in how the runtime actually works.
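
A classic illustration of that difficulty (my own example, not from the answer) is the thunk build-up in a lazy left fold, compared with a fold that forces its accumulator:

import Data.List (foldl')

-- foldl builds a long chain of unevaluated (+) thunks before adding anything;
-- without optimizations this can exhaust memory on a large list
lazySum :: Integer
lazySum = foldl (+) 0 [1 .. 10000000]

-- foldl' forces the accumulator at every step and runs in constant space
strictSum :: Integer
strictSum = foldl' (+) 0 [1 .. 10000000]

Both compute the same result; only their memory behaviour differs, and nothing in the final value hints at that.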

Answered by chi


Doesn't this make it way more difficult to reason about the order of execution of a program?

Probably - at least, for those coming from a procedural/OO paradigm. I have done a lot with iterators and functional programming in other, eager-evaluation languages, and for me the lazy evaluation strategy is not the major hurdle in learning Haskell. (How many times have you wished that your Java log statement would not even get the data for the message until after deciding to actually log it?)

Think of all list processing in Haskell as if it were compiled into an iterator-based implementation under the covers. If you were doing this in Java with the possible factors of n given as an Iterator<Integer>, wouldn't you want to stop as soon as you found one that wasn't 1 or n? And if so, it doesn't really matter whether the iterator is infinite or not!
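
In Haskell that "stop as soon as you find one" style is what the standard list functions already do; for instance (a sketch, not code from the answer):

-- any short-circuits: it stops at the first divisor it finds,
-- even though the candidate list could in principle be infinite
hasProperDivisor :: Int -> Bool
hasProperDivisor n = any (\x -> n `mod` x == 0) [2 .. n - 1]

composite :: Bool
composite = hasProperDivisor 91   -- True, found at x = 7 without trying 8..90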

When you get down to it, the order of execution doesn't matter a whole lot. What you really care about is:

  • Correctness of the result
  • Timely termination
  • The relative order of any side effects

Now, if you have a "purely functional" program, there are no side effects. But when does that happen? Pretty much anything useful beyond straight number/string crunching and metacode (i.e. higher-order functions) will have side effects.

Fortunately (or unfortunately, depending on who you ask), we have the monad as a design pattern in Haskell that serves the purpose of (among other things) controlling the order of evaluation, and therefore of side effects.
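
For example, inside IO the effects happen in exactly the order you write them, even though the surrounding language is lazy (a minimal sketch):

main :: IO ()
main = do
  putStrLn "first"    -- IO actions are sequenced: this runs before the next line
  putStrLn "second"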

But even without learning about monads and all that stuff, it actually is about as easy to reason about order of execution as in procedural languages. You just have to get used to it.

Answered by wberry