A beginner's question. I am a bit confused about the following: I read that GHC's (Haskell's?) lazy evaluation method includes the use of sharing, so for example evaluating the expression
(1+1)*(1+1)
will only compute 1+1
once. In the book "Programming in Haskell" by Graham Hutton, it is explained that this is realized by replacing both occurences of 1+1
by a pointer to one copy of it and evaluating just that one copy.
But computing the nth Fibonacci number by fib n = if (n<2) then n else fib (n-1) + fib(n-2)
takes exponential time in n. My undestanding of the above tells me that for example fib 5
will be evaluated like this:
fib 5 => fib 4 + fib 3 => fib 3 + fib 2 + fib 3 => x + fib 2 + x
with x = fib 3 = fib 2 + fib 1 = y + fib 1
so fib 5 = y + fib 1 + y + y + fib 1
with y = fib 2 = fib 1 + fib 0 = 1
so fib 5 = 1 + 1 + 1 + 1 + 1
where x
and y
were introduced as shared values.
But this way of proceeding, while somewhat less efficient than the standard way of iteratively generating fib k
for 2<=k<=n
, clearly would not result in such a long running time as evaluating the function in fact does. So what's wrong with my understanding?
You cannot rely on common subexpression elimination in GHC (for technical reasons, as sharing lazy values can introduce space leaks).
A good rule of thumb is that values are shared by name. So name your subexpression, like so:
n * n
where n = 1 + 1
and you guarantee sharing.
Note that in your simple example, GHC would just compute the whole thing at compile time. The above discussion really only applies to "big" data.
You can observe the sharing using debugging tools like vacuum. Such shared values are represented as heap allocated objects with multiple references:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With