Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

When does GHC use sharing?

A beginner's question. I am a bit confused about the following: I read that GHC's (Haskell's?) lazy evaluation method includes the use of sharing, so for example evaluating the expression

(1+1)*(1+1)

will only compute 1+1 once. In the book "Programming in Haskell" by Graham Hutton, it is explained that this is realized by replacing both occurences of 1+1 by a pointer to one copy of it and evaluating just that one copy.

But computing the nth Fibonacci number by fib n = if (n<2) then n else fib (n-1) + fib(n-2) takes exponential time in n. My undestanding of the above tells me that for example fib 5 will be evaluated like this:

fib 5 => fib 4 + fib 3 => fib 3 + fib 2 + fib 3 => x + fib 2 + x

with x = fib 3 = fib 2 + fib 1 = y + fib 1

so fib 5 = y + fib 1 + y + y + fib 1

with y = fib 2 = fib 1 + fib 0 = 1

so fib 5 = 1 + 1 + 1 + 1 + 1

where x and y were introduced as shared values.

But this way of proceeding, while somewhat less efficient than the standard way of iteratively generating fib k for 2<=k<=n, clearly would not result in such a long running time as evaluating the function in fact does. So what's wrong with my understanding?

like image 542
user35359 Avatar asked Nov 14 '12 16:11

user35359


1 Answers

You cannot rely on common subexpression elimination in GHC (for technical reasons, as sharing lazy values can introduce space leaks).

A good rule of thumb is that values are shared by name. So name your subexpression, like so:

 n * n
    where n = 1 + 1

and you guarantee sharing.

Note that in your simple example, GHC would just compute the whole thing at compile time. The above discussion really only applies to "big" data.

You can observe the sharing using debugging tools like vacuum. Such shared values are represented as heap allocated objects with multiple references:

enter image description here

like image 196
Don Stewart Avatar answered Sep 21 '22 00:09

Don Stewart