In the paper "Plugging a Space Leak with an Arrow" by Liu and Hudak, it is claimed that this definition leads to O(n^2) runtime behaviour (for computing the nth term):

successors n = n : map (+1) (successors n)

while this one gives us linear time:

successors n = let ns = n : map (+1) ns
               in ns

This statement is certainly correct, as I can verify it easily with GHCi. However, I can't seem to understand why exactly, and how structure sharing helps in this case. I even tried to write out both expansions for computing the third term.
Here is my try for the first variant:
successors 1 !! 2
(1 : (map (+1) (successors 1))) !! 2
(map (+1) (successors 1)) !! 1
(map (+1) (1 : map (+1) (successors 1))) !! 1
(2 : map (+1) (map (+1) (successors 1))) !! 1
(map (+1) (map (+1) (successors 1))) !! 0
(map (+1) (map (+1) (1 : map (+1) (successors 1)))) !! 0
(map (+1) (2 : map (+1) (map (+1) (successors 1)))) !! 0
(3 : map (+1) (map (+1) (map (+1) (successors 1)))) !! 0
3
and the second:
successors 1 !! 2
(let ns = 1 : map (+1) ns in ns) !! 2
(1 : map (+1) ns) !! 2
map (+1) ns !! 1
map (+1) (1 : map (+1) ns) !! 1
(2 : map (+1) (map (+1) ns)) !! 1
map (+1) (map (+1) ns) !! 0
map (+1) (map (+1) (1 : map (+1) ns)) !! 0
map (+1) (2 : map (+1) (map (+1) ns)) !! 0
(3 : map (+1) (map (+1) (map (+1) ns))) !! 0
3
As you can see, my expansions look almost exactly the same and seem to suggest quadratic behaviour for both. Somehow, in the latter definition, structure sharing sets in and reuses earlier results, but it looks like magic. Can anyone elaborate?
Loosely speaking: you are allowed to pretend, in the definition of ns, that ns has been evaluated completely. So what we actually get is, essentially,

successors n = let ns = n : map (+1) [n,n+1,n+2,n+3,n+4,...]
               in ns

You only need to count the cost of this one map.
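One way to see that only one map's worth of work is done is to instrument (+1) with Debug.Trace. This is a sketch for illustration only: tick, successorsShared, and successorsUnshared are names introduced here so the two definitions can be compared side by side.

```haskell
import Debug.Trace (trace)

-- An instrumented (+1) that announces each time it is actually evaluated.
tick :: Int -> Int
tick x = trace "+1 evaluated" (x + 1)

-- The let-bound, self-referential version: each element is computed once.
successorsShared :: Int -> [Int]
successorsShared n = let ns = n : map tick ns in ns

-- The recursive version: every level builds its own fresh chain of maps.
successorsUnshared :: Int -> [Int]
successorsUnshared n = n : map tick (successorsUnshared n)

main :: IO ()
main = do
  print (take 5 (successorsShared 0))    -- one trace per new element: 4 in total
  print (take 5 (successorsUnshared 0))  -- element k redoes k additions: 1+2+3+4 = 10 in total
```

Both print [0,1,2,3,4]; the difference is only in how many times the trace fires, which mirrors the cost of the single shared map versus the repeated unshared ones.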
Let's think about this operationally.

ns = n : map (+1) ns

What does this do? Well, it allocates a bit of memory to hold ns, and stores in it a (:) constructor that points to the value n and to a "thunk" representing map (+1) ns. But that thunk represents ns as a pointer to that very same bit of memory holding ns! So we actually have a circular structure in memory. When we ask for the second element of ns, that thunk is forced. This involves accessing ns, but the portion accessed has already been calculated, so it doesn't need to be calculated again. The effect of this forcing is to replace map (+1) ns with n+1 : map (+1) ns', where ns' is a pointer to the (now-known) second element of ns. So as we proceed, we build a list whose last piece is always a little circular bit.
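This knot-tying can also be written explicitly with fix from Data.Function, which makes the circularity visible in the code itself. A small sketch, with successorsFix and successorsLet introduced here for comparison:

```haskell
import Data.Function (fix)

-- fix f builds a single shared value x with x = f x, so the list's
-- tail thunk points back at the very cell being defined.
successorsFix :: Int -> [Int]
successorsFix n = fix (\ns -> n : map (+1) ns)

-- The same thing written with a let binding, as in the question.
successorsLet :: Int -> [Int]
successorsLet n = let ns = n : map (+1) ns in ns
```

Both definitions thread every map through the one shared ns cell, which is exactly what keeps the traversal linear.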
To understand this, we need the definition of map:

map _ [] = []
map f (x:xs) = f x : map f xs

We will compute successors 0, pretending the spine of the resulting list is being forced as we compute it. We'll start by binding n to 0.
successors 0 = let ns = 0 : map (+1) ns
               in ns
Everywhere we hold on to the result of a computation - in a (non-strict) field of a constructor or in a let or where binding - we are actually storing a thunk that will take on the value of the result of the computation when the thunk is evaluated. We can represent this placeholder in code by introducing a new variable name. For the eventual result of map (+1) ns being placed in the tail of the : constructor, we'll introduce a new variable called ns0.

successors 0 = let ns = 0 : ns0 where ns0 = map (+1) ns
               in ns
Now let's expand map (+1) ns using the definition of map. We know from the let binding we just wrote that

ns = 0 : ns0 where ns0 = map (+1) ns

Therefore

map (+1) (0 : ns0) = 0 + 1 : map (+1) ns0

When the second item is forced we have:

successors 0 = let ns = 0 : ns0 where ns0 = 0 + 1 : map (+1) ns0
               in ns
We don't need the ns variable anymore, so we'll remove it to clean this up.

successors 0 = 0 : ns0 where ns0 = 0 + 1 : map (+1) ns0

We'll introduce new variable names n1 and ns1 for the computations 0 + 1 and map (+1) ns0, the arguments to the rightmost : constructor.
successors 0 = 0 : ns0
where
ns0 = n1 : ns1
n1 = 0 + 1
ns1 = map (+1) ns0
We expand map (+1) ns0:
map (+1) (n1 : ns1) = n1 + 1 : map (+1) ns1
After the third item in the list's spine (but not yet its value) is forced, we have:
successors 0 = 0 : ns0
where
ns0 = n1 : ns1
n1 = 0 + 1
ns1 = n1 + 1 : map (+1) ns1
We don't need the ns0 variable anymore, so we'll remove it to clean this up.
successors 0 = 0 : n1 : ns1
where
n1 = 0 + 1
ns1 = n1 + 1 : map (+1) ns1
We'll introduce new variable names n2 and ns2 for the computations n1 + 1 and map (+1) ns1, the arguments to the rightmost : constructor.
successors 0 = 0 : n1 : ns1
where
n1 = 0 + 1
ns1 = n2 : ns2
n2 = n1 + 1
ns2 = map (+1) ns1
If we repeat the same steps once more, we have
successors 0 = 0 : n1 : n2 : ns2
where
n1 = 0 + 1
n2 = n1 + 1
ns2 = n3 : ns3
n3 = n2 + 1
ns3 = map (+1) ns2
This is clearly growing linearly in the spine of the list and linearly in the thunks to calculate the values held in the list. As dfeuer describes, we are only ever dealing with the "little circular bit" at the end of the list.
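The linear growth can also be checked directly: with sharing, even a distant index is cheap, while the non-sharing definition does work quadratic in the index. A sketch, with names and index sizes chosen arbitrarily here:

```haskell
-- Shared: reaching index n costs O(n) total work, because every map
-- layer reads an already-forced, memoised cell of the same list.
successorsShared :: Int -> [Int]
successorsShared n = let ns = n : map (+1) ns in ns

-- Unshared: reaching index k forces k nested maps, each walking a
-- fresh copy of the list, for O(k^2) spine steps in total.
successorsUnshared :: Int -> [Int]
successorsUnshared n = n : map (+1) (successorsUnshared n)

main :: IO ()
main = do
  print (successorsShared 0 !! 100000)  -- large index, still fast
  print (successorsUnshared 0 !! 2000)  -- a much smaller index already does ~2000^2/2 steps
```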
If we force any of the values held in the list, all of the remaining thunks that refer to it will now refer to the already-computed value. For example, if we force n2 = n1 + 1, it will force n1 = 0 + 1 = 1 and then n2 = 1 + 1 = 2. The list will look like
successors 0 = 0 : n1 : n2 : ns2
where
n1 = 1 -- just forced
n2 = 2 -- just forced
ns2 = n3 : ns3
n3 = n2 + 1
ns3 = map (+1) ns2
And we've only done two additions. The additions to count up to 2 will never be done again, because the result of the computation is shared. We can (for free) replace all of the n1s and n2s with the just-computed values, and forget about those variable names.
successors 0 = 0 : 1 : 2 : ns2
where
ns2 = n3 : ns3
n3 = 2 + 1 -- n3 will reuse n2
ns3 = map (+1) ns2
When n3 is forced it will use the result of n2 that is already known (2), and those first two additions will never be done again.
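To confirm that those additions really are never repeated, we can count them with a mutable counter. This is purely an observation device: countedPlusOne and the additions counter are names introduced here, and the unsafePerformIO trick relies on GHC's thunk update and is only a sketch, not production code.

```haskell
import Data.IORef (IORef, newIORef, modifyIORef', readIORef)
import System.IO.Unsafe (unsafePerformIO)

-- A global counter of how many additions have actually run.
{-# NOINLINE additions #-}
additions :: IORef Int
additions = unsafePerformIO (newIORef 0)

-- (+1) instrumented to bump the counter each time a thunk containing
-- it is forced; once a thunk is updated, it never runs again.
countedPlusOne :: Int -> Int
countedPlusOne x = unsafePerformIO $ do
  modifyIORef' additions (+ 1)
  pure (x + 1)

main :: IO ()
main = do
  let ns = 0 : map countedPlusOne ns
  print (ns !! 3)                -- forces the values at indices 1, 2 and 3
  readIORef additions >>= print  -- three additions so far
  print (ns !! 3)                -- ask for the same element again
  readIORef additions >>= print  -- still three: nothing was recomputed
```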