In Haskell I can write a self-referential sequence in GHCi, like so:
λ> let x = 1:map (+1) x
λ> take 5 x
which produces:
[1,2,3,4,5]
However, my intuition about lazy evaluation says this should happen during expansion:
let x = 1:map (+1) x
1:2:map (+1) x
1:2:map (+1) [1, 2] <-- substitution
1:2:2:3:map (+1) x
1:2:2:3:map (+1) [1, 2, 2, 3] <-- substitution
1:2:2:3:2:3:3:4:map (+1) x
...
This is obviously not what's happening. I can see the pattern in the correct answer: we are merely moving down an infinite stream one element at a time. I recognize the pattern and can apply it in code, but it does not line up with my mental model of lazy evaluation. It feels a bit "magic". Where is my intuition wrong?
Remember only to substitute something with its definition. So whenever you expand x, you should substitute 1 : map (+1) x, not its "current value" (whatever that means).
I'll reproduce Jefffrey's idea, but with due respect for laziness.
x = 1 : map (+1) x
take 5 x
= take 5 (1 : map (+1) x) -- x
= 1 : take 4 (map (+1) x) -- take
= 1 : take 4 (map (+1) (1 : map (+1) x)) -- x
= 1 : take 4 (2 : map (+1) (map (+1) x)) -- map and (+)
= 1 : 2 : take 3 (map (+1) (map (+1) x)) -- take
= 1 : 2 : take 3 (map (+1) (map (+1) (1 : map (+1) x))) -- x
= 1 : 2 : take 3 (map (+1) (2 : map (+1) (map (+1) x))) -- map and (+)
= 1 : 2 : take 3 (3 : map (+1) (map (+1) (map (+1) x))) -- map and (+)
= 1 : 2 : 3 : take 2 (map (+1) (map (+1) (map (+1) x))) -- take
and so on.
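The trace can be confirmed directly; a minimal check (the name xs is mine, to avoid clashing with the x already bound in GHCi):

```haskell
-- The self-referential list from the question: the tail is the
-- whole list with every element bumped by one.
xs :: [Int]
xs = 1 : map (+1) xs

main :: IO ()
main = print (take 5 xs)  -- prints [1,2,3,4,5], matching the trace
```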
Exercise: finish the evaluation in this style yourself (it's quite informative).
Notice how we are starting to build up a chain of maps as the list grows. If you just print x, you will see the output start to slow down after a while; this is why. There is a more efficient way, left as an exercise (and [1..] is cheating :-).
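One candidate for that more efficient way (a sketch of mine, not necessarily the intended exercise answer; the second answer below notes the stream is equivalent to iterate) is iterate, which builds each element from the previous cons cell rather than through a growing chain of maps:

```haskell
-- iterate f z = z : iterate f (f z): each element is computed
-- directly from its predecessor, so no tower of map calls builds up.
efficient :: [Int]
efficient = iterate (+1) 1

main :: IO ()
main = print (take 5 efficient)  -- [1,2,3,4,5]
```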
N.B. this is still a little less lazy than what will actually happen: map (+1) (1 : ...) evaluates to (1+1) : map (+1) ..., and the addition will only happen when the number is actually observed, by either printing it or e.g. comparing it.
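That deferral can be made visible with Debug.Trace (a rough experiment of mine, not part of the answer; trace prints its message to stderr at the moment its thunk is forced):

```haskell
import Debug.Trace (trace)

-- The same stream, but each (+1) announces itself when it actually runs.
xs :: [Int]
xs = 1 : map bump xs
  where bump n = trace "adding" (n + 1)

main :: IO ()
main = do
  let ys = take 3 xs
  print (length ys)  -- forces only the list spine: no "adding" yet
  print ys           -- forces the elements: the additions happen now
```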
Will Ness identified an error in this post; see the comments and his answer.
Here is what happens. Laziness is non-strictness + memoization (of thunks). We can show this by naming all the interim data that come into existence as the expression is forced:
λ> let x = 1 : map (+1) x
>>> x = a1 : x1 -- naming the subexpressions
a1 = 1
x1 = map (+1) x
λ> take 5 x
==> take 5 (a1:x1) -- definition of x
==> a1:take 4 x1 -- definition of take
>>> x1 = map (1+) (1:x1) -- definition of x
= (1+) 1 : map (1+) x1 -- definition of map
= a2 : x2 -- naming the subexpressions
a2 = (1+) 1
x2 = map (1+) x1
==> a1:take 4 (a2:x2) -- definition of x1
==> a1:a2:take 3 x2 -- definition of take
>>> x2 = map (1+) (a2:x2) -- definition of x1
= (1+) a2 : map (1+) x2 -- definition of map
= a3 : x3 -- naming the subexpressions
a3 = (1+) a2
x3 = map (1+) x2
==> a1:a2:take 3 (a3:x3) -- definition of x2
==> a1:a2:a3:take 2 x3 -- definition of take
>>> x3 = map (1+) (a3:x3) -- definition of x2
.....
The elements in the resulting stream a1:a2:a3:a4:... each refer to their predecessor: a1 = 1; a2 = (1+) a1; a3 = (1+) a2; a4 = (1+) a3; ... Thus it is equivalent to x = iterate (1+) 1. Without the sharing of data and its reuse through back-reference (enabled by the memoization of storage), it would be equivalent to x = [sum $ replicate n 1 | n <- [1..]], which is a radically less efficient computation (O(n²) instead of O(n)).
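The two readings agree on their values and differ only in cost; a short prefix check (the list names are mine):

```haskell
-- Shared: each element reuses its already-computed predecessor.
shared :: [Int]
shared = 1 : map (+1) shared

-- Non-shared equivalent from the text: the n-th element redoes
-- all n additions from scratch, giving quadratic total work.
nonShared :: [Int]
nonShared = [sum (replicate n 1) | n <- [1 ..]]

main :: IO ()
main = print (take 6 shared == take 6 nonShared)  -- True
```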
We can explicate the sharing vs non-sharing with
fix g = x where x = g x -- sharing fixpoint
x = fix ((1:) . map (1+)) -- corecursive definition
_Y g = g (_Y g) -- non-sharing fixpoint
y = _Y ((1:) . map (1+)) -- recursive definition
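Both fixpoints denote the same stream; only the sharing differs. A quick sanity check (reproducing the definitions above, with type signatures added by me):

```haskell
fix :: (a -> a) -> a
fix g = x where x = g x    -- sharing: g is fed one shared node, its own result

_Y :: (a -> a) -> a
_Y g = g (_Y g)            -- non-sharing: every unfolding is a fresh copy

main :: IO ()
main = do
  let x = fix ((1:) . map (1+))   -- corecursive definition
      y = _Y  ((1:) . map (1+))   -- recursive definition
  print (take 10 x == take 10 y)  -- True: same values, very different cost
```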
Trying to print out y at GHCi's prompt shows a marked slowdown as the progression goes along. There's no slowdown when printing out the x stream.
(see also https://stackoverflow.com/a/20978114/849891 for a similar example).