Thanks to lazy evaluation, Haskell can process the following expression in a moment.
head.head $ let m = 1000000000000000 in map (\n -> [m*n .. m*(n+1)]) [1 .. m]
But I noticed the following expression doesn't finish, yet its memory usage stays low.
last.last $ let m = 1000000000000000 in map (\n -> [m*n .. m*(n+1)]) [1 .. m]
It is not surprising that Haskell can do that, but I wonder how it works. More precisely, how does Haskell allocate memory? Is there any way to inspect the memory layout or something like that?
Let's simplify this example a bit to see what happens. You can mostly figure it out just by thinking about lazy evaluation and graph reduction, without needing to go any lower-level than that. Let's look at a simplified reduction of ourLast (mkList 3) with this code:
ourLast :: [a] -> a
ourLast [] = error "ourLast []"
ourLast (x:[]) = x
ourLast (_:xs) = ourLast xs
mkList :: Int -> [Int]
mkList 0 = []
mkList n = let rest = mkList (n-1) in n : rest
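As a quick sanity check before stepping through the reduction, here is a runnable wrapper around these definitions (the main action and the large-list check are my additions, not part of the original answer):

```haskell
ourLast :: [a] -> a
ourLast [] = error "ourLast []"
ourLast (x:[]) = x
ourLast (_:xs) = ourLast xs

mkList :: Int -> [Int]
mkList 0 = []
mkList n = let rest = mkList (n-1) in n : rest

main :: IO ()
main = do
  print (mkList 3)            -- [3,2,1]
  print (ourLast (mkList 3))  -- 1
  -- The same traversal at a much larger size still runs in constant
  -- space, because cells are built and discarded on demand:
  print (ourLast (mkList 10000000))
```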
?foo means "a value we haven't looked at yet". We create these with "let".
foo@bar means "a value we've already computed, which we've figured out is bar".
When we examine ?foo, it becomes foo@bar.
foo := bar means "we've now figured out that foo is bar".
-- We start here by creating some value and giving it to ourLast to
-- look at.
let ?list3 = mkList 3
ourLast ?list3
-- At this point ourLast examines its argument to figure out whether
-- it's of the form (_:_) or []
-- mkList sees that 3 /= 0, so it can pick the second case, and it
-- computes the value for ?list3.
-- (I'll skip the arithmetic reductions and other boring things.)
let ?list2 = mkList 2
list3 := 3 : ?list2 -- we don't need to compute ?list2 yet, so
-- (mkList 3) is done and we can go back to ourLast
ourLast list3@(3 : ?list2)
-- ourLast needs to examine ?list2 to find out whether it's [] or not,
-- so mkList does the same thing again
let ?list1 = mkList 1
list2 := 2 : ?list1
ourLast list3@(3 : list2@(2 : ?list1))
-- Now ourLast has enough information to continue;
-- ourLast (_ : xs@(_ : _)) = ourLast xs
-- Notice how we don't need to compute list2 a second time; we save its
-- value the first time we compute it. This is what lazy evaluation is.
ourLast list2@(2 : ?list1)
-- at this point, there are no references to `list3` anywhere, so it
-- can be GCed.
-- Repeat (ourLast examines ?list1, mkList sees that 1 /= 0).
let ?list0 = mkList 0
list1 := 1 : ?list0
ourLast list2@(2 : list1@(1 : ?list0))
ourLast list1@(1 : ?list0)
-- Can GC list2.
-- Now mkList is being called with 0, so it just returns an empty list.
list0 := []
ourLast list1@(1 : list0@[])
1
-- We're done! (And we can GC list1.)
Notice how at any given time we only need to have a couple of thunks actually allocated, and the rest either aren't computed yet or can be GCed. When we evaluate ourLast list3, evaluation jumps back and forth between ourLast and mkList (sort of like coroutines).
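One way to watch this coroutine-like interleaving directly is to instrument mkList with trace from Debug.Trace (this instrumented variant is my sketch, not part of the original answer):

```haskell
import Debug.Trace (trace)

ourLast :: [a] -> a
ourLast [] = error "ourLast []"
ourLast (x:[]) = x
ourLast (_:xs) = ourLast xs

-- Each cons cell announces (on stderr) the moment it is actually built.
mkList :: Int -> [Int]
mkList 0 = trace "mkList 0" []
mkList n = trace ("mkList " ++ show n) (n : mkList (n - 1))

main :: IO ()
main = print (ourLast (mkList 3))
```

Running this prints "mkList 3", "mkList 2", "mkList 1", "mkList 0" one at a time, interleaved with ourLast's pattern matches: each trace fires only when ourLast demands the next cell, not when mkList 3 is first mentioned.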
If you want to get a more precise idea of how Haskell compilers tend to work, on the level of "when and how does allocation happen", the following are helpful:
A general understanding of how lazy evaluation works, just from the perspective of graph reduction -- e.g. this article -- is useful.
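For the "is there any way to check" part of the question, GHC's runtime statistics give a quick, coarse view of memory behaviour (the file name Last.hs is hypothetical; -s and the "maximum residency" line are standard GHC RTS output):

```shell
ghc -O2 Last.hs
./Last +RTS -s
# Look at the "maximum residency" line in the statistics: it stays
# small even for very long lists, confirming the constant-space
# behaviour described above. For a finer-grained view, compile with
# -prof and use GHC's heap profiling (e.g. +RTS -hy) to break
# residency down by type over time.
```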