Consider this piece of code:
incState state i = map (+ i) state
main =
sequence_ $ do
n <- map last $ scanl incState (replicate 1000000 0) (repeat 1)
return $ print $ n
When I run it, memory usage keeps increasing without bounds, as it prints out numbers. Where is the memory leak here? It should only be keeping around a constant amount of memory, previous values of state
should be discarded whenever it prints something.
Even if sequence_
or the list monad is holding onto values of n
, that is only one integer per iteration. I'm seeing 6Gs of memory being used after it only counts up to 100. A million integers should only be taking up 10Ms at most.
Nothing is forcing the intermediate calculations through the middle of the list. If you replace last
with sum
it stops leaking memory.
To see what's going on, let's consider a slightly simpler statement of the same thing:
main = mapM (print . last) $ scanl incState (replicate 5 0) (repeat 1)
Now, let's start evaluating it. To begin with, the memory for our state just looks like
scanl incState (map (+1) (replicate 5 0)) (repeat 1)
We'll skip some of the details, but the first demand we have is that mapM
pattern matches against the start of the list of states.
(:) -> replicate 5 0
| ^
| |_________________
v |
scanl incState (map (+1) | ) (repeat 1)
To get the last
item to print
, we force the spine of the list and the last item, but not the individual items. We'll see this better in the next step.
0 : 0 : 0 : 0 : 0 : []
^
|_________________
|
scanl incState (map (+1) | ) (repeat 1)
For the next step of mapM
, we need the next item from the list of states. The memory now looks something like
0 : 0 : 0 : 0 : 0 : []
^
|________
|
(:) -> map (+1) |
| ^
| |_________________
v |
scanl incState (map (+1) | ) (repeat 1)
To get the last
item to print
, we force the spine of the list and the last item, but not the other individual items.
0 : 0 : 0 : 0 : 0 : []
^ ^ ^ ^
| | | |
|+1 : |+1 : |+1 : |+1 : 1 : []
^
|_________________
|
scanl incState (map (+1) | ) (repeat 1)
For the next step of mapM
, we need the next item from the list of states. The memory now looks something like
0 : 0 : 0 : 0 : 0 : []
^ ^ ^ ^
| | | |
|+1 : |+1 : |+1 : |+1 : 1 : []
^
|________
|
(:) -> map (+1) |
| ^
| |_________________
v |
scanl incState (map (+1) | ) (repeat 1)
To get the last
item to print
, we force the spine of the list and the last item, but not the other individual items.
0 : 0 : 0 : 0 : 0 : []
^ ^ ^ ^
| | | |
|+1 : |+1 : |+1 : |+1 : 1 : []
^ ^ ^ ^
| | | |
|+1 : |+1 : |+1 : |+1 : 2 : []
^
|_________________
|
scanl incState (map (+1) | ) (repeat 1)
Each time we repeat this, we leave behind another list of thunks, referring to the previous list of thunks, etc. That's where the space leak is coming from. If you force all of the results after each step, like print . sum
would do, there will not be a space leak.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With