This simple program runs in constant memory space when compiled with no flags with ghc:
import Data.List
f x = x*x
g a = foldl' (+) (f a) [1..(1073741824-1)]
main = do putStrLn $ show $ foldl' (+) 0 $ map g [0,1]
When compiled with ghc -O2 the memory usage exceeds the system resources (8GB).
Changing main to:
main = do putStrLn $ show $ foldl' (+) 0 [g 0, g 1]
alleviates the problem so it appears to be something to do with the map.
Can anyone explain the behaviour and perhaps how to work around it?
GHC version is: Glasgow Haskell Compiler, Version 7.4.1, stage 2 booted by GHC version 7.4.1
This is the full laziness "optimization" biting you. When it works right, it can provide an asymptotic improvement in run time. When it works wrong... This happens.
The full laziness transform moves constants out of bindings to the enclosing scope. In this case, it's picking up on the [1..(1073741824-1)]
constant, and moving it to the enclosing scope. However, that's.. completely wrong. It causes it to be shared between the two calls, meaning that it can't stream efficiently the first time.
There is no reliable way to defeat the full laziness transformation, except for compiling without -O2.
The reason was explained by Carl really good and I don't think I can help you there any more.
But I can show you how to work around this.
First your g
really just does sum up to 1073741823 and add f a
.
There is a simple formula to sum up the numbers from 1 to n
without that many additions (Gauss is said to have found it in primary-school):
sumUp :: (Integral a) => a -> a
sumUp n = n * (n+1) `div` 2
with this you can write
f x = x*x
g a = sumUp (1073741824-1) + f a
main = do putStrLn $ show $ foldl' (+) 0 $ map g [0,1]
You can hava a look at the link to see intuitively why this holds or try to find the proof yourself - it's really easy using induction :D
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With