Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does (sum $ takeWhile (<10000000) [1..]) use so much memory?

Tags:

haskell

This is what I get in the console:

ghci> sum $ takeWhile (<10000000) [1..]
49999995000000
(11.96 secs, 2174569400 bytes)

That's over 2GB! I would imagine that sum can discard whatever it has already summed. How would you write this?

like image 439
Gunchars Avatar asked Jan 12 '13 23:01

Gunchars


1 Answers

You're creating ten million Integers, and a lot of list cells. Also, you're running interpreted code, if you ran it through the compiler, that would reduce the allocation somewhat.

The main problem is that the interpreter doesn't optimise at all, so the sum uses the lazy variant that builds a huge thunk. sum discards the part of the list it has consumed just fine, but it replaces it with a thunk to compute the result afterward, so

sum [1,2,3,4 ...]

becomes

(...((((0 + 1) + 2) + 3) + 4) + ...)

afterward. That's not the optimal substitution, since addition of Integers is strict.

At the ghci prompt, you should write

Prelude Data.List> foldl' (+) 0 $ takeWhile (< 10000000) [1 .. ]
49999995000000
(1.41 secs, 1443355832 bytes)

to fix that. In a compiled (with optimisations, of course) programme, sum will work fine.

like image 166
Daniel Fischer Avatar answered Dec 04 '22 18:12

Daniel Fischer