Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does memoized function consume so much memory in Haskell?

Tags:

haskell

I wrote a piece of Haskell code to calculate the length of a Collatz chain. Given a number n, the next number in the sequence is n/2 if n is even or 3*n+1 if n is odd. The sequence ends when it has converged to 1. I wanted to find the length of the longest chain when starting from any number below some input number.

I tried implementing the length calculation with a memoized function since I expected to need the length of chains starting from some numbers. Thus, the length of the chain starting from 726 would be just 1 + the length of the chain starting from 363, which would already have been calculated. My code is shown below.

collatz :: Int -> Int
collatz n
    | even n = n `div` 2
    | otherwise = 3 * n + 1

collatzLength :: Int -> Int
collatzLength = (fmap len [0 ..] !!)
    where len 0 = 0
          len 1 = 1
          len n = 1 + (collatzLength . collatz $ n)

maxLengthBelow :: Int -> Int
maxLengthBelow = foldl1 max . fmap collatzLength . enumFromTo 1

main :: IO()
main = print $ maxLengthBelow 10000

This code works, but takes a huge amount of memory. When profiling it, running main with the input of 10000, len is only called 21664 times, as expected, but the program takes 16 seconds and 4.5Gb of memory! What is taking up all that memory? I would have expected the memoized function to produce a fast, low-memory solution.

like image 881
mentoc3000 Avatar asked May 14 '19 13:05

mentoc3000


People also ask

Does Haskell do memoization?

Memoization is an optimization technique used to speed up a function by caching its previously computed results. In impure programming languages, a mutable map is used to cache computed results.

Does Haskell automatically Memoize?

Most Haskell compilers and interpreters (all of them that I know of actually) do not memoize polymorphic structures, so m2's internal list is recreated every time it's called, where m1's is not.

What is a Memoized function?

In programming, memoization is an optimization technique that makes applications more efficient and hence faster. It does this by storing computation results in cache, and retrieving that same information from the cache the next time it's needed instead of computing it again.

Is memoization a cache?

Is memoization same as caching? Yes, kind of. Memoization is actually a specific type of caching. While caching can refer in general to any storing technique (like HTTP caching) for future use, memoizing specifically involves caching the return values of a function .


1 Answers

One of the things that makes the Collatz sequence so fun is that there are some small starting seeds that take you way, way out into the atmosphere on their way to 1. In particular, 9663 makes it all the way out to 27114424 before it collapses -- and that's a long memoization list to build!

And, for what it's worth, I expect your memoization list to use three machine words per element: one for the I# constructor on the Int, one for the contained number, and one for the (:) constructor. Let's ask how much space it would take to store 27114424 elements, then:

> 27114424 * (64*3) / 1024 {-Kb-} / 1024 {-Mb-} / 1024 {-Gb-}
4.8484368324279785

So 4.5Gb sounds about right, perhaps even a little low.

like image 153
Daniel Wagner Avatar answered Oct 04 '22 01:10

Daniel Wagner