Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Haskell Space Overflow

Tags:

haskell

ghc

I've compiled this program and am trying to run it.

import Data.List
import Data.Ord
import qualified Data.MemoCombinators as Memo

collatzLength :: Int -> Int
collatzLength = Memo.arrayRange (1, 1000000) collatzLength'
  where
    collatzLength' 1 = 1
    collatzLength' n | odd n  = 1 + collatzLength (3 * n + 1)
                     | even n = 1 + collatzLength (n `quot` 2)

main = print $ maximumBy (comparing fst) $ [(collatzLength n, n) | n <- [1..1000000]]

I'm getting the following from GHC

Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize -RTS' to increase it.

I assume this is one of the "space overflow" things I've been hearing about. (I'm pretty new to Haskell.) How do I fix it? Do I have to rewrite collatzLength to be tail recursive?

like image 321
afkbowflexin Avatar asked Aug 21 '11 19:08

afkbowflexin


3 Answers

As the author of the code in question, I am now slightly embarrassed because it has not one but two possible stack overflow bugs.

  1. It uses Int. On a 32-bit system this will overflow, as the Collatz sequence can go quite a bit higher than the starting number. This overflow can cause infinite recursion as the function jumps back and forth between negative and positive values.

    In the case of numbers between one and a million, the worst starting point is 704511, which goes as high as 56,991,483,520 before coming back down towards 1. This is well outside the 32-bit range.

  2. It uses maximumBy. This function is not strict, so it will cause a stack overflow when used on long lists. One million elements is more than enough for this to happen with the default stack size. It still works with optimizations enabled, though, due to the strictness analysis performed by GHC.

    The solution is to use a strict version. Since none is available in the standard libraries, we can use the strict left fold ourselves.

Here is an updated version which should (hopefully) be stack overflow-free, even without optimizations.

import Data.List
import Data.Ord
import qualified Data.MemoCombinators as Memo

collatzLength :: Integer -> Integer
collatzLength = Memo.arrayRange (1,1000000) collatzLength'
  where
    collatzLength' 1 = 1
    collatzLength' n | odd n  = 1 + collatzLength (3 * n + 1)
                     | even n = 1 + collatzLength (n `quot` 2)

main = print $ foldl1' max $ [(collatzLength n, n) | n <- [1..1000000]]
like image 188
hammar Avatar answered Sep 23 '22 13:09

hammar


Here's a shorter program that fails in the same way:

main = print (maximum [0..1000000])

Yep.

$ ghc --make harmless.hs && ./harmless
[1 of 1] Compiling Main             ( harmless.hs, harmless.o )
Linking harmless ...
Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize -RTS' to increase it.

With -O2 it works. What do I make of it? I don't know :( These space mysteries are a serious gotcha.

Edit:

Thx to hammar for pointing out the culprit.

Changing your program to use

maximum' = foldl1' max

Makes it work without -O2. The implementation of Prelude's maximum is lazy and so doesn't quite work for long lists without compiler magic dust.

like image 43
yairchu Avatar answered Sep 22 '22 13:09

yairchu


I think it's most likely that you're hitting integer overflow with some of the Collatz sequences, and then ending up in an "artificial" cycle that contains overflows but never hits 1. That would produce an infinite recursion.

Remember that some Collatz sequences get very much larger than their starting number before they finally (?) end up at 1.

Try to see if it fixes your problem to use Integer instead of Int.

like image 36
hmakholm left over Monica Avatar answered Sep 25 '22 13:09

hmakholm left over Monica