I've compiled this program and am trying to run it.
import Data.List
import Data.Ord
import qualified Data.MemoCombinators as Memo
collatzLength :: Int -> Int
collatzLength = Memo.arrayRange (1, 1000000) collatzLength'
where
collatzLength' 1 = 1
collatzLength' n | odd n = 1 + collatzLength (3 * n + 1)
| even n = 1 + collatzLength (n `quot` 2)
main = print $ maximumBy (comparing fst) $ [(collatzLength n, n) | n <- [1..1000000]]
I'm getting the following from GHC
Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize -RTS' to increase it.
I assume this is one of the "space overflow" things I've been hearing about. (I'm pretty new to Haskell.) How do I fix it? Do I have to rewrite collatzLength to be tail recursive?
As the author of the code in question, I am now slightly embarrassed because it has not one but two possible stack overflow bugs.
It uses Int
. On a 32-bit system this will overflow, as the Collatz sequence can go quite a bit higher than the starting number. This overflow can cause infinite recursion as the function jumps back and forth between negative and positive values.
In the case of numbers between one and a million, the worst starting point is 704511, which goes as high as 56,991,483,520 before coming back down towards 1. This is well outside the 32-bit range.
It uses maximumBy
. This function is not strict, so it will cause a stack overflow when used on long lists. One million elements is more than enough for this to happen with the default stack size. It still works with optimizations enabled, though, due to the strictness analysis performed by GHC.
The solution is to use a strict version. Since none is available in the standard libraries, we can use the strict left fold ourselves.
Here is an updated version which should (hopefully) be stack overflow-free, even without optimizations.
import Data.List
import Data.Ord
import qualified Data.MemoCombinators as Memo
collatzLength :: Integer -> Integer
collatzLength = Memo.arrayRange (1,1000000) collatzLength'
where
collatzLength' 1 = 1
collatzLength' n | odd n = 1 + collatzLength (3 * n + 1)
| even n = 1 + collatzLength (n `quot` 2)
main = print $ foldl1' max $ [(collatzLength n, n) | n <- [1..1000000]]
Here's a shorter program that fails in the same way:
main = print (maximum [0..1000000])
Yep.
$ ghc --make harmless.hs && ./harmless
[1 of 1] Compiling Main ( harmless.hs, harmless.o )
Linking harmless ...
Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize -RTS' to increase it.
With -O2
it works. What do I make of it? I don't know :( These space mysteries are a serious gotcha.
Edit:
Thx to hammar for pointing out the culprit.
Changing your program to use
maximum' = foldl1' max
Makes it work without -O2
. The implementation of Prelude
's maximum
is lazy and so doesn't quite work for long lists without compiler magic dust.
I think it's most likely that you're hitting integer overflow with some of the Collatz sequences, and then ending up in an "artificial" cycle that contains overflows but never hits 1. That would produce an infinite recursion.
Remember that some Collatz sequences get very much larger than their starting number before they finally (?) end up at 1.
Try to see if it fixes your problem to use Integer
instead of Int
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With