Hi haskell fellows. I'm currently working on the 23rd problem of Project Euler. Where I'm at atm is that my code seems right to me - not in the "good algorithm" meaning, but in the "should work" meaning - but produces a Stack memory overflow.
I do know that my algorithm isn't perfect (in particular I could certainly avoid computing such a big intermediate result at each recursion step in my worker
function).
Though, being in the process of learning Haskell, I'd like to understand why this code fails so miserably, in order to avoid this kind of mistakes next time.
Any insight on why this program is wrong will be appreciated.
import qualified Data.List as Set ((\\))
main = print $ sum $ worker abundants [1..28123]
-- Limited list of abundant numbers
abundants :: [Int]
abundants = filter (\x -> (sum (divisors x)) - x > x) [1..28123]
-- Given a positive number, returns its divisors unordered.
divisors :: Int -> [Int]
divisors x | x > 0 = [1..squareRoot x] >>=
(\y -> if mod x y == 0
then let d = div x y in
if y == d
then [y]
else [y, d]
else [])
| otherwise = []
worker :: [Int] -> [Int] -> [Int]
worker (a:[]) prev = prev Set.\\ [a + a]
worker (a:as) prev = worker as (prev Set.\\ (map ((+) a) (a:as)))
-- http://www.haskell.org/haskellwiki/Generic_number_type#squareRoot
(^!) :: Num a => a -> Int -> a
(^!) x n = x^n
squareRoot :: Int -> Int
squareRoot 0 = 0
squareRoot 1 = 1
squareRoot n =
let twopows = iterate (^!2) 2
(lowerRoot, lowerN) =
last $ takeWhile ((n>=) . snd) $ zip (1:twopows) twopows
newtonStep x = div (x + div n x) 2
iters = iterate newtonStep (squareRoot (div n lowerN) * lowerRoot)
isRoot r = r^!2 <= n && n < (r+1)^!2
in head $ dropWhile (not . isRoot) iters
Edit: the exact error is Stack space overflow: current size 8388608 bytes.
. Increasing the stack memory limit through +RTS -K...
doesn't solve the problem.
Edit2: about the sqrt thing, I just copy pasted it from the link in comments. To avoid having to cast Integer to Doubles and face the rounding problems etc...
In the future, it's polite to attempt a bit of minimalization on your own. For example, with a bit of playing, I was able to discover that the following program also stack-overflows (with an 8M stack):
main = print (worker [1..1000] [1..1000])
...which really nails down just what function is screwing you over. Let's take a look at worker
:
worker (a:[]) prev = prev Set.\\ [a + a]
worker (a:as) prev = worker as (prev Set.\\ (map ((+) a) (a:as)))
Even on my first read, this function was red-flagged in my mind, because it's tail-recursive. Tail recursion in Haskell is generally not such a great idea as it is in other languages; guarded recursion (where you produce at least one constructor before recursing, or recurse some small number of times before producing a constructor) is generally better for lazy evaluation. And in fact, here, what's happening is that each recursive call to worker
is building a deeper- and deeper-ly nested thunk in the prev
argument. When the time comes to finally return prev
, we have to go very deeply into a long chain of Set.\\
calls to work out just what it was we finally have.
This problem is obfuscated slightly by the fact that the obvious strictness annotation doesn't help. Let's massage worker
until it works. The first observation is that the first clause is completely subsumed by the second one. This is stylistic; it shouldn't affect the behavior (except on empty lists).
worker [] prev = prev
worker (a:as) prev = worker as (prev Set.\\ map (a+) (a:as))
Now, the obvious strictness annotation:
worker [] prev = prev
worker (a:as) prev = prev `seq` worker as (prev Set.\\ map (a+) (a:as))
I was surprised to discover that this still stack overflows! The sneaky thing is that seq
on lists only evaluates far enough to learn whether the list matches either []
or _:_
. The following does not stack overflow:
import Control.DeepSeq
worker [] prev = prev
worker (a:as) prev = prev `deepseq` worker as (prev Set.\\ map (a+) (a:as))
I didn't plug this final version back into the original code, but it at least works with the minimized main
above. By the way, you might like the following implementation idea, which also stack overflows:
import Control.Monad
worker as bs = bs Set.\\ liftM2 (+) as as
but which can be fixed by using Data.Set
instead of Data.List
, and no strictness annotations:
import Control.Monad
import Data.Set as Set
worker as bs = toList (fromList bs Set.\\ fromList (liftM2 (+) as as))
Ok, I loaded it up and gave it a shot. Daniel Wagner's advice is pretty good, probably better than mine. The problem is indeed with the worker function, but I was going to suggest using Data.MemoCombinators to memoize your function instead.
Also, your divisors algorithm is kind of silly. There's a much better way to do that. It's kind of mathy and would require a lot of TeX, so here's a link to a math.stackexchange page about how to do that. The one I was talking about, was the accepted answer, though someone else gives a recursive solution that I think would run faster. (It doesn't require prime factorization.)
https://math.stackexchange.com/questions/22721/is-there-a-formula-to-calculate-the-sum-of-all-proper-divisors-of-a-number
As Daniel Wagner correctly said, the problem is that
worker (a:as) prev = worker as (prev Set.\\ (map ((+) a) (a:as)))
builds a badly nested thunk. You can avoid that and get somewhat better performance than with deepseq
by exploiting the fact that both arguments to worker
are sorted in this application. Thus you can get incremental output by noting that at any step everything in prev
smaller than 2*a
cannot be the sum of two abundant numbers, so
worker (a:as) prev = small ++ worker as (large Set.\\ map (+ a) (a:as))
where
(small,large) = span (< a+a) prev
does better. However, it's still bad because (\\)
cannot use the sortedness of the two lists. If you replace it with
minus xxs@(x:xs) yys@(y:ys)
= case compare x y of
LT -> x : minus xs yys
EQ -> minus xs ys
GT -> minus xxs ys
minus xs _ = xs -- originally forgot the case for one empty list
(or use the data-ordlist package's version), calculating the set-difference is O(length) instead of O(length^2).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With