Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Space leak in list program

I am solving some problems of Project Euler in Haskell. I wrote a program for a riddle in it and it did not work as I expected.

When I looked in the task manager when running the program I saw that it was using > 1 gigabyte of RAM on ghc. A friend of me wrote a program with the same meaning in Java and succeeded in 7 seconds.

import Data.List

opl = find vw $ map (\x-> fromDigits (x++[0,0,9]) ) 
        $ sequence [[1],re,[2],re,[3],re,[4],re,[5],re,[6],re,[7],re,[8],re]

vw x = hh^2 == x
    where hh = (round.sqrt.fromIntegral) x

re = [0..9]

fromDigits x = foldl1 (\n m->10*n+m) x

I know this program would output the number I want given enough RAM and time, but there has to be a better-performing way.

like image 732
Ingdas Avatar asked Jul 06 '10 20:07

Ingdas


People also ask

What is memory leakage program?

A memory leak is a program error that consists of repeatedly allocating memory, using it, and then neglecting to free it.

Can memory leaks damage computer?

Memory leaks don't result in physical or permanent damage. Since it's a software issue, it will slow down the applications or even your whole system. However, a program taking up a lot of RAM space doesn't always mean its memory is leaking somewhere. The program you're using may really need that much space.

How do I find a memory leak?

Running out of memory is the simplest way to identify a memory leak, and it's also the most common approach to uncovering one. That's also the most inconvenient way to find a leak. You'll probably notice your system slowing down before you run out of RAM and crash your application.

What is a space leak in Haskell?

A space leak occurs when there exists a point in the computer program where it uses more memory than necessary. Hence, a space leak causes the program to use more space than one would expect. Our primary language of study would be Haskell.


2 Answers

The main problem here is that sequence has a space leak. It is defined like this:

sequence [] = [[]]
sequence (xs:xss) = [ y:ys | y <- xs, ys <- sequence xss ]

so the problem is that the list produced by the recursive call sequence xss is re-used for each of the elements of xs, so it can't be discarded until the end. A version without the space leak is

myseq :: [[a]] -> [[a]]
myseq xs = go (reverse xs) []
 where
  go [] acc = [acc]
  go (xs:xss) acc = concat [ go xss (x:acc) | x <- xs ]

PS. the answer seems to be Just 1229314359627783009

Edit version avoiding the concat:

seqlists :: [[a]] -> [[a]]
seqlists xss = go (reverse xss) [] []
 where
   go []       acc rest = acc : rest
   go (xs:xss) acc rest = foldr (\y r -> go xss (y:acc) r) rest xs

note that both of these versions generate the results in a different order from the standard sequence, so while they work for this problem we can't use one as a specialised version of sequence.

like image 113
Simon Marlow Avatar answered Sep 23 '22 05:09

Simon Marlow


Following on from the answer given by Simon Marlow, here's a version of sequence that avoids the space leak while otherwise working just like the original, including preserving the order.

It still uses the nice, simple list comprehension of the original sequence - the only difference is that a fake data dependency is introduced that prevents the recursive call from being shared.

sequenceDummy d [] = d `seq` [[]]
sequenceDummy _ (xs:xss) = [ y:ys | y <- xs, ys <- sequenceDummy (Just y) xss ]

sequenceUnshared = sequenceDummy Nothing

I think this is a better way of avoiding the sharing that leads to the space leak.

I'd blame the excessive sharing on the "full laziness" transformation. Normally this does a great job of creating sharing that avoids recomputions, but sometimes recompution is very much more efficient than storing shared results.

It'd be nice if there was a more direct way to tell the compiler not to share a specific expression - the above dummy Maybe argument works and is efficient, but it's basically a hack that's just complicated enough that ghc can't tell that there's no real dependency. (In a strict language you don't have these issues because you only have sharing where you explicitly bind a variable to a value.)

like image 29
RD1 Avatar answered Sep 22 '22 05:09

RD1