Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

haskell running out of memory with finite lists

Tags:

memory

haskell

I run out of memory trying to run moderate inputs such as this:

variation_models 15 25

also running higher numbers for ncars seems to make a huge difference in speed and memory usage.

The slowdown is expected (there are more things to compare), but the exponential increase of memory usage doesn't make sense to me

import Control.Monad

orderedq f [] = True
orderedq f (x:[]) = True
orderedq f (x:y:zs) = f x y && orderedq f (y:zs)

num_orderedq = orderedq (<=)

adds_up_to n xs = n == sum xs

both_conditions f g xs = f xs && g xs

variation_models ncars nlocations =
  filter (both_conditions (adds_up_to nlocations) num_orderedq) $ replicateM ncars [1..nlocations-ncars+1]

What is causing the large difference in memory usage? replicateM?

like image 867
SultanLegend Avatar asked Mar 03 '23 10:03

SultanLegend


1 Answers

I think you've seen elsewhere that your specific problem (creating ordered lists of integers that sum to a given number) is better solved using an alternative algorithm, rather than filtering a huge list of lists of integers.

However, getting back to your original issue, it is possible to construct an equivalent of:

replicateM p [1..n]

that runs in exponential time (of course) but constant space.

The problem is that this expression is more or less equivalent to the recursion:

badPower 0 _ = pure []
badPower p n = [x:xs | x <- [1..n], xs <- badPower (p-1) n]

So, in the list comprehension, for each selected x, the whole list badPower (p-1) n needs to be re-generated from the start. GHC, sensibly enough, decides to keep badPower (p-1) n around so it doesn't need to be recomputed each time. So, the badPower p n call needs the entire badPower (p-1) n list kept in memory, which already accounts for n^(p-1) elements and exponential memory use, even without considering badPower (p-2) n, etc.

If you just flip the order of the implicit loops around:

goodPower 0 _ = pure []
goodPower p n = [x:xs | xs <- goodPower (p-1) n, x <- [1..n]]

That fixes the problem. Even though the list goodPower (p-1) n is "big", we take it's first element, use it n times for each value of x and then can discard it and move to the next element. So, goodPower (p-1) n can be garbage collected as it's used.

Note that goodPower generates the elements in a different order than badPower, with the first coordinate of the lists varying fastest, instead of the last. (If this matters, you can map reverse $ goodPower .... While reverse is "slow", it's only being applied to short lists here.)

Anyway, the following program runs (practically) forever, but does so in constant space:

power :: Int -> [a] -> [[a]]
power 0 _ = [[]]
power p lst = [x:xs | xs <- power (p-1) lst, x <- lst ]

main = do
  print $ length (power 15 [1..11])
like image 65
K. A. Buhr Avatar answered Mar 05 '23 16:03

K. A. Buhr