Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Infinite list in haskell combined with fold* doesn't calculate

Let's say I want to get a sorted infinite list of all primepowers up to exponent n.

I have got a function to merge two sorted lists and a function that gives me primes.

merge :: Ord t => [t] -> [t] -> [t]
merge (x:xs) (y:ys)
    | (x <= y) = x : merge xs (y:ys)
    | otherwise = y : merge (x:xs) ys
merge xs [] = xs
merge [] ys = ys

primes :: [Integer]
primes = sieve [2..]
    where
        sieve [] = []
        sieve (p:xs) = p : sieve (filter (\x -> x `mod` p /= 0) xs)

I have two versions of the listOfPrimepowers function:

primepowers :: Integer -> [Integer]
primepowers n = foldr (merge) [] (listOfPrimepowers n)    

-- terminating
listOfPrimepowers' n = map(\x -> (map(\y -> y ^ x) primes)) [1..n]

-- non terminating    
listOfPrimepowers'' n = map(\x -> (map(\y -> x ^ y) [1..n])) primes

One delivers the correct result and the other one doesn't. The only difference is, that the first version maps the primepowers in a way like [[2,3,5,7, ...],[4,9,25,...]] and the second version maps the primepowers like [[2,4,8],[3,9,27],[5,25,125], ...]. You see, the infinity is at another level in the list.

Do you have an explanation why the 2nd function doesn't produce any output?

like image 215
Benji Wa Avatar asked Apr 04 '14 08:04

Benji Wa


3 Answers

It is caused by foldr merge having to look through all the lists to find the minimal element at the head of one of the lists. If foldr merge is given a infinite list of finite lists, foldr merge cannot ever compute the first element of the list - it keeps looking for the minimal element of the rest of the list of lists, before it can compare it to the first element of the first list - 2. On the other hand, if foldr merge is given a finite list of infinite lists, foldr merge can determine the first element of the merged list, and moves on to the next one. That way you can produce an arbitrary number of elements in the first case, but not a single one in the second case.


Let's expand foldr merge []:

foldr merge [] (xs0:xs1:xs2:...:[]) =
merge xs0 (merge xs1 (merge xs2 (merge xs3 ... [])))

Clearly, if (xs0:xs1:xs2:...:[]) is infinite, the "nested" calls to merge will form an infinite chain. But what about haskell being "lazy"? primes is also defined in terms of itself, yet it produces output? Well, there is actually a rule of foldr: it can produce output for infinite lists only if the function passed to foldr is not strict in its second argument - i.e. sometimes it can produce output without having to evaluate the result of foldr for the rest of the list.

The merge pattern-matches the second argument - that alone can cause non-termination - and uses a strict function <=, so merge is strict in the second argument, and the infinite chain will have to evaluate the first element of every list before the top level merge can produce any result.

Because merge is strict, and because merge is associative, you can - and should - use foldl' instead of foldr to merge the finite list of infinite lists.

like image 119
Sassa NF Avatar answered Oct 05 '22 18:10

Sassa NF


You can make both variants work, fairly easily. The technique involved is worth knowing.

Your code is equivalent to

primepowers n = 
  foldr merge [] 
      -- terminating:
 --   [[p^k | p <- primes]  | k <- [1..n]]  -- n infinite lists

      -- non terminating:
      [[p^k | k <- [1..n]]  | p <- primes]  -- infinite list of n-length lists

"Naturally, taking the minimum of an infinite list is non-terminating" is the truth, and only the truth, but it is not the whole truth, here.

Here, the infinite list primes is already sorted in increasing order of its elements. So the heads of the lists [p^k | k <- [1..n]] are sorted and increasing, and the lists themselves are sorted and increasing too. Based on this knowledge alone, we can right away produce the head element of the merged streams — the minimum of the infinite list of heads of all these lists — in O(1) time, without actually inspecting any of them, except the very first one (which is the answer, itself).

Your problem is thus solved by using the following function instead of merge, in the foldr expression:

imerge (x:xs) ys = x : merge xs ys

(as seen in the code by Richard Bird in the article by Melissa O'Neill). Now both variants will just work. imerge is non-strict in its 2nd argument and is naturally used with foldr, even with finite lists.

Using foldl instead of foldr is plain wrong with the 2nd variant, because foldl on an infinite list is non-terminating.

Using foldl with the 1st variant, though possible (since the list itself is finite), is suboptimal here: it will place the more frequently-producing streams at the bottom of the overall chain of merges, i.e. it will run slower than the one with foldr.

see also: folds on wikipedia.

like image 43
Will Ness Avatar answered Oct 05 '22 18:10

Will Ness


The second version does not give you any output because the input is an infinite list of lists. If you think about it, foldr merge [] creates a sorted list from a list of lists, therefore the head element of the resulting list will be the minimum of all the head elements of the lists. Naturally, taking the minimum of an infinite list is non-terminating, so the function doesn't even get to the point when the first element of the result becomes available.

like image 21
András Kovács Avatar answered Oct 05 '22 19:10

András Kovács