Let's say I want to get a sorted infinite list of all primepowers up to exponent n
.
I have got a function to merge two sorted lists and a function that gives me primes.
merge :: Ord t => [t] -> [t] -> [t]
merge (x:xs) (y:ys)
| (x <= y) = x : merge xs (y:ys)
| otherwise = y : merge (x:xs) ys
merge xs [] = xs
merge [] ys = ys
primes :: [Integer]
primes = sieve [2..]
where
sieve [] = []
sieve (p:xs) = p : sieve (filter (\x -> x `mod` p /= 0) xs)
I have two versions of the listOfPrimepowers
function:
primepowers :: Integer -> [Integer]
primepowers n = foldr (merge) [] (listOfPrimepowers n)
-- terminating
listOfPrimepowers' n = map(\x -> (map(\y -> y ^ x) primes)) [1..n]
-- non terminating
listOfPrimepowers'' n = map(\x -> (map(\y -> x ^ y) [1..n])) primes
One delivers the correct result and the other one doesn't. The only difference is, that the first version maps the primepowers in a way like [[2,3,5,7, ...],[4,9,25,...]]
and the second version maps the primepowers like [[2,4,8],[3,9,27],[5,25,125], ...]
. You see, the infinity is at another level in the list.
Do you have an explanation why the 2nd function doesn't produce any output?
It is caused by foldr merge
having to look through all the lists to find the minimal element at the head of one of the lists. If foldr merge
is given a infinite list of finite lists, foldr merge
cannot ever compute the first element of the list - it keeps looking for the minimal element of the rest of the list of lists, before it can compare it to the first element of the first list - 2
. On the other hand, if foldr merge
is given a finite list of infinite lists, foldr merge
can determine the first element of the merged list, and moves on to the next one. That way you can produce an arbitrary number of elements in the first case, but not a single one in the second case.
Let's expand foldr merge []
:
foldr merge [] (xs0:xs1:xs2:...:[]) =
merge xs0 (merge xs1 (merge xs2 (merge xs3 ... [])))
Clearly, if (xs0:xs1:xs2:...:[])
is infinite, the "nested" calls to merge will form an infinite chain. But what about haskell being "lazy"? primes
is also defined in terms of itself, yet it produces output? Well, there is actually a rule of foldr
: it can produce output for infinite lists only if the function passed to foldr
is not strict in its second argument - i.e. sometimes it can produce output without having to evaluate the result of foldr
for the rest of the list.
The merge
pattern-matches the second argument - that alone can cause non-termination - and uses a strict function <=
, so merge
is strict in the second argument, and the infinite chain will have to evaluate the first element of every list before the top level merge
can produce any result.
Because merge
is strict, and because merge
is associative, you can - and should - use foldl'
instead of foldr
to merge the finite list of infinite lists.
You can make both variants work, fairly easily. The technique involved is worth knowing.
Your code is equivalent to
primepowers n =
foldr merge []
-- terminating:
-- [[p^k | p <- primes] | k <- [1..n]] -- n infinite lists
-- non terminating:
[[p^k | k <- [1..n]] | p <- primes] -- infinite list of n-length lists
"Naturally, taking the minimum of an infinite list is non-terminating" is the truth, and only the truth, but it is not the whole truth, here.
Here, the infinite list primes
is already sorted in increasing order of its elements. So the heads of the lists [p^k | k <- [1..n]]
are sorted and increasing, and the lists themselves are sorted and increasing too. Based on this knowledge alone, we can right away produce the head element of the merged streams — the minimum of the infinite list of heads of all these lists — in O(1) time, without actually inspecting any of them, except the very first one (which is the answer, itself).
Your problem is thus solved by using the following function instead of merge
, in the foldr
expression:
imerge (x:xs) ys = x : merge xs ys
(as seen in the code by Richard Bird in the article by Melissa O'Neill). Now both variants will just work. imerge
is non-strict in its 2nd argument and is naturally used with foldr
, even with finite lists.
Using foldl
instead of foldr
is plain wrong with the 2nd variant, because foldl
on an infinite list is non-terminating.
Using foldl
with the 1st variant, though possible (since the list itself is finite), is suboptimal here: it will place the more frequently-producing streams at the bottom of the overall chain of merges, i.e. it will run slower than the one with foldr
.
see also: folds on wikipedia.
The second version does not give you any output because the input is an infinite list of lists. If you think about it, foldr merge []
creates a sorted list from a list of lists, therefore the head element of the resulting list will be the minimum of all the head elements of the lists. Naturally, taking the minimum of an infinite list is non-terminating, so the function doesn't even get to the point when the first element of the result becomes available.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With