How can I avoid in Haskell?

Question

The program below results in <<loop>> in GHC.

...Obviously. In hindsight.

It happens because walk is computing a fixed point, but there are multiple possible fixed points. When the list comprehension reaches the end of the graph-walk, it "asks" for the next element of answer; but that is exactly what it's already trying to compute. I guess I figured the program would get to the, er, end of the list, and stop.

I have to admit, I'm a bit sentimental about this nice code, and wish I could make it work.

What should I do instead?
How can I predict when "tying the knot" (referring to the value inside the expression that says how to compute the value) is a bad idea?

import Data.Set(Set)
import qualified Data.Set

-- Like `Data.List.nub`, remove duplicate elements from a list,
-- but treat some values as already having been seen.
nub :: Set Integer -> [Integer] -> [Integer]
nub _ [] = []
nub seen (x:xs) =
  if Data.Set.member x seen
  then nub seen xs
  else x : nub (Data.Set.insert x seen) xs

-- A directed graph where the vertices are integers.
successors :: Integer -> [Integer]
successors x = [(x + 2) `mod` 7, (x + 3) `mod` 7]

-- Breadth first search of a directed graph.  Returns a list of every integer
-- reachable from a root set in the `successors` graph.
walk :: [Integer] -> [Integer]
walk roots =
  let rootSet = Data.Set.fromList roots
      answer = roots ++ nub rootSet [y | x <- answer, y <- successors x]
  in answer

main = putStrLn $ show $ walk [0]

Daniel Wagner · Accepted Answer

Here's one idea of how to fix it: well, we need a termination condition, right? So let's keep enough structure to know when we should terminate. Specifically, instead of producing a stream of nodes, we'll produce a stream of frontiers, and stop when the current frontier is empty.

import Data.Set(Set)
import qualified Data.Set as S

-- Like `Data.List.nub`, but for nested lists. Order in inner lists is not
-- preserved. (A variant that does preserve the order is not too hard to write,
-- if that seems important.)
nestedNub :: Set Integer -> [[Integer]] -> [[Integer]]
nestedNub _ [] = []
nestedNub seen (xs_:xss) = S.toList xs : nestedNub (seen `S.union` xs) xss where
  xs = S.fromList xs_ `S.difference` seen

-- A directed graph where the vertices are integers.
successors :: Integer -> [Integer]
successors x = [(x + 2) `mod` 7, (x + 3) `mod` 7]

walk :: [Integer] -> [Integer]
walk roots =
  let answer = nestedNub S.empty
        $ roots
        : [[y | x <- frontier, y <- successors x] | frontier <- answer]
  in concat $ takeWhile (not . null) answer

main = print $ walk [0]

There is almost certainly no general algorithm for knowing when tying the knot is a bad idea -- my gut says that's a halting problem thing, though I admit I didn't try to work out the details!

duplode · Answer

Looking at your code suggests we should be able to retrieve at least the root prefix of answer, as it doesn't depend on the knot-tying. And sure enough:

GHCi> take 1 $ walk [0]
[0]

We can even go some way further:

GHCi> take 7 $ walk [0]
[0,2,3,4,5,6,1]

As soon as we ask for the eight element, though, we get stuck:

GHCi> take 8 $ walk [0]
[0,2,3,4,5,6,1

(Interestingly, trying it in GHCi doesn't seem to trip the <<loop>> detector, unlike with a compiled program.)

That the problem only shows up when going beyond the seventh element of a list of unique modulo 7 integers points at the heart of the matter. Removing nub from your definition gives us a perfectly fine infinite list:

walkWithDuplicates :: [Integer] -> [Integer]
walkWithDuplicates roots =
  let rootSet = Data.Set.fromList roots
      answer = roots ++ [y | x <- answer, y <- successors x]
  in answer

GHCi> (!! 9999) $ walkWithDuplicates [0]
2

Using nub on an infinite list is risky business. If the number of distinct elements in it is finite, at some point there will not be a next element to be produced.

What to do, then? If we know in advance the size of the graph, as in your example, we can merrily cheat:

walkKnownSize :: [Integer] -> [Integer]
walkKnownSize roots =
  let graphSize = 7
      rootSet = Data.Set.fromList roots
      answer = roots ++ nub rootSet [y | x <- answer, y <- successors x]
  in take graphSize answer

GHCi> walkKnownSize [0]
[0,2,3,4,5,6,1]

(Note that specifying the graph size wouldn't at all feel like cheating were we passing your graph to the function as a triple of size, roots and an Int -> Integer -> [Integer] successors function.)

In addition to that and to Daniel Wagner's alternative knot-tying strategy, I feel it is worth putting on the table a solution without knot-tying, for the sake of completeness. The implementation below is an unfold that generates the successive levels of the walk (in the spirit of Li Yao Xia's suggestion). That makes it possible to stop once all elements have been visited:

import Data.List (unfoldr)
-- etc.

walkUnfold :: [Integer] -> [Integer]
walkUnfold roots =
    let rootsSet = Data.Set.fromList roots
        nextLevel (previouslySeen, currentLevel) =
            let seen = foldr Data.Set.insert previouslySeen currentLevel
                candidates = concatMap successors currentLevel
                newlyVisited = nub seen candidates
            in case newlyVisited of
                [] -> Nothing
                _ -> Just (newlyVisited, (seen, newlyVisited))
        levels = roots : unfoldr nextLevel (Data.Set.empty, roots)
    in concat levels

How can I avoid <<loop>> in Haskell?

Tags:

haskell

tying-the-knot

fixed-point-iteration

Jason Orendorff

2 Answers

Daniel Wagner

duplode

Recent Activity

Donate For Us