Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Avoiding duplicates in breadth-first search

For educational purposes I've recently been implementing common algorithms in Haskell. Currently I'm stuck on Breadth-First Search. This is my implementation, with nodes being represented as just integers for simplicity:

import qualified Data.Map as M
import qualified Data.List as L

type Node = Int
type Graph = M.Map Node [Node]

-- Returns list of nodes adjacent to n in graph g
adjacent :: Node -> Graph -> [Node]
adjacent n g = M.findWithDefault [] n g

-- Returns graph g with all instances of n removed
rip :: Node -> Graph -> Graph
rip n g = M.delete n (M.map (L.delete n) g)

bfs :: Node -> Graph -> [Node]
bfs n g = [n] ++ _bfs [n] g

_bfs :: [Node] -> Graph -> [Node]
_bfs (n:ns) g =
    if not (M.null g) then
        let layer = adjacent n g in
            layer ++ _bfs (ns ++ layer) (rip n g)
    else n:ns
_bfs [] g = []

(There are other functions for actually constructing the graph, but I left them out for brevity's sake)

The result of calling bfs would be a correct breadth-first traversal of the graph, if not for the fact that some graphs produce duplicates, such as this one:

The graph

(The result of bfs 1 g for g = this graph is [1,2,3,4,4,5,6,7,7,7])

My current solution boils down to changing the relevant line in _bfs to L.nub $ layer ++ _bfs (ns ++ layer) (rip n g), but that seems incredibly hackish, and I'm not sure if it'll produce a correct breadth-first traversal. Aside from constantly checking n:ns for duplicates before inserting (which sounds horribly inefficient), I have no other ideas.

How can I rewrite _bfs (or more) so that it does not produce duplicates by definition?

like image 775
More Axes Avatar asked Jul 12 '14 13:07

More Axes


People also ask

Which is more efficient BFS or DFS?

DFS is faster than BFS. Time Complexity of BFS = O(V+E) where V is vertices and E is edges. Time Complexity of DFS is also O(V+E) where V is vertices and E is edges.

What is the advantage of DFS over BFS?

BFS can be used to find a single source shortest path in an unweighted graph because, in BFS, we reach a vertex with a minimum number of edges from a source vertex. In DFS, we might traverse through more edges to reach a destination vertex from a source.

How do I speed up breadth first search?

You can speed up BFS from a source to a target by doing a bi-directional search. A bi-directional search is basically doing a BFS from the source and from the target at the same time, one step from each - until the two fronts meet each other.


1 Answers

You should use a set of visited nodes instead of rip.

First, rip takes linear time in the number of remaining edges, which makes the whole breadth-first traversal quadratic.

Second, the no-duplicates traversal is not practical with rip. Currently, duplicate nodes are added because the same nodes can be visited from multiple nodes of the current frontier of traversal. The revisits can't be simply pruned with rip because it removes the node from the graph altogether, but we still need the node in order to continue the traversal.

Here's an example with a visited set in a State monad (which is nice here, since we can build up the traversal frontier by frontier, and filterM from Control.Monad is handy for, well, filtering out visited nodes):

import qualified Data.IntMap.Strict as IM
import qualified Data.IntSet as IS

import Control.Monad
import Control.Monad.State.Strict

type Node = Int
type Graph = IM.IntMap [Node]

bfs :: Node -> Graph -> [Node]
bfs n g = evalState (go [n]) (IS.singleton n) where

    go :: [Node] -> State IS.IntSet [Node]
    go [] = return []
    go ns = do
        ns' <- flip filterM ((g IM.!) =<< ns) $ \n' -> do
            notVisited <- gets (IS.notMember n')
            when notVisited $ modify (IS.insert n')
            return notVisited
        (ns++) `fmap` go ns'

-- your example graph
graph :: Graph
graph = IM.fromList $ [
      (1, [2, 3])
    , (2, [1, 4])
    , (3, [1, 4])
    , (4, [2, 5, 3, 6])
    , (5, [4, 7])
    , (6, [4, 7])
    , (7, [5, 6])]

main = print $ bfs 1 graph -- [1, 2, 3, 4, 5, 6, 7]

Here's is an implementation of the same algorithm without State, instead using foldr to pass along the updated visited set:

bfs' :: Node -> Graph -> [Node]
bfs' start graph = go [start] (IS.singleton start) where

    go :: [Node] -> IS.IntSet -> [Node]
    go [] _       = []
    go ns visited = ns ++ go ns' visited' where

        newNodes = [n' | n <- ns, n' <- graph IM.! n]

        step n (acc, visited) 
            | IS.member n visited = (acc, visited)
            | otherwise = (n:acc, IS.insert n visited) 

        (ns', visited') = foldr step ([], visited) newNodes 
like image 131
András Kovács Avatar answered Sep 28 '22 11:09

András Kovács