In the topological sort algorithm, we perform DFS and push nodes onto a linked list as we encounter them. The only way I can think of to do this functionally would be do pass the list as a parameter into the calls, which is ugly and very inefficient (i.e., it shows up in the big O since copying the list is O(n). How would one do this idiomatically in Haskell?
The fgl paper discusses this exact question as its first example of how to do graph algorithms in a functional style:
dfs :: [Node] -> Graph a b -> [Node] dfs [] g = [] dfs (v:vs) (c &v g) = v:dfs (suc c++vs) g dfs (v:vs) g = dfs vs g
[Editor's note: the notation
c &v g
is introduced earlier in the paper as a kind of "pattern match on graphs":c &v g
matches graphs containing the vertexv
, bindingc
to a context giving edges in and out of the vertex andg
to the graph with that node and all its edges deleted. In the Haskell library, this is realized by the functionmatch
.]The algorithm works as follows. If there are no nodes left to be visited (first case),
dfs
stops without returning any nodes. In contrast, if there are still nodes that must be visited,dfs
tries to locate the context of the first of these nodes (v
) in the argument graph. If this is possible (second equation), which is the case wheneverv
is contained in the argument graph,v
is the next node on the resulting node list, and the search continues on the remaining graphg
with the successors ofv
to be visited before the remaining list of nodesvs
. The fact that the successors are put in front of all other nodes causesdfs
to favor searching in the depth and not in the breadth. Finally, ifv
cannot be matched (last line),dfs
continues the search with the remaining list of nodesvs
. Note that the last case can only occur ifv
is not contained in the graph because otherwise the pattern in the second equation would have matched.
As copy-and-paste wasn't working properly, I had to transcribe this text; any spelling mistakes are almost certainly mine rather than Martin Erwig's.
As you suggest, the list of nodes left to visit is passed in as a parameter. However, your claim that this is inefficient seems off-target to me; the cost of constructing the new list suc c++vs
is O(length (suc c)
) -- but then you will have to visit all those nodes at least once to complete your depth-first search anyway, so that asymptotic cost is unavoidable.
The full paper linked above spends quite some time discussing asymptotics and efficiency, and is also (in my opinion) quite enlightening, so I encourage you to give it a read.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With