Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

An algorithm to see if there are exactly two MSTs in a graph?

I have an undirected connected graph G. I wish to find an algorithm that return true if there are at least 2 MSTs.

What if I want to see if there are exactly 2 MSTs?

like image 594
tom Avatar asked Dec 06 '12 20:12

tom


1 Answers

We can detect both cases efficiently by modifying the Kruskal algorithm. If someone can think of a simpler way to describe all this, please let me know!

Kruskal builds an MST for every permutation of equal-weight edges

The Kruskal algorithm builds an MST by always including the next-smallest edge that connects different components of the forest that has been built so far. The algorithm is correct whenever any such minimal edge is selected, i.e. regardless of how edges having equal weights are ordered.

Kruskal can find every MST

Furthermore, every MST can be produced by choosing some particular way to order every set of equal-weight edges, and then running the Kruskal algorithm. To see this, suppose there was some MST that could not be produced this way. Now subtract some small amount epsilon (smaller than the difference between any pair of unequal edge weights) from the weight of each edge in this MST: this MST is now the unique MST, so Kruskal must produce this MST when run with the new edge weights. But because we only adjusted edges by at most epsilon, when the edges are sorted by weight, the set of all edges having weight w_i - epsilon must appear (in some order) immediately preceding the set of edges having weight w_i (in some order), with no other edges in between the two groups. But this is a valid possible ordering for the original, unmodified edges, and the Kruskal algorithm only cares about the ordering of edges and not their particular weights, so the Kruskal algorithm must have produced the MST from that ordering as well. This contradicts our assumption, so it must be that the Kruskal algorithm can produce every MST.

Component graphs

Call the forest built by the Kruskal algorithm after i >= 0 edge-addition steps F(i), and the edges remaining to be considered and which would not create a cycle R(i). (When an edge is added in step i, we form R(i) by starting with a copy of R(i-1) and deleting the just-added edge and all other edges that joined the same pair of components. Although the Kruskal algorithm actually eliminates these other edges "lazily", defining R(i) this way simplifies proving algorithm properties.) We will break up the Kruskal algorithm into a series of blocks, each of which consists of a sequence of edge additions in which edges of the same weight are added. Call i block-defining if either i = 0, or the minimum-weight edge in R(i) is larger than any edge added in steps 1 .. i.

Suppose that after some block-defining number i >= 0 of edge-addition steps of the Kruskal algorithm have been performed, the smallest edge in R(i) (i.e. the next-smallest edge that would not create a cycle) has weight w. The Kruskal algorithm will proceed to join together all trees having a weight-w edge between them in some way before doing anything else, and even though choosing a different edge ordering for these equal-weight edges can affect what trees are produced, it cannot affect the set of vertices in each tree. Making this more precise:

Define a new, unweighted multigraph (that is, a graph that can have multiple edges between a single pair of vertices) C(i) consisting of a vertex for each component (tree) in the forest F(i). For any vertex v in C(i), call t(v) the tree in F(i) that corresponds to v. Create an edge in C between two vertices u and v whenever a weight-w edge exists in R(i) between some vertex in t(u) and some vertex in t(v). Call C(i) the component graph after i steps.

Lemma: Suppose that for some block-defining number i, C(i) has k components containing at least 1 edge (i.e. components that are not single vertices), and among these components there are m >= 2k vertices in total. Call this set of vertices M. Then regardless of how equal-weight edges have been ordered, after m-k more edge-addition steps of the Kruskal algorithm, the m trees corresponding to the vertices in M will have been merged into k trees, with the jth tree consisting of the union of the trees corresponding to the vertices of the jth component of C(i), plus one or more edges of weight w, for each 1 <= j <= k. In particular, the set of vertices in each of the k resulting trees is unaffected by the particular ordering of the weight-w edges that gave rise to the edges in C(i).

Proof: Every edge (u, v) in C(i) corresponds to a weight-w edge in R(i) that could appear first among all weight-w edges in some permutation of equal-weight edges, and therefore could be chosen next by the Kruskal algorithm. The effect of adding it will be to join together two trees in F(i) into one in F(i+1), and to discard one or more edges from R(i+1). The effect on the component graph will be to merge u and v in C(i) into a single vertex x in C(i+1), delete all other parallel edges between u and v in C(i+1), and change all edges between a third vertex y and either u or v in C(i) into an edge between y and the new vertex x in C(i+1). If an edge in one component of C(i) is chosen next by Kruskal, edges in other components are not affected, so how the edges for different components are interleaved in the permutation has no effect. Therefore we can suppose that all edges for one component are seen first, then all edges for another component, and so on until the kth component. Suppose the first component has s vertices. Each edge added by the Kruskal algorithm reduces the number of vertices of this component by 1 without disconnecting the component. The presence of an edge in the component in C(i+j) indicates the presence of a weight-w edge in R(i+j) that connects two distinct trees in F(i+j), so the Kruskal algorithm will continue to choose edges that shrink this component until it becomes a single vertex in C(i+s-1). Regardless of which edges are chosen at each step, the vertices in the corresponding tree in F(i+s-1) will consist of the union of all vertices from the s trees in F(i). This can be repeated for the remaining components. If there are m vertices across k components in all, then m-k steps are required to shrink each component to a single vertex.

Counting MSTs

Theorem: The number of MSTs is the product of the number of spanning forests in the multigraph C(i) for each block-defining i.

Proof: As established in the lemma, every forest that can be produced by running Kruskal on some permutation of the edges in C(i) is identical with respect to the sets of vertices in each resulting tree component in F(i+m-k). Each spanning tree of an s-vertex component of C(i) represents a distinct set of s-1 edges that can be selected by the Kruskal algorithm to produce a distinct underlying MST that contains the corresponding set of s trees in F(i). A spanning forest is a combination of spanning trees, one for each component, so the number of spanning forests is the product of the number of spanning trees for each contained tree. Call the number of spanning forests in C(i) q(i). Since edge-addition steps in subsequent Kruskal blocks do not care about the edge structure of each component but only about what vertices are in each component, any choice of the q(i) spanning forests for the block starting at step i does not affect the number of spanning forests for the next block starting at step j > i, and all q(i) * q(j) forests are distinct.

There are somewhat complicated algorithms for calculating the number of spanning trees of a graph, such as the one based on Kirchhoff's theorem, given by imslavko. I'm not sure if that one will work for multigraphs. But in any case, since we only care about the particular cases 1, 2 and more than 2, we can take shortcuts.

  • If we only want to detect whether the graph has exactly 1 MST or more than 1, we can use the fact that the only way for a product of integers to be equal to 1 is if every factor is equal to 1: i.e. if any block has more than 1 spanning tree for C(i), immediately stop and report "More than 1". If we get to the end without this happening, report "1".

  • If we want to detect whether the graph has exactly 2 MSTs, we can use the fact that for a product of integers to be equal to 2, exactly 1 of the factors must be equal to 2 and all the rest must be 1. For a multigraph to have exactly 2 spanning trees, it must consist of a forest plus exactly one additional (parallel) edge between two vertices that already have an edge between them. (Any multigraph containing a k-cycle for k >= 3 must have at least k spanning trees, formed by deleting any 1 of the k edges.)

Algorithm overview

Perform Kruskal as usual, but whenever a new block begins (indicated by an edge being added that has higher weight than any previously added edge), before adding the edge, perform the following steps:

  1. Create an empty multigraph, C, using an adjacency-list representation.
  2. Scan forward through all remaining edges having this weight, and for each edge (u, v), add (c(u), c(v)) to C, where c(v) is the representative node of v found by the union/find structure being used to check for connectivity.
  3. Perform a DFS through each component of this multigraph, using an array of flags to record which vertices have already been visited. The purpose of this DFS is to check for cycles and parallel edges:
    • If there any cycles of length 3 or higher, the component has at least 3 spanning trees, meaning the algorithm can terminate at this point.
    • If any parallel edge appears with multiplicity 3 or more, the algorithm can likewise terminate right away.
    • Every time a double-edge is seen between two vertices, increment a global counter: if this counter goes above 1, then at least 2*2 = 4 MSTs exist, so the algorithm can again terminate right away. If we get to the end of the Kruskal algorithm and the counter is at 1, then exactly 2 MSTs exist; otherwise it must be at 0, in which case exactly 1 MST exists.

All of these additional operations take linear time on disjoint blocks of edges, so they won't increase the time complexity of the underlying Kruskal algorithm past O(E log V).

like image 137
j_random_hacker Avatar answered Oct 20 '22 19:10

j_random_hacker