Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Application of Union+Find algorithm(Disjoint Set)

Problem Statement:

Equations are given in the format A / B = k, where A and B are variables represented as strings, and k is a real number (floating point number).

Given some queries, return the answers. If the answer does not exist, return -1.0.

Example: Given a / b = 2.0, b / c = 3.0.

queries are: a / c = ?, b / a = ?, a / e = ?, a / a = ?, x / x = ? return [6.0, 0.5, -1.0, 1.0, -1.0 ]

The input is:

vector<pair<string, string>> equations
vector<double>& values
vector<pair<string, string>> queries 

where equations.size() == values.size(), and the values are positive.

This represents the equations.

Return vector<double>.

According to the example above: equations = [ ["a", "b"], ["b", "c"] ]

values = [2.0, 3.0]

queries = [ ["a", "c"], ["b", "a"], ["a", "e"], ["a", "a"], ["x", "x"] ]

The input is always valid. You may assume that evaluating the queries will result in no division by zero and there is no contradiction.

Solution This can be solved using Union+Find on disjoint set, a solution is seen here:

Solution

However, I'm not clear on the intuition behind line 59:

rst[i] = uf.rank.get(queries[i][0]) / uf.rank.get(queries[i][1]);

As well as line 99:

rank.put(aFather, quotient * rank.get(b) / rank.get(a));
like image 998
user1008636 Avatar asked Aug 23 '18 15:08

user1008636


1 Answers

It's not hard to follow what happens. Pretty clever in fact!

Let's take a more complex example:

a / b = 2.0, b / c = 3.0, c / d = 4.0, d / e = 5.0

During the first step (MakeSet triggered by UnionFind uf = new UnionFind(set)) each element is set to be its own parent, and all ranks are set to 1.0:

parent(a) = a, rank(a) = 1.0
...
parent(e) = e, rank(e) = 1.0

During the Union step, the rank of the node is set to the given quotient, while the rank of the parent stays the same (line 99). So after union(a, b, 2.0) parent(a) = b, rank(a) = 2.0 and the invariant is maintained for any node n: rank(n)/rank(parent(n)) = value, where value is from the equation being processed (the quotient argument). At the end we get:

parent(a) = b, rank(a) = 2.0
parent(b) = c, rank(b) = 3.0
parent(c) = d, rank(c) = 4.0
parent(d) = e, rank(d) = 5.0
parent(e) = e, rank(e) = 1.0

During the Compress step, if the parent of the node being searched is not the representative node of the set, then it is set to be by recursively searching for the parent of the parent of the parent... and setting the rank of the current node as the current rank multiplied by the rank of the parent (line 87). So in the end we arrive at:

parent(a) = e, rank(a) = 120.0
parent(b) = e, rank(b) = 60.0
parent(c) = e, rank(c) = 20.0
parent(d) = e, rank(d) = 5.0
parent(e) = e, rank(e) = 1.0

So indeed rank(a) = rank(b) * 2.0, rank(b) = rank(c) * 3.0 etc. just as given in the input equations.

Note that the representative node of a set (i.e. the ultimate parent, e in this example) always ends-up having a rank of 1.0. This is why repeatedly calling compressedFind and executing line 87 doesn't further change the rank of a node, once it's been computed and the parent been set.

Now it's easy to see how line 59 works: if the query is a / b then rank(a) / rank(b) = 120.0 / 60.0 = 2.0

Terminology used from here: https://en.wikipedia.org/wiki/Disjoint-set_data_structure

like image 137
memo Avatar answered Nov 14 '22 13:11

memo