I have quite a big problem with inventing an algorithm.
I have two groups of nodes. Say, we have 4 nodes in the first group and another 4 nodes in the second one. The following figure illustrates this:
I want to connect nodes from the first group with the nodes from the second group. The algorithm should find an ideal solution where the lengths of all connections must be as similar as possible and each node from one group can be connected just to a single node from the second group.
An ideal solution is demonstrated in the following picture:
The following picture demonstrates not an ideal solution, as the lengths of those connections differ too much.
The following table shows the lengths between each node. From this table, I want to find the ideal solution. The solution is encircled.
And how can I find the best solution when I have groups with a hundred of nodes? If I try every combination, there will be 100! of combinations, and that's a lot. I am not able to come up with an algorithm with a reasonable complexity.
As Thomas@ asked in the comments, the solution of the problem depends on your definition of "as close as possible". Let's define the set of edges in the solution to be S
. Assuming the definition was chosen to be minimizing max(S) - min(S)
(which is a reasonable assumption in practice), then this problem can definitely be solved in polynomial time.
Here's an example of the solution. For 100 nodes it will only take seconds.
First of all, you must know the maximum matching problem for bipartite graphs, and that it can be solved in polynomial time. The most straight-forward Hungarian Algorithm takes O(nm)
in which n
is the number of nodes and m
is the number of edges. And for planar graph which is your case, there are more sophisticated algorithms that can further improve the performance.
Let's define this to be a function Max_Match(X)
, which returns the maximum number of matches in edge set X
.
This Max_Match
can be calculated in less than O(n^1.5)
, in which n
is the number of nodes. For n=100
, we have only n^1.5 = 1000
.
Then let's convert your problem to the maximum matching problem.
Let's define the set of edges E
to all the edges we can pick from. In your case there are n*n
edges in E
, where n
is the number of nodes on one side.
Let's define function
F(E, low, high) = {e | e in E and |e| >= low and |e| <= high }
which means the subset edges of E
whose lengths are between [low, high]
Then for a given pair of numbers low
and high
, your problem has a solution if and only if
Max_Match( F (E, low, high)) == n
However, how do we figure out the value of low
and high
?
The possible values of low
and high
are every possible numbers in {|e|, e in E}
, which has n^2
numbers. So trying all the possible combinations of low
and high
will cost n^4
. That's a lot.
If we already know low
, can we figure out high
without enumeration? The answer is positive. Notice that :
If for a number h
we have
Max_Match( F (E, low, h)) == n
Then for every number h' >= h
we also have
Max_Match( F (E, low, h')) == n
What this means is that we can use binary search to figure out high
once we fix low
.
So the framework of the final solution:
arr[] = {|e|, e in E}.sort()
for i in range(0, len(arr[])):
low = i
h_left = i, h_right = len(arr[])-1
while h_left <= h_right:
mid = (h_left+h_right)/2
if Max_Match( F( E, arr[low], arr[mid]))==n:
h_right = mid - 1
else:
h_left = mid + 1
if h_left >= low:
if arr[h_left] - arr[low] <= ans:
ans = arr[h_left] - arr[low]
And ans
will be the minimal difference between the edges in the solution. This algorithm costs O(n^3.5 * log(n))
, in which n
is the number of nodes.
Edit: a simple and ugly implementation of above algorithm in c++ can be found at:
ideone.com/33n2Tg . Because I'm short of time, I handcrafted a simple Hungarian algorithm in this solution, which is slow when n
is large. To achieve better performance you will need the algorithm I included in part 1 for planar graphs.
I don't think you can easily calculate that with A*. Also a brute force would be O(n!). I think we should use heuristics like genetic algorithm.
You can implement the gen with an array (point A index is connected to the value point B)
Then you have to generate an gene pool with random genes. The number of genes is not fix... so you have to test a little bit around.
you will need a calculation method:
float GetLength(int indexA, int indexB)
{
return sqrt((pointA[indexA].x - pointB[indexB].x) * (pointA[indexA].x - pointB[indexB].x) + (pointA[indexA].y - pointB[indexB].y) * (pointA[indexA].y - pointB[indexB].y));
}
then you also need a fitness function:
float Fitness(int gene, int count)
{
float min = GetLength(0,gene[0]);
float max = GetLength(0,gene[0]);
for(int i = 0; i< count;i++)
{
min = std::min(min, GetLength(i,gene[i]));
max = std::max(max, GetLength(i,gene[i]));
}
return max-min;
}
Also you need to pair them:
int* Crossbreed(int* gene1, int* gene2, int count)
{
int* geneNew = new int[count];
for(int i = 0; i< count; i++)
{
geneNew[gene1[i]] = gene2[i];
}
return geneNew;
}
The Crossbreed gives us back some kids ;) which has to pass the fitness test. Also you should implement a mutate method, which gives us a mutated gene. This is important to leave local maximal.
At the end you only have to run this in a loop for some time and you will get a extremely good result.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With