Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Finding edges in a graph with similar lengths

Tags:

algorithm

I have quite a big problem with inventing an algorithm.

I have two groups of nodes. Say, we have 4 nodes in the first group and another 4 nodes in the second one. The following figure illustrates this:

Two groups of nodes

I want to connect nodes from the first group with the nodes from the second group. The algorithm should find an ideal solution where the lengths of all connections must be as similar as possible and each node from one group can be connected just to a single node from the second group.

An ideal solution is demonstrated in the following picture:

enter image description here

The following picture demonstrates not an ideal solution, as the lengths of those connections differ too much.

Bad connection between the nodes

The following table shows the lengths between each node. From this table, I want to find the ideal solution. The solution is encircled.

Table of lengths between nodes.

And how can I find the best solution when I have groups with a hundred of nodes? If I try every combination, there will be 100! of combinations, and that's a lot. I am not able to come up with an algorithm with a reasonable complexity.

like image 405
Ladislav Ondris Avatar asked Jun 25 '16 15:06

Ladislav Ondris


2 Answers

As Thomas@ asked in the comments, the solution of the problem depends on your definition of "as close as possible". Let's define the set of edges in the solution to be S. Assuming the definition was chosen to be minimizing max(S) - min(S) (which is a reasonable assumption in practice), then this problem can definitely be solved in polynomial time.

Here's an example of the solution. For 100 nodes it will only take seconds.

1

First of all, you must know the maximum matching problem for bipartite graphs, and that it can be solved in polynomial time. The most straight-forward Hungarian Algorithm takes O(nm) in which n is the number of nodes and m is the number of edges. And for planar graph which is your case, there are more sophisticated algorithms that can further improve the performance.

Let's define this to be a function Max_Match(X), which returns the maximum number of matches in edge set X.

This Max_Match can be calculated in less than O(n^1.5), in which n is the number of nodes. For n=100, we have only n^1.5 = 1000.


2

Then let's convert your problem to the maximum matching problem.

Let's define the set of edges E to all the edges we can pick from. In your case there are n*n edges in E, where n is the number of nodes on one side.

Let's define function F(E, low, high) = {e | e in E and |e| >= low and |e| <= high } which means the subset edges of E whose lengths are between [low, high]

Then for a given pair of numbers low and high, your problem has a solution if and only if

Max_Match( F (E, low, high)) == n


3

However, how do we figure out the value of low and high ?

The possible values of low and high are every possible numbers in {|e|, e in E}, which has n^2 numbers. So trying all the possible combinations of low and high will cost n^4. That's a lot.

If we already know low, can we figure out high without enumeration? The answer is positive. Notice that :

lemma 1

If for a number h we have Max_Match( F (E, low, h)) == n

Then for every number h' >= h we also have

Max_Match( F (E, low, h')) == n

What this means is that we can use binary search to figure out high once we fix low.


4

So the framework of the final solution:

arr[] = {|e|, e in E}.sort()
for i in range(0, len(arr[])):
  low = i
  h_left = i, h_right = len(arr[])-1
  while h_left <= h_right:
    mid = (h_left+h_right)/2
    if Max_Match( F( E, arr[low], arr[mid]))==n:
      h_right = mid - 1
    else:
      h_left = mid + 1
  if h_left >= low:
    if arr[h_left] - arr[low] <= ans:
      ans = arr[h_left] - arr[low]

And ans will be the minimal difference between the edges in the solution. This algorithm costs O(n^3.5 * log(n)), in which n is the number of nodes.

Edit: a simple and ugly implementation of above algorithm in c++ can be found at: ideone.com/33n2Tg . Because I'm short of time, I handcrafted a simple Hungarian algorithm in this solution, which is slow when n is large. To achieve better performance you will need the algorithm I included in part 1 for planar graphs.

like image 58
lavin Avatar answered Nov 15 '22 03:11

lavin


I don't think you can easily calculate that with A*. Also a brute force would be O(n!). I think we should use heuristics like genetic algorithm.

You can implement the gen with an array (point A index is connected to the value point B)

enter image description here

Then you have to generate an gene pool with random genes. The number of genes is not fix... so you have to test a little bit around.

you will need a calculation method:

float GetLength(int indexA, int indexB)
{
    return sqrt((pointA[indexA].x - pointB[indexB].x) * (pointA[indexA].x - pointB[indexB].x) + (pointA[indexA].y - pointB[indexB].y) * (pointA[indexA].y - pointB[indexB].y));
}

then you also need a fitness function:

float Fitness(int gene, int count)
{
    float min = GetLength(0,gene[0]);
    float max = GetLength(0,gene[0]);
    for(int i = 0; i< count;i++)
    {
        min = std::min(min, GetLength(i,gene[i]));
        max = std::max(max, GetLength(i,gene[i]));
    }
    return max-min;
}

Also you need to pair them:

int* Crossbreed(int* gene1, int* gene2, int count)
{
    int* geneNew = new int[count];
    for(int i = 0; i< count; i++)
    {
        geneNew[gene1[i]] = gene2[i];
    }
    return geneNew;
} 

The Crossbreed gives us back some kids ;) which has to pass the fitness test. Also you should implement a mutate method, which gives us a mutated gene. This is important to leave local maximal.

At the end you only have to run this in a loop for some time and you will get a extremely good result.

like image 31
Thomas Avatar answered Nov 15 '22 04:11

Thomas