Let's say I have two arrays of the same length <code>n</code>, named <code>A</code> and <code>B</code>. These two arrays contain real values. We define the distance between two arrays as the mean squared distance. <code> dist(A,B) = sqrt( sum((A - B)2) ) </code> I want to find the permutation of <code>A</code> that gives the min distance to <code>B</code>. The naive method is to try every permutation of <code>A</code> and record the min distance. However, this method is of complexity O(n!). Is there an algorithm of complexity less than O(n!)?

The problem you describe is equivalent to the Minimum Cost Perfect Matching Problem which can be solved efficiently (and exactly) using The Hungarian Algorithm. In the Minimum Cost Perfect Matching Problem you have an input weighted bipartite graph where the two sets have the same size n, and each edge has a non-negative cost. The goal is to find a perfect matching of minimum cost. In your case, the bipartite graph is a biclique. That is, every vertex in one set is connected to every vertex in the other set, and the cost of edge <code>(i,j)</code> is <code>(A[i] - B[i])^2</code> (where <code>i</code> corresponds to index <code>i</code> in array A and <code>j</code> corresponds to index <code>j</code> in array B). EDIT: This is not the best solution for the problem. Ivo Merchiers came up with a better solution both in terms of efficiency and simplicity. The reason I'm not deleting my answer is because my suggested solution is valuable for distance measures to which Ivo's solution does not apply (as his approach works by exploiting a property of the Euclidean distance).

You can just sort A and B and match up the corresponding elements. Imagine that there are two elements of A, Ai and Aj, corresponding to Bi and Bj. The error contribution from these matches is: (Ai-Bi)^2 + (Aj-Bj)^2 = Ai^2 + Bi^2 + Aj^2 + Bj^2 - 2(AiBi + AjBj) Is it better to swap the matches, or to keep them the way they are? Well, the difference in the error if we swap them is: 2(AiBi + AjBj) - 2(AiBj + AjBi) ~ AiBi - AiBj + AjBj - AjBi = Ai(Bi - Bj) - Aj(Bi - Bj) = (Ai - Aj)(Bi - Bj) So, if the As and Bs are in the same order, then this product is positive and the error will go up if you swap them. If they are not in the same order, then this product is negative an the error will go down if you swap them. If you repeatedly swap any pairs that are out of order until there are no such pairs, your error will keep going down, and you'll end up with with the nth largest A matched up with the nth largest B all the way through the array. Just sorting them and matching them up is therefor optimal, and of course it's way faster than the Hungarian algorithm.

Construct a bipartite graph from the vectors. Find minimum weight perfect matching in this graph. How to construct the graph. <ol> <li>Let <code>A</code>, <code>B</code> be two parts of the graph. Each with <code>n</code> nodes.</li> <li>Connect <code>i</code> in <code>A</code> to <code>j</code> in <code>B</code> with an edge of weight <code>abs(A[i] - B[j])</code>. </li> </ol> I believe this can be done in <code>O(n^2)</code>. See http://www.cse.iitd.ernet.in/~naveen/courses/CSL851/lec4.pdf <hr> If each number in <code>A</code> has only one closest number in <code>B</code> then you can do this in <code>O(n \log n)</code>. This probably might be the case since you have the real numbers. How? <ol> <li>Sort A <code>O(n \log n)</code> </li> <li>Binary search for the closest number for each number in B. <code>O(n \log n)</code>. </li> </ol> If the numbers are coming from the real world and have even a little bit of randomness to them, then the differences between each pair of the numbers is probably unique. You can verify if this is the case by running experiments on the input vectors. Then the problem becomes easy to solve yay!

Given two arrays, find the permutations that give closest distance between two arrays

Q: What are used for finding minimum distance between two places?

Question 4: What is the formula for distance between two points? Answer: The formula of distance between two points is P(x1, y1) and Q(x2, y2) is given by: d (P, Q) = √ (x2 – x1) + (y2 – y1) 2. While the distance of a point is P(x, y) from the origin is given by d(0, P) = √ x2 + y2.

4 Answers

You can just sort both A and B. In that case, the Euclidean distance is minimal.

If B has to remain fixed, then you just need to invert the permutation needed to sort B and apply that to the sorted version of A.

This solution does assume that you want to find just a permutation and not the most simple permutation (since sorting and unsorting through permutations will not be incredibly efficient).

Proof: Let S,T be our pair of arrays. We can assume S to be sorted without loss of generality, since all that matters is the mapping between the two sets of elements.

Let T be the permutation that minimizes the distance between the two arrays, and let d be that distance.

Suppose that T is not sorted. Then there exist elements i,j s.t. T_i > T_j

S_i + k1 = S_j
T_i = T_j + k2
where k1,k2 > 0

Let x be the total distance of all elements except i and j.

d = x + (S_i - T_i)^2 + ((S_i + k1) - (T_i - k2))^2

If we swap the order of T_i and T_j, then our new distance is:

d' = x + (S_i - (T_i - k2))^2 + ((S_i + k1) - T_i)^2

Thus: d - d' = 2 * k1 * k2, which contradicts our assumption that T is the permutation that minimizes the distance, so the permutation that does so must be sorted.

Sorting the two arrays can be done in O(n log n) using a variety of methods.

195

answered Oct 13 '22 05:10

Ivo Merchiers

The problem you describe is equivalent to the Minimum Cost Perfect Matching Problem which can be solved efficiently (and exactly) using The Hungarian Algorithm. In the Minimum Cost Perfect Matching Problem you have an input weighted bipartite graph where the two sets have the same size n, and each edge has a non-negative cost. The goal is to find a perfect matching of minimum cost.

In your case, the bipartite graph is a biclique. That is, every vertex in one set is connected to every vertex in the other set, and the cost of edge (i,j) is (A[i] - B[i])^2 (where i corresponds to index i in array A and j corresponds to index j in array B).

EDIT: This is not the best solution for the problem. Ivo Merchiers came up with a better solution both in terms of efficiency and simplicity. The reason I'm not deleting my answer is because my suggested solution is valuable for distance measures to which Ivo's solution does not apply (as his approach works by exploiting a property of the Euclidean distance).

answered Oct 13 '22 05:10

snakile

You can just sort A and B and match up the corresponding elements.

Imagine that there are two elements of A, Ai and Aj, corresponding to Bi and Bj.

The error contribution from these matches is:

(Ai-Bi)^2 + (Aj-Bj)^2

= Ai^2 + Bi^2 + Aj^2 + Bj^2 - 2(AiBi + AjBj)

Is it better to swap the matches, or to keep them the way they are?

Well, the difference in the error if we swap them is:

2(AiBi + AjBj) - 2(AiBj + AjBi)

~ AiBi - AiBj + AjBj - AjBi

= Ai(Bi - Bj) - Aj(Bi - Bj)

= (Ai - Aj)(Bi - Bj)

So, if the As and Bs are in the same order, then this product is positive and the error will go up if you swap them. If they are not in the same order, then this product is negative an the error will go down if you swap them.

If you repeatedly swap any pairs that are out of order until there are no such pairs, your error will keep going down, and you'll end up with with the nth largest A matched up with the nth largest B all the way through the array.

Just sorting them and matching them up is therefor optimal, and of course it's way faster than the Hungarian algorithm.

answered Oct 13 '22 04:10

Matt Timmermans

Construct a bipartite graph from the vectors. Find minimum weight perfect matching in this graph.

How to construct the graph.

Let A, B be two parts of the graph. Each with n nodes.
Connect i in A to j in B with an edge of weight abs(A[i] - B[j]).

I believe this can be done in O(n^2).

See http://www.cse.iitd.ernet.in/~naveen/courses/CSL851/lec4.pdf

If each number in A has only one closest number in B then you can do this in O(n \log n). This probably might be the case since you have the real numbers.

How?

Sort A O(n \log n)
Binary search for the closest number for each number in B. O(n \log n).

If the numbers are coming from the real world and have even a little bit of randomness to them, then the differences between each pair of the numbers is probably unique. You can verify if this is the case by running experiments on the input vectors. Then the problem becomes easy to solve yay!

answered Oct 13 '22 03:10

Pratik Deoghare

Related questions
                            
                                Sum of digits of a factorial
                            
                                How does this work? Weird Towers of Hanoi Solution
                            
                                Weighted random selection with and without replacement
                            
                                Efficiently querying one string against multiple regexes
                            
                                How to read values from numbers written as words?
                            
                                When is doubly linked list more efficient than singly linked list?
                            
                                Largest circle inside a non-convex polygon
                            
                                What is the difference between dynamic programming and greedy approach?
                            
                                Interview question - Search in sorted array X for index i such that X[i] = i
                            
                                How to intelligently degrade or smooth GIS data (simplifying polygons)?
                            
                                Are there any better methods to do permutation of string?
                            
                                What is a bubble sort good for? [closed]
                            
                                What is the difference between Linear search and Binary search?
                            
                                Good algorithm for finding the diameter of a (sparse) graph?
                            
                                How do I find the median of numbers in linear time using heaps?
                            
                                Efficient way to search a stream for a string
                            
                                Algorithm challenge: Generate color scheme from an image
                            
                                Why does Python's itertools.permutations contain duplicates? (When the original list has duplicates)
                            
                                Mastering Recursive Programming [closed]
                            
                                Difference between AVL trees and splay trees

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Given two arrays, find the permutations that give closest distance between two arrays

Tags:

algorithm

Wang Duo

People also ask