Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Algorithm to match pairs from two sets

Tags:

algorithm

set

I have two sets, each set is a listing of a pair of numbers

Set1 =[(x1, y1), (x2, y2), ..., (xN, yN)]
Set2 =[(a1, b1), (a2, b2), ..., (aN, bN)]

If plotted on an XY plane, Set1 and Set2 have the same basic shape, however the data points of set2 are a rotated/translated/scaled/noisy/skewed version of set1. The ordering of the pairs within each set is random. Is there an efficient way to determine which points in set1 correspond to their counterparts in set2?

like image 311
rossb83 Avatar asked Jan 10 '13 20:01

rossb83


People also ask

What is the algorithm for matching?

The matching algorithm can be summarized as follows. A block of records is read from data source or in a Reference Match from both a data source and reference source. All columns are compared and a composite weight is computed for each possible record pair in the block. A matrix of composite weights is created.

What is best matching algorithm?

BM25 [5] is the Best Matching ranking algorithm that ranks the documents according to its relevance to the given search query.


2 Answers

You are looking for a family of algorithms that try to minimize the difference between two point clouds. This is a fairly hard problem to solve and there can be multiple solutions (for example, if you were given two cubes, there are many possible rotations that work).

One particularly popular approach is the ICP (iterative closest point) algorithm, which starts with a candidate guess and continuously refines it until some correctness criterion is reached or time expires. It might be a good starting point to check out.

Hope this helps!

like image 143
templatetypedef Avatar answered Oct 30 '22 05:10

templatetypedef


Yes, assuming only Rotations, Scalings and Translations this can be done (except for the "noise" and "skew" part, that I'm not sure about).

One approach:

  1. Determine the Centroid (2D mean) of each set.
  2. Determine the Least-Squares (2D) slope of each set.
  3. Determine the 2D Variance of each set.
  4. Remap the first set, using the Centroid difference to Translate, the Slope differences to Rotate(*) and the Variance differences to Scale, so that both sets now have the same Centroid, Slope and Variance.
  5. Sort both sets and then compare the points for equality/similarity. (Alternately, you can do an RSM (root-mean-square) difference between them as a measure of "noise"/difference).

(*-Note that Reflections and/or symmetries can cause problems with the Rotation part.)

like image 44
RBarryYoung Avatar answered Oct 30 '22 07:10

RBarryYoung