I have two sets, each set is a listing of a pair of numbers
Set1 =[(x1, y1), (x2, y2), ..., (xN, yN)]
Set2 =[(a1, b1), (a2, b2), ..., (aN, bN)]
If plotted on an XY plane, Set1 and Set2 have the same basic shape, however the data points of set2 are a rotated/translated/scaled/noisy/skewed version of set1. The ordering of the pairs within each set is random. Is there an efficient way to determine which points in set1 correspond to their counterparts in set2?
The matching algorithm can be summarized as follows. A block of records is read from data source or in a Reference Match from both a data source and reference source. All columns are compared and a composite weight is computed for each possible record pair in the block. A matrix of composite weights is created.
BM25 [5] is the Best Matching ranking algorithm that ranks the documents according to its relevance to the given search query.
You are looking for a family of algorithms that try to minimize the difference between two point clouds. This is a fairly hard problem to solve and there can be multiple solutions (for example, if you were given two cubes, there are many possible rotations that work).
One particularly popular approach is the ICP (iterative closest point) algorithm, which starts with a candidate guess and continuously refines it until some correctness criterion is reached or time expires. It might be a good starting point to check out.
Hope this helps!
Yes, assuming only Rotations, Scalings and Translations this can be done (except for the "noise" and "skew" part, that I'm not sure about).
One approach:
(*-Note that Reflections and/or symmetries can cause problems with the Rotation part.)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With