Match set of x,y points to another set that is scaled, rotated, translated, and with missing elements

Q: How do I find new coordinates after rotation?

When a point (x,y) is rotated about the origin (0,0) counterclockwise by an angle θ, the coordinates of the new point (x′,y′) are x′=xcos(θ)−ysin(θ),y′=xsin(θ)+ycos(θ).

Q: How do you change coordinate axis?

The procedure is simple: start at O, then walk px squares in the new X direction, and finally walk py squares in the new Y direction. Hence, we say that (px,py) are the coordinates of the position P, relative to origin O and axes X and Y.

Tags:

performance

python

geometry

astronomy

affinetransform

(Why am I doing this? See explanation below)

Consider two sets of points, A and B as shown below

enter image description here

It might not look like it, but set A is "hidden" within set B. It can not be easily seen because points in B are scaled, rotated, and translated in (x, y) with respect to A. Even worse, some points that are present in A are missing in B, and B contains many points that are not in A.

I need to find the appropriate scaling, rotation, and translation that must be applied to the B set in order to match it with set A. In the case shown above, the correct values are:

scale = 0.14, rot_angle = 0.0, x_transl = 35.0, y_transl = 2.0

which produces the (good enough) match

enter image description here

(in red, only the matched B points are shown; these are located in the sector 1000<x<2000, y~2000 in the first figure to the right). Given the many degrees of freedom (DoF: scaling + rotation + 2D translation) I am aware of the possibility of a miss-match, but the coordinates of the points are not random (although they may look like it) so the probability of this happening is very small.

The code I wrote (see below) uses brute force to loop through all possible DoF values taken from pre-defined ranges for each one. The core of the code is based on minimizing the distance of each point in A to any point in B

The code works (it actually generated the solution mentioned above), but since the number of solutions (i.e., the combinations of accepted values for each DoF) scales with larger ranges, it can become unacceptably slow pretty quick (also it eats up all the RAM in my system)

How can I improve the performance of the code? I'm open to any solution including numpy and/or scipy. Perhaps something like Basing-Hopping to search for the best match (or a relatively close one) instead of the brute force method I currently employ?

import numpy as np
from scipy.spatial import distance
import math


def scalePoints(B_center, delta_x, delta_y, scale):
    """
    Scales xy points.

    http://codereview.stackexchange.com/q/159183/35351
    """
    x_scale = B_center[0] - scale * delta_x
    y_scale = B_center[1] - scale * delta_y

    return x_scale, y_scale


def rotatePoints(center, x, y, angle):
    """
    Rotates points in 'xy' around 'center'. Angle is in degrees.
    Rotation is counter-clockwise

    http://stackoverflow.com/a/20024348/1391441
    """
    angle = math.radians(angle)
    xy_rot = x - center[0], y - center[1]
    xy_rot = (xy_rot[0] * math.cos(angle) - xy_rot[1] * math.sin(angle),
              xy_rot[0] * math.sin(angle) + xy_rot[1] * math.cos(angle))
    xy_rot = xy_rot[0] + center[0], xy_rot[1] + center[1]

    return xy_rot


def distancePoints(set_A, x_transl, y_transl):
    """
    Find the sum of the minimum distance of points in set_A to points in set_B.
    """
    d = distance.cdist(set_A, zip(*[x_transl, y_transl]), 'euclidean')
    # Sum of all minimal distances.
    d_sum = np.sum(np.min(d, axis=1))

    return d_sum


def match_frames(
        set_A, B_center, delta_x, delta_y, tol, sc_min, sc_max, sc_step,
        ang_min, ang_max, ang_step, xmin, xmax, xstep, ymin, ymax, ystep):
    """
    Process all possible solutions in the defined ranges.
    """
    N = len(set_A)
    # Ranges
    sc_range = np.arange(sc_min, sc_max, sc_step)
    ang_range = np.arange(ang_min, ang_max, ang_step)
    x_range = np.arange(xmin, xmax, xstep)
    y_range = np.arange(ymin, ymax, ystep)
    print("Total solutions: {:.2e}".format(
          np.prod([len(_) for _ in [sc_range, ang_range, x_range, y_range]])))

    d_sum, params_all = [], []
    for scale in sc_range:
        # Scaled points.
        x_scale, y_scale = scalePoints(B_center, delta_x, delta_y, scale)
        for ang in ang_range:
            # Rotated points.
            xy_rot = rotatePoints(B_center, x_scale, y_scale, ang)
            # x translation
            for x_tr in x_range:
                x_transl = xy_rot[0] + x_tr
                # y translation
                for y_tr in y_range:
                    y_transl = xy_rot[1] + y_tr

                    # Find minimum distance sum.
                    d_sum.append(distancePoints(set_A, x_transl, y_transl))

                    # Store solutions.
                    params_all.append([scale, ang, x_tr, y_tr])

                    # Condition to break out if given tolerance for match
                    # is achieved.
                    if d_sum[-1] <= tol * N:
                        print("Match found:", scale, ang, x_tr, y_tr)
                        i_min = d_sum.index(min(d_sum))
                        return i_min, params_all

        # Print best solution found so far.
        i_min = d_sum.index(min(d_sum))
        print("d_sum_min = {:.2f}".format(d_sum[i_min]))

    return i_min, params_all


# Data.
set_A = [[2015.81, 1981.665], [1967.31, 1960.865], [1962.91, 1951.365],
         [1964.91, 1994.565], [1894.41, 1957.065]]
set_B = [
    [2689.28, 3507.04, 2895.67, 1051.3, 1929.49, 1035.97, 752.44, 130.62,
     620.06, 2769.06, 1580.77, 281.76, 224.54, 3848.3],
    [2061.19, 3700.27, 2131.2, 1837.3, 2017.52, 80.96, 3524.61, 3821.22,
     3711.53, 1812.12, 1868.33, 3865.77, 3273.77, 2100.71]]

# This is necessary to apply the scaling.
x, y = np.asarray(set_B)
# Center of B points, defined as the center of the minimal rectangle that
# contains all points.
B_center = [(min(x) + max(x)) * .5, (min(y) + max(y)) * .5]
# Difference between the center coordinates and the xy points.
delta_x, delta_y = B_center[0] - x, B_center[1] - y

# Tolerance in pixels for match.
tol = 1.
# Ranges for each DoF.
sc_min, sc_max, sc_step = .01, .2, .01
ang_min, ang_max, ang_step = -30., 30., 1.
xmin, xmax, xstep = -150., 150., 1.
ymin, ymax, ystep = -150., 150., 1.

# Find proper scaling + rotation + translation for set_B.
i_min, params_all = match_frames(
    set_A, B_center, delta_x, delta_y, tol, sc_min, sc_max, sc_step,
    ang_min, ang_max, ang_step, xmin, xmax, xstep, ymin, ymax, ystep)

# Best match found
print(params_all[i_min])

Why am I doing this?

When an astronomer observes a star field, it also needs to observe what is called a "standard field of stars". This is needed to be able to transform the "instrumental magnitudes" (a logarithmic measure of the brightness) of the observed stars to a common universal scale, since these magnitudes depend on the optical system used (telescope + CCD array). In the example shown here, the standard field can be seen below to the left, and the observed one to the right.

enter image description here

Notice that the points in the set A (used above) are the marked stars in the standard field, and the set B are those stars detected in the observed field (marked in red above)

The process of identifying those stars in the observed field that correspond to the stars marked in the standard field is done by-eye, even today. This is due to the complexity of the task.

In the observed image above, there's quite a bit of scaling, but no rotation and little translation. This constitutes a rather favorable scenario; it could be much worse. I'm trying to develop a simple algorithm to avoid having to manually identify stars in the observed field as stars in the standard field, one by one.

Solution proposed by litepresence

This is a script I made following the answer by litepresence.

import itertools
import numpy as np
import matplotlib.pyplot as plt


def getTriangles(set_X, X_combs):
    """
    Inefficient way of obtaining the lengths of each triangle's side.
    Normalized so that the minimum length is 1.
    """
    triang = []
    for p0, p1, p2 in X_combs:
        d1 = np.sqrt((set_X[p0][0] - set_X[p1][0]) ** 2 +
                     (set_X[p0][1] - set_X[p1][1]) ** 2)
        d2 = np.sqrt((set_X[p0][0] - set_X[p2][0]) ** 2 +
                     (set_X[p0][1] - set_X[p2][1]) ** 2)
        d3 = np.sqrt((set_X[p1][0] - set_X[p2][0]) ** 2 +
                     (set_X[p1][1] - set_X[p2][1]) ** 2)
        d_min = min(d1, d2, d3)
        d_unsort = [d1 / d_min, d2 / d_min, d3 / d_min]
        triang.append(sorted(d_unsort))

    return triang


def sumTriangles(A_triang, B_triang):
    """
    For each normalized triangle in A, compare with each normalized triangle
    in B. find the differences between their sides, sum their absolute values,
    and select the two triangles with the smallest sum of absolute differences.
    """
    tr_sum, tr_idx = [], []
    for i, A_tr in enumerate(A_triang):
        for j, B_tr in enumerate(B_triang):
            # Absolute value of lengths differences.
            tr_diff = abs(np.array(A_tr) - np.array(B_tr))
            # Sum the differences
            tr_sum.append(sum(tr_diff))
            tr_idx.append([i, j])

    # Index of the triangles in A and B with the smallest sum of absolute
    # length differences.
    tr_idx_min = tr_idx[tr_sum.index(min(tr_sum))]
    A_idx, B_idx = tr_idx_min[0], tr_idx_min[1]
    print("Smallest difference: {}".format(min(tr_sum)))

    return A_idx, B_idx


# Data.
set_A = [[2015.81, 1981.665], [1967.31, 1960.865], [1962.91, 1951.365],
         [1964.91, 1994.565], [1894.41, 1957.065]]
set_B = [
    [2689.28, 3507.04, 2895.67, 1051.3, 1929.49, 1035.97, 752.44, 130.62,
     620.06, 2769.06, 1580.77, 281.76, 224.54, 3848.3],
    [2061.19, 3700.27, 2131.2, 1837.3, 2017.52, 80.96, 3524.61, 3821.22,
     3711.53, 1812.12, 1868.33, 3865.77, 3273.77, 2100.71]]
set_B = zip(*set_B)

# All possible triangles.
A_combs = list(itertools.combinations(range(len(set_A)), 3))
B_combs = list(itertools.combinations(range(len(set_B)), 3))

# Obtain normalized triangles.
A_triang, B_triang = getTriangles(set_A, A_combs), getTriangles(set_B, B_combs)

# Index of the A and B triangles with the smallest difference.
A_idx, B_idx = sumTriangles(A_triang, B_triang)

# Indexes of points in A and B of the best match triangles.
A_idx_pts, B_idx_pts = A_combs[A_idx], B_combs[B_idx]
print 'triangle A %s matches triangle B %s' % (A_idx_pts, B_idx_pts)

# Matched points in A and B.
print "A:", [set_A[_] for _ in A_idx_pts]
print "B:", [set_B[_] for _ in B_idx_pts]

# Plot
A_pts = zip(*[set_A[_] for _ in A_idx_pts])
B_pts = zip(*[set_B[_] for _ in B_idx_pts])
plt.scatter(*A_pts, s=10, c='k')
plt.scatter(*B_pts, s=10, c='r')
plt.show()

The method is almost instantaneous and produces the correct match:

Smallest difference: 0.0314154749597
triangle A (0, 1, 4) matches triangle B (3, 4, 10)
A: [[2015.81, 1981.665], [1967.31, 1960.865], [1894.41, 1957.065]]
B: [(1051.3, 1837.3), (1929.49, 2017.52), (1580.77, 1868.33)]

enter image description here

289

asked Mar 30 '17 19:03

Gabriel

1 Answers

1) I would approach this problem by labeling all points and finding all possible combinations of 3 points from each set.

# normalize B data to same format as A
set_Bx, set_By = (set_B)
set_B = []
for i in range(len(set_Bx)):
    set_B.append([set_Bx[i],set_By[i]])
'''
set_B = [[2689.28, 2061.19], [3507.04, 3700.27], [2895.67, 2131.2], 
[1051.3, 1837.3], [1929.49, 2017.52], [1035.97, 80.96], [752.44, 
3524.61], [130.62, 3821.22], [620.06, 3711.53], [2769.06, 1812.12], 
[1580.77, 1868.33], [281.76, 3865.77], [224.54, 3273.77], [3848.3, 
2100.71]]
'''

list(itertools.combinations(range(len(set_A)), 3))
list(itertools.combinations(range(len(set_B)), 3))

How to generate all permutations of a list in Python

2) For each 3 point group, calculate the sides of the respective triangle; repeating the process for group A and group B.

dist = sqrt( (x2 - x1)**2 + (y2 - y1)**2 )

How do I find the distance between two points?

3) Then reduce the ratio of sides for each such that each triangle's smallest side was then equal to 1; the other sides reduced in respective ratio.

In two similar triangles:

The perimeters of the two triangles are in the same ratio as the sides. The corresponding sides, medians and altitudes will all be in this same ratio.

http://www.mathopenref.com/similartrianglesparts.html

4) Finally for each a triangle from group A, compare to each triangle from group B, with element-wise subtraction. Then sum the resulting elements and find the triangles from A and B with the smallest sum.

list(numpy.array(list1)-numpy.array(list2))

Subtracting 2 lists in Python

5) Given matched triangles; finding the appropriate scaling, translation, and rotation should be relatively trivial in terms of CPU/RAM. enter image description here

ETA1: rough draft script

ETA2: patched error discussed in comments: with sum(abs()) instead of abs(sum()). Now it works, fast too!

'''
known correct solution

A = [[1894.41, 1957.065],[1967.31, 1960.865],[2015.81, 1981.665]]
B = [[1051.3, 1837.3],[1580.77, 1868.33],[1929.49, 2017.52]]

'''
import numpy as np
import itertools
import math
import operator

set_A = [[2015.81, 1981.665], [1967.31, 1960.865], [1962.91, 1951.365],
        [1964.91, 1994.565], [1894.41, 1957.065]]
set_B = [[2689.28, 3507.04, 2895.67, 1051.3, 1929.49, 1035.97, 752.44, 130.62,
        620.06, 2769.06, 1580.77, 281.76, 224.54, 3848.3],
        [2061.19, 3700.27, 2131.2, 1837.3, 2017.52, 80.96, 3524.61, 3821.22,
        3711.53, 1812.12, 1868.33, 3865.77, 3273.77, 2100.71]]

# normalize set B data to set A format
set_Bx, set_By = (set_B)
set_B = []
for i in range(len(set_Bx)):
    set_B.append([set_Bx[i],set_By[i]])

'''
set_B = [[2689.28, 2061.19], [3507.04, 3700.27], [2895.67, 2131.2], 
[1051.3, 1837.3], [1929.49, 2017.52], [1035.97, 80.96], [752.44, 3524.61], 
[130.62, 3821.22], [620.06, 3711.53], [2769.06, 1812.12], [1580.77, 1868.33], 
[281.76, 3865.77], [224.54, 3273.77], [3848.3, 2100.71]]
'''

print set_A
print set_B
print len(set_A)
print len(set_B)

set_A_tri = list(itertools.combinations(range(len(set_A)), 3))
set_B_tri = list(itertools.combinations(range(len(set_B)), 3))

print set_A_tri
print set_B_tri
print len(set_A_tri)
print len(set_B_tri)

'''
set_A = [[2015.81, 1981.665], [1967.31, 1960.865], [1962.91, 1951.365], 
[1964.91, 1994.565], [1894.41, 1957.065]]

set_A_tri = [(0, 1, 2), (0, 1, 3), (0, 1, 4), (0, 2, 3), (0, 2, 4), (0, 3, 4), 
(1, 2, 3), (1, 2, 4), (1, 3, 4), (2, 3, 4)]
'''

def distance(x1,y1,x2,y2):
    return math.sqrt((x2 - x1)**2 + (y2 - y1)**2 )

def tri_sides(set_x, set_x_tri):

    triangles = []
    for i in range(len(set_x_tri)):

        point1 = set_x_tri[i][0]
        point2 = set_x_tri[i][1]
        point3 = set_x_tri[i][2]

        point1x, point1y = set_x[point1][0], set_x[point1][1]
        point2x, point2y = set_x[point2][0], set_x[point2][1]
        point3x, point3y = set_x[point3][0], set_x[point3][1] 

        len1 = distance(point1x,point1y,point2x,point2y)
        len2 = distance(point1x,point1y,point3x,point3y)
        len3 = distance(point2x,point2y,point3x,point3y)

        min_side = min(len1,len2,len3)
        len1/=min_side
        len2/=min_side
        len3/=min_side
        t=[len1,len2,len3]
        t.sort()
        triangles.append(t)

    return triangles

A_triangles = tri_sides(set_A, set_A_tri)
B_triangles = tri_sides(set_B, set_B_tri)

print A_triangles
'''
[[1.0, 5.0405616860744304, 5.822935502560814], 
[1.0, 1.5542012854321234, 1.5619803879976761], 
[1.0, 1.3832883678507584, 2.347214708755337], 
[1.0, 1.2141910838179129, 1.4096730529373076], 
[1.0, 1.1275138587537166, 2.0318412465223665], 
[1.0, 1.5207417600732074, 2.3589630093994876], 
[1.0, 3.2270326342163584, 4.13069930678442], 
[1.0, 6.565440477766354, 6.972550347780966], 
[1.0, 2.1606693015281997, 2.3635387983160885], 
[1.0, 1.589425903498476, 1.846471085870448]]
'''
print B_triangles

def list_subtract(list1,list2):

    return np.absolute(np.array(list1)-np.array(list2))

sums = []
threshold = 1
for i in range(len(A_triangles)):
    for j in range(len(B_triangles)):
        k = sum(list_subtract(A_triangles[i], B_triangles[j]))
        if k < threshold:
            sums.append([i,j,k])
# sort by smallest sum
sums = sorted(sums, key=operator.itemgetter(2))

print sums
print 'winner %s' % sums[0]
print sums[0][0]
print sums[0][1]
match_A = set_A_tri[sums[0][0]]
match_B = set_B_tri[sums[0][1]]
print 'triangle A %s matches triangle B %s' % (match_A, match_B)

match_A_pts = []
match_B_pts = []
for i in range(3):
    match_A_pts.append(set_A[match_A[i]])
    match_B_pts.append(set_B[match_B[i]])

print 'triangle A has points %s' % match_A_pts
print 'triangle B has points %s' % match_B_pts

'''
winner [2, 204, 0.031415474959738399]
2
204
triangle A (0, 1, 4) matches triangle B (3, 4, 10)
triangle A has points [[2015.81, 1981.665], [1967.31, 1960.865], [1894.41, 1957.065]]
triangle B has points [[1051.3, 1837.3], [1929.49, 2017.52], [1580.77, 1868.33]]
'''

139

answered Oct 18 '22 09:10

litepresence

Related questions
                            
                                User input with a timeout, in a loop
                            
                                Wrong pip in conda env
                            
                                argparse - Build back command line
                            
                                How to change space between bars when drawing multiple barplots in pandas?
                            
                                Unit testing elastic search inside Django app
                            
                                Spark program gives odd results when ran on standalone cluster
                            
                                Troubleshooting Websockets with EC2 on AWS using Django
                            
                                Generate all possible outcomes of k balls in n bins (sum of multinomial / categorical outcomes)
                            
                                Tensorflow implementation of word2vec
                            
                                Individual timeouts for concurrent.futures
                            
                                Bulk upsert with SQLAlchemy [duplicate]
                            
                                How to insert arbitrary JSON in HTML's script tag
                            
                                File not uploading with Flask-wtforms in cookiecutter-flask app
                            
                                How to maintain different country versions of same language in Django?
                            
                                Stuck implementing simple neural network
                            
                                MongoDB: Can't initiate replica set; 'has data already, cannot initiate set'
                            
                                Vectorizing the Kinect real-world coordinate processing algorithm for speed
                            
                                Using Django URLs with AngularJs routeProvider
                            
                                pip not installing entry_points as executables
                            
                                Order-invariant hash in Python

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Match set of x,y points to another set that is scaled, rotated, translated, and with missing elements

Tags:

performance

python

geometry

astronomy

affinetransform

Gabriel

People also ask

1 Answers

litepresence

Recent Activity

Donate For Us