Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Finding the best matching pairwise points from 2 vectors

Tags:

r

matrix

distance

I have 2 lists with X,Y coordinates of points. List 1 contains more points than list 2.

The task is to find pairs of points in a way that the overall euclidean distance is minimized.

I have a working code, but i don't know if this is the best way and I would like to get hint what I can improve for result (better algorithm to find the minimum ) or speed, because the list are about 2000 elements each.

The round in the sample vectors is implemented to get also points with same distances. With the "rdist" function all distances are generated in "distances". Than the minimum in the matrix is used to link 2 point ("dist_min"). All distances of these 2 points are now replaced by NA and the loop continues by searching the next minimum until all points of list 2 have a point from list 1. At the end I have added a plot for visualization.

require(fields)

set.seed(1)
x1y1.data <- matrix(round(runif(200*2),2), ncol = 2)   # generate 1st set of points 
x2y2.data <- matrix(round(runif(100*2),2), ncol = 2)   # generate 2nd set of points

distances <- rdist(x1y1.data, x2y2.data)
dist_min <- matrix(data=NA,nrow=ncol(distances),ncol=7)   # prepare resulting vector with 7 columns

for(i in 1:ncol(distances)) 
{
    inds <- which(distances == min(distances,na.rm = TRUE), arr.ind=TRUE)

    dist_min[i,1] <- inds[1,1]              # row of point(use 1st element of inds if points have same distance)
    dist_min[i,2] <- inds[1,2]              # column of point (use 1st element of inds if points have same distance)
    dist_min[i,3] <- distances[inds[1,1],inds[1,2]] # distance of point
    dist_min[i,4] <- x1y1.data[inds[1,1],1]     # X1 ccordinate of 1st point
    dist_min[i,5] <- x1y1.data[inds[1,1],2]     # Y1 coordinate of 1st point
    dist_min[i,6] <- x2y2.data[inds[1,2],1]     # X2 coordinate of 2nd point
    dist_min[i,7] <- x2y2.data[inds[1,2],2]     # Y2 coordinate of 2nd point

    distances[inds[1,1],] <- NA # remove row (fill with NA), where minimum was found
    distances[,inds[1,2]] <- NA # remove column (fill with NA), where minimum was found
}

# plot 1st set of points
# print mean distance as measure for optimization
plot(x1y1.data,col="blue",main="mean of min_distances",sub=mean(dist_min[,3],na.rm=TRUE))       
points(x2y2.data,col="red")                         # plot 2nd set of points
segments(dist_min[,4],dist_min[,5],dist_min[,6],dist_min[,7])   # connect pairwise according found minimal distance

output with min distance

like image 567
user1716533 Avatar asked Dec 19 '12 21:12

user1716533


1 Answers

This is a fundamental problem in combinatorial optimization known as the assignment problem. One approach to solving the assignment problem is the Hungarian algorithm which is implemented in the R package clue:

require(clue)
sol <- solve_LSAP(t(distances))

We can verify that it outperforms the naive solution:

mean(dist_min[,3])
# [1] 0.05696033
mean(sqrt(
  (x2y2.data[,1] - x1y1.data[sol, 1])^2 +  
    (x2y2.data[,2] - x1y1.data[sol, 2])^2))
#[1] 0.05194625

And we can construct a similar plot to the one in your question:

plot(x1y1.data,col="blue")       
points(x2y2.data,col="red")
segments(x2y2.data[,1], x2y2.data[,2], x1y1.data[sol, 1], x1y1.data[sol, 2])

enter image description here

like image 53
orizon Avatar answered Oct 12 '22 23:10

orizon