
For each point in one data set, calculate distance to nearest point in second data set

I'm trying to find, for each point in a SpatialPointsDataFrame, the distance to the closest point in a second SpatialPointsDataFrame (equivalent to the "nearest" tool in ArcGIS for two SpatialPointsDataFrames).

I can do the naive implementation by calculating all pairwise distances using gDistance and taking the min (like answer 1 here), but I have some huge datasets and was looking for something more efficient.

For example, here's a trick with knearneigh for points in the same dataset.

Cross-posted on r-sig-geo

nick_eu asked May 19 '16 20:05


1 Answer

The SearchTrees package offers one solution. Quoting from its documentation, it "provides an implementation of the QuadTree data structure [which it] uses to implement fast k-Nearest Neighbor [...] lookups in two dimensions."

Here's how you could use it to quickly find, for each point in a SpatialPoints object B, the two nearest points in a second SpatialPoints object A:

library(sp)
library(SearchTrees)

## Example data
set.seed(1)
A <- SpatialPoints(cbind(x=rnorm(100), y=rnorm(100)))
B <- SpatialPoints(cbind(x=c(-1, 0, 1), y=c(1, 0, -1)))

## Find indices of the two nearest points in A to each of the points in B
tree <- createTree(coordinates(A))
inds <- knnLookup(tree, newdat=coordinates(B), k=2)

## Show that it worked
plot(A, pch=1, cex=1.2)
points(B, col=c("blue", "red", "green"), pch=17, cex=1.5)
## Plot the two nearest neighbors of each point in B, colored to match
points(A[inds[1,],], pch=16, col=adjustcolor("blue", alpha.f=0.7))
points(A[inds[2,],], pch=16, col=adjustcolor("red", alpha.f=0.7))
points(A[inds[3,],], pch=16, col=adjustcolor("green", alpha.f=0.7))

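Since the question asks for the distance to the nearest point rather than its index, here is a minimal sketch of how the lookup result could be turned into distances. It assumes the coordinates are planar (projected), so that plain Euclidean distance is meaningful; for longitude/latitude data a different distance measure would be needed. The names nn_idx and nn_dist are only illustrative.

```r
library(sp)
library(SearchTrees)

## Same example data as above
set.seed(1)
A <- SpatialPoints(cbind(x = rnorm(100), y = rnorm(100)))
B <- SpatialPoints(cbind(x = c(-1, 0, 1), y = c(1, 0, -1)))

## Index of the single nearest point in A for each point in B
tree <- createTree(coordinates(A))
nn_idx <- knnLookup(tree, newdat = coordinates(B), k = 1)[, 1]

## Euclidean distance from each point in B to its nearest point in A
## (assumes planar coordinates)
diffs <- coordinates(B) - coordinates(A)[nn_idx, , drop = FALSE]
nn_dist <- sqrt(rowSums(diffs^2))

data.frame(nearest_in_A = nn_idx, distance = nn_dist)
```

Only one distance per point in B is computed after the lookup, so the full pairwise distance matrix is never built.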

Josh O'Brien answered Oct 14 '22 22:10