Let this coordinates class with the Euclidean distance,
case class coord(x: Double, y: Double) {
def dist(c: coord) = Math.sqrt( Math.pow(x-c.x, 2) + Math.pow(y-c.y, 2) )
}
and let a grid of coordinates, for instance
val grid = (1 to 25).map {_ => coord(Math.random*5, Math.random*5) }
Then for any given coordinate
val x = coord(Math.random*5, Math.random*5)
the nearest points to x
are
val nearest = grid.sortWith( (p,q) => p.dist(x) < q.dist(x) )
so the first three closest are nearest.take(3)
.
Is there a way to make these calculations more time efficient especially for the case of a grid with one million points ?
The average nearest neighbor ratio is calculated as the observed average distance divided by the expected average distance (with expected average distance being based on a hypothetical random distribution with the same number of features covering the same total area).
This is called k-nearest neighbor (KNN) search or similarity search and has all kinds of useful applications. Examples here are model-free classification, pattern recognition, collaborative filtering for recommendation, and data compression, to name but a few.
For body centered cubic lattice nearest neighbour distance is half of the body diagonal distance, a√3/2. Threfore there are eight nearest neighnbours for any given lattice point. For face centred cubic lattice nearest neighbour distance is half of the face diagonal distance, a√2/2.
I'm not sure if this is helpful (or even stupid), but I thought of this:
You use a sort-function to sort ALL elements in the grid and then pick the first k
elements. If you consider a sorting algorithm like recursive merge-sort, you have something like this:
Maybe you could optimize such a function for your needs. The merging part normally merges all elements from both halves, but you are only interested in the first k
that result from the merging. So you could only merge until you have k
elements and ignore the rest.
So in the worst-case, where k >= n
(n
is the size of the grid) you would still only have the complexity of merge-sort. O(n log n)
To be honest I'm not able to determine the complexity of this solution relative to k
. (too tired for that at the moment)
Here is an example implementation of that solution (it's definitely not optimal and not generalized):
def minK(seq: IndexedSeq[coord], x: coord, k: Int) = {
val dist = (c: coord) => c.dist(x)
def sort(seq: IndexedSeq[coord]): IndexedSeq[coord] = seq.size match {
case 0 | 1 => seq
case size => {
val (left, right) = seq.splitAt(size / 2)
merge(sort(left), sort(right))
}
}
def merge(left: IndexedSeq[coord], right: IndexedSeq[coord]) = {
val leftF = left.lift
val rightF = right.lift
val builder = IndexedSeq.newBuilder[coord]
@tailrec
def loop(leftIndex: Int = 0, rightIndex: Int = 0): Unit = {
if (leftIndex + rightIndex < k) {
(leftF(leftIndex), rightF(rightIndex)) match {
case (Some(leftCoord), Some(rightCoord)) => {
if (dist(leftCoord) < dist(rightCoord)) {
builder += leftCoord
loop(leftIndex + 1, rightIndex)
} else {
builder += rightCoord
loop(leftIndex, rightIndex + 1)
}
}
case (Some(leftCoord), None) => {
builder += leftCoord
loop(leftIndex + 1, rightIndex)
}
case (None, Some(rightCoord)) => {
builder += rightCoord
loop(leftIndex, rightIndex + 1)
}
case _ =>
}
}
}
loop()
builder.result
}
sort(seq)
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With