Efficient nearest neighbour search in Scala

Consider this coordinate class with a Euclidean distance method,

case class coord(x: Double, y: Double) {
  def dist(c: coord) = Math.sqrt( Math.pow(x-c.x, 2) + Math.pow(y-c.y, 2) ) 
}

and a grid of coordinates, for instance

val grid = (1 to 25).map {_ => coord(Math.random*5, Math.random*5) }

Then for any given coordinate

val x = coord(Math.random*5, Math.random*5)

the grid points ordered by increasing distance to x are

val nearest = grid.sortWith( (p,q) => p.dist(x) < q.dist(x) )

so the first three closest are nearest.take(3).

Is there a way to make these calculations more time-efficient, especially for a grid with one million points?

asked Sep 06 '14 05:09

elm

1 Answer

I'm not sure if this is helpful (or even stupid), but I thought of this:

You use a sort-function to sort ALL elements in the grid and then pick the first k elements. If you consider a sorting algorithm like recursive merge-sort, you have something like this:

Split collection in half
Recurse on both halves
Merge both sorted halves

Maybe you could optimize such a function for your needs. The merging part normally merges all elements from both halves, but you are only interested in the first k elements of the merged result. So you could merge only until you have k elements and ignore the rest.
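
To make the "merge only until you have k elements" step concrete, here is a minimal sketch of it in isolation. It assumes both inputs are already sorted sequences of distances; mergeFirstK is a hypothetical name used only for this illustration.

```scala
// Hypothetical helper: the first k elements of the merge of two sorted sequences.
def mergeFirstK(left: Vector[Double], right: Vector[Double], k: Int): Vector[Double] = {
  val out = Vector.newBuilder[Double]
  var i = 0 // next unread index in left
  var j = 0 // next unread index in right
  // Stop once k elements have been emitted or both inputs are exhausted.
  while (i + j < k && (i < left.length || j < right.length)) {
    // Take from left if right is exhausted, or if left's head is not larger.
    val takeLeft = j >= right.length || (i < left.length && left(i) <= right(j))
    if (takeLeft) { out += left(i); i += 1 }
    else { out += right(j); j += 1 }
  }
  out.result()
}
```

Each call does at most k steps, no matter how long the two inputs are.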

So in the worst case, where k >= n (n is the size of the grid), you would still only have the complexity of merge sort, O(n log n). To be honest, I'm not able to determine the complexity of this solution relative to k (too tired for that at the moment).

Here is an example implementation of that solution (it's definitely not optimal and not generalized):

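import scala.annotation.tailrec

// Returns up to k elements of seq closest to x, in ascending order of distance.
def minK(seq: IndexedSeq[coord], x: coord, k: Int): IndexedSeq[coord] = {

  // Distance from a candidate point to the query point x
  val dist = (c: coord) => c.dist(x)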

  def sort(seq: IndexedSeq[coord]): IndexedSeq[coord] = seq.size match {
    case 0 | 1 => seq
    case size => {
      val (left, right) = seq.splitAt(size / 2)
      merge(sort(left), sort(right))
    }
  }

  def merge(left: IndexedSeq[coord], right: IndexedSeq[coord]) = {

    val leftF = left.lift
    val rightF = right.lift

    val builder = IndexedSeq.newBuilder[coord]

    @tailrec
    def loop(leftIndex: Int = 0, rightIndex: Int = 0): Unit = {
      if (leftIndex + rightIndex < k) {
        (leftF(leftIndex), rightF(rightIndex)) match {
          case (Some(leftCoord), Some(rightCoord)) => {
            if (dist(leftCoord) < dist(rightCoord)) {
              builder += leftCoord
              loop(leftIndex + 1, rightIndex)
            } else {
              builder += rightCoord
              loop(leftIndex, rightIndex + 1)
            }
          }
          case (Some(leftCoord), None) => {
            builder += leftCoord
            loop(leftIndex + 1, rightIndex)
          }
          case (None, Some(rightCoord)) => {
            builder += rightCoord
            loop(leftIndex, rightIndex + 1)
          }
          case _ =>
        }
      }
    }

    loop()

    builder.result()
  }

  sort(seq)
}
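
A different technique, not the truncated merge sort above, is to scan the grid once and keep only the k best candidates in a bounded max-heap. The following is just a sketch under my own assumptions: nearestK is a hypothetical name, and the coord class is repeated from the question so the block runs on its own. It should need roughly O(n log k) time and O(k) extra memory, but I have not benchmarked it against the code above.

```scala
import scala.collection.mutable

// Repeated from the question so this block is self-contained.
case class coord(x: Double, y: Double) {
  def dist(c: coord): Double = Math.sqrt(Math.pow(x - c.x, 2) + Math.pow(y - c.y, 2))
}

// Hypothetical alternative to minK: one pass with a max-heap of at most k entries.
def nearestK(points: Iterable[coord], x: coord, k: Int): Vector[coord] = {
  // Order (distance, point) pairs by distance; the heap head is the farthest candidate.
  val byDistance = Ordering.fromLessThan[(Double, coord)](_._1 < _._1)
  val heap = mutable.PriorityQueue.empty[(Double, coord)](byDistance)

  for (p <- points) {
    val d = p.dist(x) // each distance is computed exactly once
    if (heap.size < k) {
      heap.enqueue((d, p))
    } else if (heap.nonEmpty && d < heap.head._1) {
      // p is closer than the current farthest candidate, so replace it.
      heap.dequeue()
      heap.enqueue((d, p))
    }
  }

  // The heap holds at most k entries in heap order; sort them by ascending distance.
  heap.toVector.sortWith(_._1 < _._1).map(_._2)
}
```

With the grid and x from the question, nearestK(grid, x, 3) should contain the same points as nearest.take(3) as long as no two points are exactly equally far from x.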

answered Sep 28 '22 09:09

Kigyo

Tags: algorithm, scala, nearest-neighbor, kdtree, r-tree
