Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Approximate, incremental nearest-neighbour algorithm for moving bodies

Bounty

This question raises several issues. The bounty will go to an answer which addresses them holistically.


Here's a problem I've been playing with.

NOTE I'm especially interested in solutions that are not based in Euclidian space.

There is a set of Actors which form a crowd of size K. The distance d(ActorA,ActorB) is easily computable for any two actors (solutions should work for various definitions of 'distance') and we can find the set of N nearest neighbours for any given Actor using any of a number of established algorithms.

This neighbour set is correct at the first instant but the Actors are always moving and I want to maintain the evolving list of N nearest neighbours for each Actor. What I am interested in is approximate solutions which are more efficient than perfect solutions.

  1. Solutions should converge to correctness after errors have been introduced.
  2. It is acceptable to sometimes perform a full recomputation if the errors become too large but detecting these errors should be cheap.

So far I have been using a friend-of-a-friend algorithm:

recompute_nearest (Actor A)
{
    Actor f_o_f [N*N];
    for each Actor n in A.neighbours
        append n to f_o_f if n != A and n not in f_o_f

    Distance distances [N*N];
    for 0 <= i < f_o_f.size
        distances [i] = distance (A, f_o_f [i])

    sort (f_o_f, distances)
    A .neighbours = first N from f_o_f
}

This performs reasonably well when the crowd is slow-moving and N is suitably large. It converges after small errors, satisfying the first criteria, but

  • I don't have good way to detect large errors,
  • I have no quantitative description of the size and frequency of errors,
  • it converges in practice but I can't prove that it always will.

Can you help with any of these points?

Also, do you know of any alternative approaches which perform well

  • when the crowd is fast-moving,
  • when some actors are fast-moving,
  • when N is small,
  • when the crowd is sparse in some places and dense in others,
  • or with particular spacial-indexing algorithms?

The extension I'm working on at the moment is to generalise friend-of-a-friend to take the friend-of-a-friend-of-a-friend in cases when a neighbour is fast-moving. I suspect that this doesn't scale to well and it's hard to derive the right parameters without a quantification of the errors.

I welcome all suggestions! It's a fun little problem :-)


Notable Suggestions So Far

Fexvez: sample random extra neighbours, sample size depending on the speed of the Agent. Sampling from the area it's about to move into would probably also help.

Resample the neighbours when an agents speed*delta_time exceeds the distance to the furthest known neighbour.

Maintain the Delaunay triangulation which is a superset of the nearest-neighbour graph. Only accounts for one nearest neighbour.

David Mount's ANN library Doesn't seem to handle moving bodies.

like image 865
spraff Avatar asked Jul 29 '11 10:07

spraff


People also ask

How does approximate nearest neighbor work?

In approximately nearest neighbors (ANN), we build index structures that narrow down the search space. The implicit neighborhoods in such indexes also help reduce the problem of high dimensions. You can roughly divide the approaches used for ANNs into whether or not they can be implemented using an inverse index.

What is the nearest-neighbor algorithm graph theory?

The k-nearest neighbor graph (k-NNG) is a graph in which two vertices p and q are connected by an edge, if the distance between p and q is among the k-th smallest distances from p to other objects from P.

What is nearest-neighbor algorithm in math?

One strategy for solving the traveling salesman problem is the nearest-neighbor algorithm. Simply stated, when given a choice of vertices this algorithm selects the nearest (i.e., least cost) neighbor. In our applet below your goal is to select a Hamiltonian circuit using the nearest-neighbor algorithm.


2 Answers

A very simple and very fast method from statistics is to use random linear projections. These can help you determine clusters and neighbors very quickly. With more projections, you get more accuracy (addressing your question about errors, I believe).

This paper offers an extensive quantitative analysis of several methods, including a new method (DPES) that is related to RLP.

This paper addresses use of RLP including for distance preservation even in the context of moving points.

This paper addresses RLP for motion planning and details several heuristics.

RLP methods are:

  1. Very fast
  2. Lead to approximations that are tunable for accuracy and speed
  3. Distance and angle preserving (provable)
  4. Easily scaled to large dimensions and large #s of objects
  5. Useful for dimensionality reduction
  6. Lead to compact projections (e.g. one could project into a hierarchical binary partitioning)
  7. Flexible: you can project into whatever space you feel would be good for you - usually R^d, but projecting into 2^d (i.e. binary space of dimension d) is also allowable, only subject to a reduction in accuracy for a given # of projections.
  8. Statistically interesting

After embedding into a lower dimensional space, neighbor calculations are very easy, as projections that are, say, binned in the same regions (if you bin the projections into a grid) are very likely to be close in the original space.

Although the dimensionality of the original data is small (even 10 is small), the ability to rapidly project into a pre-selected grid is very useful for identifying and counting neighbors.

Finally, you only need to update for those objects whose location (or relative location, if you're centering and scaling the data) have changed.

For related works, look into the Johnson-Lindenstrauss lemma.

like image 86
Iterator Avatar answered Nov 15 '22 13:11

Iterator


Have you tried using a quad tree?

That is, recursively partition the "world" into a graph with four subnodes each. The tree can then quickly check which objects are inside a particular square of the world and discard the rest. A very effective culling technique often used for improving performance of collision detection in games.

If this is 3D, extend it to an octree.

like image 42
Martin Wickman Avatar answered Nov 15 '22 13:11

Martin Wickman