Database supporting fast approximate nearest neighbor queries [closed]

Question

Is there a database that supports fast approximate nearest neighbor queries in high-dimensional vector spaces?

I'm looking for a database that would fit the following use case:

Works for millions of points
Works for hundreds-thousands of dimensions
Potentially uses cover trees or locality sensitive hashing for indexing

Does a robust implementation of this exist?

smocking · Accepted Answer

There's the ANN library which works pretty well for high-dimensional large datasets, but it's not a full "database" and is not a distributed solution.

There's a startup called SpaceCurve (no relation to me) working on a commercial spatial database, so depending on your needs and budget they might be worth looking into.

As a piece of advice: you should think deeply about what "nearest neighbor" really means when you're talking about "hundreds-thousands of dimensions". If you take a million random points in a 20-dimensional cube, the average distance between any two nearest neighbors is already about half the length of an edge of the cube.

This only gets worse exponentially as you add dimensions. Once you're talking about hundreds dimensions you really need impossibly large amounts of points (like > 10³⁰) if they're somewhat uniformly distributed; and if they're differently distributed you're better off with other classification approaches.

Marek Galovic · Answer

Take a look at AnnDB.

It's a distributed approximate nearest neighbors database that allows you to horizontally scale to millions of high-dimensional vectors.

Disclaimer: I'm the author of AnnDB.

Database supporting fast approximate nearest neighbor queries [closed]

Tags:

algorithm

database

nearest-neighbor

Ben Hamner

Video Answer

2 Answers

smocking

Marek Galovic

Recent Activity

Donate For Us

Database supporting fast approximate nearest neighbor queries [closed]

Tags:

algorithm

database

nearest-neighbor

Ben Hamner

Video Answer

2 Answers

smocking

Marek Galovic

Related questions

Recent Activity

Donate For Us