 

Comparing SIFT features stored in a MySQL database

I'm currently extending an image library used to categorize images, and I want to find duplicate images, transformed images, and images that contain or are contained in other images.
I have tested the SIFT implementation from OpenCV and it works very well, but it would be rather slow for multiple images. To speed it up, I thought I could extract the features and save them in a database, since a lot of other image-related metadata is already being held there.
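For reference, the extraction step I mean looks roughly like this (a minimal sketch; it assumes OpenCV 4.4+, where SIFT is available in the main module, and a hypothetical table/column layout on the storage side):

    import cv2
    import numpy as np

    # Extract SIFT keypoints and descriptors from a grayscale image.
    img = cv2.imread("photo.jpg", cv2.IMREAD_GRAYSCALE)
    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.detectAndCompute(img, None)

    if descriptors is not None:
        # descriptors is an (N, 128) float32 array; it can be stored as
        # a BLOB next to the existing image metadata, e.g.:
        # INSERT INTO image_features (image_id, descriptors) VALUES (%s, %s)
        blob = descriptors.tobytes()
        # ...and restored later without loss:
        restored = np.frombuffer(blob, dtype=np.float32).reshape(-1, 128)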

What would be the fastest way to compare the features of a new image to the features in the database?
Usually, comparison is done by calculating the Euclidean distance using kd-trees, FLANN, or the Pyramid Match Kernel, which I found in another thread here on SO but haven't looked into much yet.
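That kind of descriptor matching with OpenCV's FLANN bindings looks roughly like this (a hedged sketch; desc_a and desc_b stand for the float32 descriptor arrays from the snippet above):

    import cv2

    def match_count(desc_a, desc_b, ratio=0.7):
        # FLANN with a kd-tree index (algorithm=1) over SIFT descriptors
        flann = cv2.FlannBasedMatcher(dict(algorithm=1, trees=5),
                                      dict(checks=50))
        matches = flann.knnMatch(desc_a, desc_b, k=2)
        # Lowe's ratio test to drop ambiguous matches
        good = 0
        for pair in matches:
            if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
                good += 1
        return good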

Since I don't know of a way to save and search a kd-tree efficiently in a database, I currently see only three options:
* Let MySQL calculate the Euclidean distance to every feature in the database, although I'm sure that will take an unreasonable amount of time for more than a few images.
* Load the entire dataset into memory at the start and build the kd-tree(s) there (see the sketch after this list). This would probably be fast, but it is very memory-intensive, and all the data would need to be transferred from the database.
* Save the generated trees in the database and load all of them. This would be the fastest method, but it would also generate a lot of traffic, since the kd-trees would have to be rebuilt and sent to the server whenever new images come in.
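To illustrate option 2, here is a rough sketch of what I have in mind (table and column names are hypothetical; it assumes SciPy and a MySQL connector are available):

    import numpy as np
    import mysql.connector
    from scipy.spatial import cKDTree

    # Pull every stored descriptor out of the database once...
    conn = mysql.connector.connect(user="app", database="images")
    cur = conn.cursor()
    cur.execute("SELECT image_id, descriptors FROM image_features")
    ids, chunks = [], []
    for image_id, blob in cur:
        desc = np.frombuffer(blob, dtype=np.float32).reshape(-1, 128)
        ids.extend([image_id] * len(desc))
        chunks.append(desc)

    # ...and build one big in-memory kd-tree (fast to query, memory-heavy).
    tree = cKDTree(np.vstack(chunks))
    # For a new image: dists, idx = tree.query(new_descriptors, k=1)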

I'm using the SIFT implementation from OpenCV, but I'm not dead set on it. If there is a feature extractor that is more suitable for this task (and roughly as robust), I'd be glad if someone could suggest it.

asked Aug 17 '10 by Darcara


1 Answer

So I basically did something very similar to this a few years ago. The algorithm you want to look into was proposed by David Nistér and Henrik Stewénius in the paper "Scalable Recognition with a Vocabulary Tree". They pretty much have an exact solution to your problem that can scale to millions of images.

Here is a link to the abstract; you can find a download link by googling the title: http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1641018

The basic idea is to build a tree with a hierarchical k-means algorithm to model the features, and then leverage the sparse distribution of features in that tree to quickly find your nearest neighbors... or something like that; it's been a few years since I worked on it. You can find a PowerPoint presentation on the authors' webpage here: http://www.vis.uky.edu/~dnister/Publications/publications.html
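Very loosely, the hierarchical k-means step looks like the following toy sketch (not the paper's implementation; k and depth are illustrative, though the paper uses a branch factor around 10 and roughly 6 levels):

    import numpy as np
    from scipy.cluster.vq import kmeans2

    def build_vocab_tree(descriptors, k=10, depth=6):
        # Recursively split the descriptors into k clusters until the
        # maximum depth is reached or too few descriptors remain.
        if depth == 0 or len(descriptors) < k:
            return None
        centers, labels = kmeans2(descriptors, k, minit="points")
        children = [build_vocab_tree(descriptors[labels == i], k, depth - 1)
                    for i in range(k)]
        return {"centers": centers, "children": children}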

A few other notes:

  • I wouldn't bother with the Pyramid Match Kernel; it's really aimed more at improving object recognition than at duplicate/transformed image detection.

  • I would not store any of this feature data in an SQL database. Depending on your application, it is sometimes more effective to compute your features on the fly, since their size can exceed the original image size when computed densely. Histograms of features, or pointers to nodes in a vocabulary tree, are much more efficient (see the sketch after this list).

  • SQL databases are not designed for doing massive floating-point vector calculations. You can store things in your database, but don't use it as a tool for computation. I tried this once with SQLite and it ended very badly.

  • If you decide to implement this, read the paper in detail and keep a copy handy while implementing it, as there are many minor details that are very important to making the algorithm work efficiently.
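To make the histogram/pointer point concrete: once a vocabulary tree exists, each image can be reduced to a sparse bag of visited tree nodes, something like this toy sketch (it reuses the tree dict from the sketch above; all names are illustrative):

    import numpy as np
    from collections import Counter

    def quantize(desc, node, path=()):
        # Walk one descriptor down the tree, always toward the nearest
        # cluster center, and record the path taken.
        if node is None:
            return path
        i = int(np.argmin(np.linalg.norm(node["centers"] - desc, axis=1)))
        return quantize(desc, node["children"][i], path + (i,))

    def image_signature(descriptors, tree):
        # A sparse histogram over tree nodes; tiny compared to the
        # raw (N, 128) float32 descriptor array.
        return Counter(quantize(d, tree) for d in descriptors)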

answered Oct 13 '22 by Doug