
Searching an Image Database Using SIFT

Several questions have been asked about the SIFT algorithm, but they all seem focussed on a simple comparison between two images. Instead of determining how similar two images are, would it be practical to use SIFT to find the closest matching image out of a collection of thousands of images? In other words, is SIFT scalable?

For example, would it be practical to use SIFT to generate keypoints for a batch of images, store the keypoints in a database, and then find the ones that have the shortest Euclidean distance to the keypoints generated for a "query" image?

When calculating the Euclidean distance, would you ignore the x, y, scale, and orientation parts of the keypoints, and only look at the descriptor?

Cerin asked Mar 02 '11 04:03




1 Answer

There are several approaches.

One popular approach is the so-called bag-of-words representation, which matches images based solely on how many descriptors match. It ignores the location part of each keypoint (x, y, scale, and orientation) and looks only at the descriptor.
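As a rough illustration of the bag-of-words idea, here is a minimal pure-Python sketch. It assumes descriptors are already extracted as plain lists of floats (real SIFT descriptors have 128 dimensions), builds a small "visual vocabulary" with a toy k-means, and compares images by the cosine similarity of their word histograms. The function names (`kmeans`, `bow_histogram`, `cosine`) are made up for this example, and a real system would use an optimized clustering library.

```python
import math
import random
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(descriptor, centers):
    """Index of the visual word (cluster center) closest to the descriptor."""
    return min(range(len(centers)), key=lambda i: sq_dist(descriptor, centers[i]))


def kmeans(points, k, iters=10, seed=0):
    """Toy k-means used to build the visual vocabulary from pooled descriptors."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centers)].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old center if a cluster ends up empty
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return centers


def bow_histogram(descriptors, centers):
    """L2-normalized count of how often each visual word occurs in one image.

    Keypoint position, scale, and orientation are deliberately not used.
    """
    counts = Counter(nearest(d, centers) for d in descriptors)
    hist = [counts.get(i, 0) for i in range(len(centers))]
    norm = math.sqrt(sum(h * h for h in hist)) or 1.0
    return [h / norm for h in hist]


def cosine(a, b):
    """Cosine similarity of two already-normalized histograms."""
    return sum(x * y for x, y in zip(a, b))
```

To search a collection, you would compute one histogram per database image ahead of time, then rank the images by `cosine(query_hist, image_hist)`.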

Efficient querying of a large database may use approximate methods such as locality-sensitive hashing.
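To make the hashing idea concrete, here is a small sketch of one common LSH scheme for Euclidean distance: random projections quantized into buckets. The class name `EuclideanLSH`, its parameters, and the `"image:keypoint"` labels are all assumptions for illustration; production systems typically rely on a tuned approximate nearest-neighbor library instead.

```python
import math
import random
from collections import defaultdict


class EuclideanLSH:
    """Toy LSH index: h(v) = floor((a . v + b) / w) with Gaussian a."""

    def __init__(self, dim, n_tables=8, n_projections=4, bucket_width=4.0, seed=0):
        rng = random.Random(seed)
        self.w = bucket_width
        self.tables = []
        for _ in range(n_tables):
            # Each table concatenates several random projections into one key.
            funcs = [
                ([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, bucket_width))
                for _ in range(n_projections)
            ]
            self.tables.append((funcs, defaultdict(list)))

    def _key(self, funcs, v):
        return tuple(
            math.floor((sum(a * x for a, x in zip(proj, v)) + b) / self.w)
            for proj, b in funcs
        )

    def add(self, v, label):
        """Store descriptor v under a label such as 'image_id:keypoint_index'."""
        for funcs, buckets in self.tables:
            buckets[self._key(funcs, v)].append((label, v))

    def query(self, v):
        """Return (distance, label) pairs for colliding descriptors, nearest first."""
        candidates = {}
        for funcs, buckets in self.tables:
            for label, u in buckets.get(self._key(funcs, v), []):
                candidates[label] = u
        return sorted((math.dist(v, u), label) for label, u in candidates.items())
```

Only descriptors that share a bucket with the query in at least one table are compared exactly, which is where the speed-up over a brute-force scan comes from. The trade-off is that a true nearest neighbor can occasionally be missed.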

Other methods may involve vocabulary trees or other data structures.

For an efficient method that also takes location information into account, check out pyramid match kernels.
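Here is a minimal sketch of one way location can enter the comparison. It assumes the spatial-pyramid variant of pyramid matching (histograms of visual words per image cell at several grid resolutions, compared by weighted histogram intersection), which may not be the exact formulation meant above. It also assumes each keypoint has already been reduced to `(x, y, word)` with coordinates normalized to [0, 1); the function names are hypothetical.

```python
def spatial_pyramid(keypoints, n_words, levels=2):
    """Build one visual-word histogram per grid level (1x1, 2x2, 4x4, ...).

    keypoints: iterable of (x, y, word) with x, y in [0, 1) and
    word an integer in range(n_words).
    """
    pyramid = []
    for level in range(levels + 1):
        cells = 2 ** level
        hist = [0] * (cells * cells * n_words)
        for x, y, word in keypoints:
            cx = min(int(x * cells), cells - 1)
            cy = min(int(y * cells), cells - 1)
            hist[(cy * cells + cx) * n_words + word] += 1
        pyramid.append(hist)
    return pyramid


def intersection(a, b):
    """Histogram intersection: number of features matched bin by bin."""
    return sum(min(x, y) for x, y in zip(a, b))


def pyramid_match(pa, pb):
    """Weighted sum of intersections; finer grid levels get larger weights."""
    top = len(pa) - 1
    score = intersection(pa[0], pb[0]) / 2 ** top
    for level in range(1, top + 1):
        score += intersection(pa[level], pb[level]) / 2 ** (top - level + 1)
    return score
```

Two images containing the same visual words in different places still match at the coarsest level, but they score lower than images where those words also fall in the same cells.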

peakxu answered Oct 29 '22 09:10
