
David Lowe's SIFT -- Question about scale space and image coordinates (weird offset problem)

I realize this is a highly specialized question, but here goes. I am using an implementation of SIFT to find matches between two images. With the current implementation I have, when I match an image with its 90- or 180-degree rotated version, I get matches that are consistently off by around half a pixel, though the exact amount varies within a range. For example, if a match is found at pixel coordinate (x, y) in im1, then the corresponding match in its 90-degree rotated image im2 is at (x, y + 0.5). If I use a 180-degree rotated image, the offset appears in both the x and y coordinates, and only in x if I use a 270-degree (-90) rotated image.

1) First of all, I am assuming SIFT should give me the same matching location in a rotated image. An implicit assumption is that the rotation does not change the pixel values of the image, which I confirmed is true. (I use IrfanView to rotate and save as a .pgm, and the pixel values remain unchanged.)

2) I have other implementations which do not give this offset.

3) I am assuming this offset is programming-related and possibly has to do with the conversion from scale-space keypoint coordinates to image-space keypoint coordinates.

I'm hoping someone has run across this problem or can point me to a reference on how to convert from scale-space to image-space.
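For concreteness, the experiment can be reproduced with an off-the-shelf detector. This is a sketch, not the implementation in question: it assumes OpenCV >= 4.4 (where cv2.SIFT_create is available), and im1.pgm is a placeholder filename.

```python
import cv2
import numpy as np

im1 = cv2.imread("im1.pgm", cv2.IMREAD_GRAYSCALE)  # placeholder filename
im2 = np.rot90(im1).copy()  # exact 90-degree rotation: pixels are only permuted

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(im1, None)
kp2, des2 = sift.detectAndCompute(im2, None)

matches = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True).match(des1, des2)

w = im1.shape[1]
for m in matches:
    x1, y1 = kp1[m.queryIdx].pt
    x2, y2 = kp2[m.trainIdx].pt
    # np.rot90 maps pixel (x, y) of im1 to (y, w - 1 - x) in im2; invert
    # that map to bring the im2 match back into im1's coordinate frame.
    xb, yb = (w - 1) - y2, x2
    print(f"offset: ({x1 - xb:+.2f}, {y1 - yb:+.2f})")
```

A systematic non-zero offset in these printouts is exactly the symptom described above.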

asked Jun 10 '11 by Mustafa


2 Answers

Contrary to Mikola's assertions, it is possible to get scale and orientation out of SIFT. SIFT finds the scale with the greatest DoG (difference-of-Gaussians) extremum, s, and also finds a dominant orientation, r. Each SIFT feature's location vector is (x, y, s, r).

To see how scale space converts to pixels, try VLFeat's implementation. In particular, use vl_plotsiftdescriptor to plot the descriptors; you can see how s scales relative to pixels in this implementation. To figure out other implementations, find the same feature with both and see how the scale factor s differs.
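As an illustration of the (x, y, s, r) tuple, here is a sketch using OpenCV's SIFT (not VLFeat; the attribute names and the placeholder filename are OpenCV-specific assumptions):

```python
import cv2

im = cv2.imread("im1.pgm", cv2.IMREAD_GRAYSCALE)  # placeholder filename
keypoints = cv2.SIFT_create().detect(im, None)

for kp in keypoints[:5]:
    x, y = kp.pt   # sub-pixel location in image coordinates
    s = kp.size    # diameter of the feature neighborhood (scale, in pixels)
    r = kp.angle   # dominant orientation, in degrees
    print(f"x={x:.2f} y={y:.2f} s={s:.2f} r={r:.1f}")
```

How kp.size relates to the internal DoG scale is implementation-specific, which is why comparing the same feature across implementations, as suggested above, is the reliable way to calibrate s.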

answered by peakxu


First a general comment:

~~SIFT just gives you features with x, y locations in pixel coordinates. It doesn't tell you anything directly about the scale or rotation of a given feature by design; in fact, it is the defining characteristic of SIFT that the feature vector is invariant under these types of transformations (i.e., this is why SIFT works).~~ (EDIT: This is wrong, WTF was I thinking when I wrote this?)

An offset of 0.5 pixels is insignificant, and there could be a large number of possible explanations for the difference. One possibility is that the two implementations use different origin conventions; for example, one puts the origin in the middle while the other puts it at a corner. This can affect rounding, which could account for a difference of 0.5 in the reported pixel locations. Another possibility is that they differ in the number of rotation samples used, or perhaps in the number of scales considered. Changing either of these parameters could conceivably shift the observed feature by as much as a few pixels. Of course, this is all pure speculation, since one would have to actually see the implementation to say anything definitive.
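To make the half-pixel pattern concrete: if an implementation reports every keypoint at its true position plus a small constant bias b (which is what a coordinate-origin convention amounts to), then mapping the rotated image's matches back through the exact rotation R and comparing them against the original image's matches yields an apparent offset of (R^-1 - I)b. A sketch of that arithmetic (the bias value is hypothetical, and the rotation matrices use the mathematical y-up convention; signs flip in image coordinates but the pattern is the same):

```python
import numpy as np

def apparent_offset(bias, theta_deg):
    """Offset between a feature's reported position in the original image
    and its back-rotated reported position from the rotated image, when
    every report carries a constant bias: (R^-1 - I) @ bias."""
    t = np.deg2rad(theta_deg)
    r_inv = np.array([[np.cos(t),  np.sin(t)],
                      [-np.sin(t), np.cos(t)]])
    return (r_inv - np.eye(2)) @ bias

b = np.array([0.25, 0.25])  # hypothetical constant reporting bias
for theta in (90, 180, 270):
    print(theta, np.round(apparent_offset(b, theta), 3))
# 90 -> offset in one axis only, 180 -> in both, 270 -> in the other axis,
# each with magnitude 0.5 -- the same pattern reported in the question.
```

This does not prove that a convention bias is the cause here, but it shows how one could produce exactly the rotation-dependent pattern the questioner observed.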

Now to address your more specific concerns:

  1. This is a bad assumption. Rectilinearly sampled images are not, in general, invariant under rotation. Even rotating by multiples of 90 degrees can cause problems if your SIFT implementation samples a number of rotations that is not a multiple of 4. With enough samples you can expect it to get near the correct result, but it will almost never be exact (except in some very special degenerate situations).

  2. How do you know the other implementations are giving the correct locations? They may all be clones or ports of the same code base and could share similar bugs.

  3. I don't know why you would expect it to be the same, since SIFT relies on a number of internal twiddle factors which can vary between implementations.

Finally, I am not sure what you mean by "convert from scale-space to image-space". Scale-space is defined for images, not points, and there is no 1:1 mapping between coordinates in scale space and image space. If you want to translate a scale-space image into a regular image, take the 0-scale slice. If you want to turn an image into a scale-space representation, convolve it with a bunch of Gaussians of varying radii.
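A minimal single-octave sketch of that construction, assuming NumPy and SciPy (a full SIFT pyramid additionally downsamples into octaves; sigma0 = 1.6 follows Lowe's paper, the other values are illustrative):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def scale_space(im, n_levels=5, sigma0=1.6, k=np.sqrt(2)):
    """Stack of progressively blurred copies of im: level i is im
    convolved with a Gaussian of sigma = sigma0 * k**i."""
    im = im.astype(np.float32)
    return np.stack([gaussian_filter(im, sigma0 * k**i)
                     for i in range(n_levels)])

# The "0-scale slice" mentioned above corresponds to the bottom of the
# stack (the least-blurred level), or to the original image itself:
# stack = scale_space(im); image_like = stack[0]
```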

answered by Mikola