The point of the application is to recognize an image from an already set list of images. The list of images have had their SIFT descriptors extracted and saved in files. Nothing interesting here: <pre class="prettyprint"><code>std::vector<cv::KeyPoint> detectedKeypoints; cv::Mat objectDescriptors; // Extract data cv::SIFT sift; sift.detect(image, detectedKeypoints); sift.compute(image, detectedKeypoints, objectDescriptors); // Save the file cv::FileStorage fs(file, cv::FileStorage::WRITE); fs << "descriptors" << objectDescriptors; fs << "keypoints" << detectedKeypoints; fs.release(); </code></pre> Then the device takes a picture. SIFT descriptors are extracted in the same way. The idea now was to compare the descriptors to the ones from the files. I am doing that using the FLANN matcher from OpenCV. I am trying to quantify the similarity, image by image. After going through the whole list I should have the best match. <pre class="prettyprint"><code>const cv::Ptr<cv::flann::IndexParams>& indexParams = new cv::flann::KDTreeIndexParams(1); const cv::Ptr<cv::flann::SearchParams>& searchParams = new cv::flann::SearchParams(64); // Match using Flann cv::Mat indexMat; cv::FlannBasedMatcher matcher(indexParams, searchParams); std::vector< cv::DMatch > matches; matcher.match(objectDescriptors, readDescriptors, matches); </code></pre> After matching I understand that I get a list of the closest found distances between the feature vectors. I find the minimum distance and, using it I can count "good matches" and even get a list of the respective points: <pre class="prettyprint"><code>// Count the number of mathes where the distance is less than 2 * min_dist int goodCount = 0; for (int i = 0; i < objectDescriptors.rows; i++) { if (matches[i].distance < 2 * min_dist) { ++goodCount; // Save the points for the homography calculation obj.push_back(detectedKeypoints[matches[i].queryIdx].pt); scene.push_back(readKeypoints[matches[i].trainIdx].pt); } } </code></pre> I'm showing easy parts of the code just to make this more easy to follow, I know some of it doesn't need to be here. Continuing, I was hoping that simply counting the number of good matches like this would be enough, but it turned out to mostly just point me to the image with the most descriptors. What I tried to after this was computing the homography. The aim was to compute it and see whether it's a valid homoraphy or not. The hope was that a good match, and only a good match, would have a homography that is a good transformation. Creating the homography was done simply using cv::findHomography on the obj and scene which are std::vector< cv::Point2f>. I checked the validity of the homography using some code I found online: <pre class="prettyprint"><code>bool niceHomography(cv::Mat H) { std::cout << H << std::endl; const double det = H.at<double>(0, 0) * H.at<double>(1, 1) - H.at<double>(1, 0) * H.at<double>(0, 1); if (det < 0) { std::cout << "Homography: bad determinant" << std::endl; return false; } const double N1 = sqrt(H.at<double>(0, 0) * H.at<double>(0, 0) + H.at<double>(1, 0) * H.at<double>(1, 0)); if (N1 > 4 || N1 < 0.1) { std::cout << "Homography: bad first column" << std::endl; return false; } const double N2 = sqrt(H.at<double>(0, 1) * H.at<double>(0, 1) + H.at<double>(1, 1) * H.at<double>(1, 1)); if (N2 > 4 || N2 < 0.1) { std::cout << "Homography: bad second column" << std::endl; return false; } const double N3 = sqrt(H.at<double>(2, 0) * H.at<double>(2, 0) + H.at<double>(2, 1) * H.at<double>(2, 1)); if (N3 > 0.002) { std::cout << "Homography: bad third row" << std::endl; return false; } return true; } </code></pre> I don't understand the math behind this so, while testing, I sometimes replaced this function with a simple check whether the determinant of the homography was positive. The problem is that I kept having issues here. The homographies were either all bad, or good when they shouldn't have been (when I was checking only the determinant). I figured I should actually use the homography and for a number of points just compute their position in the destination image using their position in the source image. Then I would compare these average distances, and I would ideally get a very obvious smaller average distance in the case of the correct image. This did not work at all. All the distances were colossal. I thought I might have used the homography the other way around to calculate the right position, but switching obj and scene with each other gave similar results. Other things I tried were SURF descriptors instead of SIFT, BFMatcher (brute force) instead of FLANN, getting the n smallest distances for every image instead of a number depending on the minimum distance, or getting distances depending on a global maximum distance. None of these approaches gave me definite good results, and I feel stuck now. My only next strategy would be to sharpen the images or even turn them to binary images using some local threshold or some algorithms used for segmentation. I am looking for any suggestions or mistake anyone can see in my work. I don't know whether this is relevant, but I added some of the images I am testing this on. Many times in the test images most of the SIFT vectors come from the frame (higher contrast) than the painting. This is why I'm thinking sharpening the images might work, but I don't want to go deeper in case something I did previously is wrong. The gallery of images is here with the descriptions in the titles. The images are of quite high resolution, please view in case it might give some hints.

You can try to test if when matching, the lines between the source image and the target image are relatively parallel. If it's not a correct match, then you'd have a lot of noise and the lines won't be parallel. See the attached image which shows a correct match (using SURF and BF) - all the lines are mostly parallel (though I should point out that this is an easy example). <img src="https://i.stack.imgur.com/PQdhc.jpg" alt="enter image description here">

Recognizing an image from a list with OpenCV SIFT using the FLANN matching

Tags:

c++

opencv

image-recognition

sift

flann

The point of the application is to recognize an image from an already set list of images. The list of images have had their SIFT descriptors extracted and saved in files. Nothing interesting here:

std::vector<cv::KeyPoint> detectedKeypoints;
cv::Mat objectDescriptors;

// Extract data
cv::SIFT sift;
sift.detect(image, detectedKeypoints);
sift.compute(image, detectedKeypoints, objectDescriptors);

// Save the file
cv::FileStorage fs(file, cv::FileStorage::WRITE);
fs << "descriptors" << objectDescriptors;
fs << "keypoints" << detectedKeypoints;
fs.release();

Then the device takes a picture. SIFT descriptors are extracted in the same way. The idea now was to compare the descriptors to the ones from the files. I am doing that using the FLANN matcher from OpenCV. I am trying to quantify the similarity, image by image. After going through the whole list I should have the best match.

const cv::Ptr<cv::flann::IndexParams>& indexParams = new cv::flann::KDTreeIndexParams(1);
const cv::Ptr<cv::flann::SearchParams>& searchParams = new cv::flann::SearchParams(64);

// Match using Flann
cv::Mat indexMat;
cv::FlannBasedMatcher matcher(indexParams, searchParams);
std::vector< cv::DMatch > matches;
matcher.match(objectDescriptors, readDescriptors, matches);

After matching I understand that I get a list of the closest found distances between the feature vectors. I find the minimum distance and, using it I can count "good matches" and even get a list of the respective points:

// Count the number of mathes where the distance is less than 2 * min_dist
int goodCount = 0;
for (int i = 0; i < objectDescriptors.rows; i++)
{
    if (matches[i].distance <  2 * min_dist)
    {
        ++goodCount;
        // Save the points for the homography calculation
        obj.push_back(detectedKeypoints[matches[i].queryIdx].pt);
        scene.push_back(readKeypoints[matches[i].trainIdx].pt);
    }
}

I'm showing easy parts of the code just to make this more easy to follow, I know some of it doesn't need to be here.

Continuing, I was hoping that simply counting the number of good matches like this would be enough, but it turned out to mostly just point me to the image with the most descriptors. What I tried to after this was computing the homography. The aim was to compute it and see whether it's a valid homoraphy or not. The hope was that a good match, and only a good match, would have a homography that is a good transformation. Creating the homography was done simply using cv::findHomography on the obj and scene which are std::vector< cv::Point2f>. I checked the validity of the homography using some code I found online:

bool niceHomography(cv::Mat H)
{
    std::cout << H << std::endl;

    const double det = H.at<double>(0, 0) * H.at<double>(1, 1) - H.at<double>(1, 0) * H.at<double>(0, 1);
    if (det < 0)
    {
        std::cout << "Homography: bad determinant" << std::endl;
        return false;
    }

    const double N1 = sqrt(H.at<double>(0, 0) * H.at<double>(0, 0) + H.at<double>(1, 0) * H.at<double>(1, 0));
    if (N1 > 4 || N1 < 0.1)
    {
        std::cout << "Homography: bad first column" << std::endl;
        return false;
    }

    const double N2 = sqrt(H.at<double>(0, 1) * H.at<double>(0, 1) + H.at<double>(1, 1) * H.at<double>(1, 1));
    if (N2 > 4 || N2 < 0.1)
    {
        std::cout << "Homography: bad second column" << std::endl;
        return false;
    }

    const double N3 = sqrt(H.at<double>(2, 0) * H.at<double>(2, 0) + H.at<double>(2, 1) * H.at<double>(2, 1));
    if (N3 > 0.002)
    {
        std::cout << "Homography: bad third row" << std::endl;
        return false;
    }

    return true;
}

I don't understand the math behind this so, while testing, I sometimes replaced this function with a simple check whether the determinant of the homography was positive. The problem is that I kept having issues here. The homographies were either all bad, or good when they shouldn't have been (when I was checking only the determinant).

I figured I should actually use the homography and for a number of points just compute their position in the destination image using their position in the source image. Then I would compare these average distances, and I would ideally get a very obvious smaller average distance in the case of the correct image. This did not work at all. All the distances were colossal. I thought I might have used the homography the other way around to calculate the right position, but switching obj and scene with each other gave similar results.

Other things I tried were SURF descriptors instead of SIFT, BFMatcher (brute force) instead of FLANN, getting the n smallest distances for every image instead of a number depending on the minimum distance, or getting distances depending on a global maximum distance. None of these approaches gave me definite good results, and I feel stuck now.

My only next strategy would be to sharpen the images or even turn them to binary images using some local threshold or some algorithms used for segmentation. I am looking for any suggestions or mistake anyone can see in my work.

I don't know whether this is relevant, but I added some of the images I am testing this on. Many times in the test images most of the SIFT vectors come from the frame (higher contrast) than the painting. This is why I'm thinking sharpening the images might work, but I don't want to go deeper in case something I did previously is wrong.

The gallery of images is here with the descriptions in the titles. The images are of quite high resolution, please view in case it might give some hints.

349

asked May 14 '14 12:05

octafbr

1 Answers

You can try to test if when matching, the lines between the source image and the target image are relatively parallel. If it's not a correct match, then you'd have a lot of noise and the lines won't be parallel.

See the attached image which shows a correct match (using SURF and BF) - all the lines are mostly parallel (though I should point out that this is an easy example).

enter image description here

196

answered Nov 05 '22 17:11

GilLevi

Related questions
                            
                                How to search in this tree?
                            
                                User defined qualifiers
                            
                                Will the new expression ever return a pointer to an array?
                            
                                Bug with robust mutex
                            
                                Why is not the new Visual Studio runtime part of Windows
                            
                                Could stack pop operation return the value safely in C++11
                            
                                Using Qt/C++ to call Java code through JNI. FindClass does not find class
                            
                                Get function parameters name visual studio
                            
                                Is there a way to generate a c++ class from a python class and bind it a compile time?
                            
                                QMake - how to add a post-build step after EVERY build
                            
                                Problems with the universal factory method and the variadic templates
                            
                                How to take screen shot in Linux without X11 or /dev/fb0?
                            
                                Boost Asio io_service destructor hangs on OS X
                            
                                Forward declaring enum in C++ defined in C
                            
                                Calling boost::asio::write() with a non-valid socket crashed my Blackberry 10 application
                            
                                What is the correct way to instantiate c++11 random facilities
                            
                                Delay-loading of opengl32.dll fails with Qt5
                            
                                Using quaternions for tangent space normal mapping - Problems I'm having
                            
                                C++11 array initialization with a non-copyable type with explicit constructor
                            
                                MSVC direct constructor call extension

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With