
Matching thermographic / non-thermographic images with OpenCV feature detectors

I’m currently working on software that matches infrared and non-infrared images of the same scene, taken from a fixed point (the infrared images come from a thermographic camera).

The use case is the following: pictures are taken from a tripod at a fixed point with both an infrared thermographic camera and a standard camera. Afterwards, the photographer wants to match the images from each camera. In some scenarios an image is taken with only one camera because the other image type is unnecessary. Yes, the images could probably be matched using timestamps, but the end user demands they be matched using computer vision.

I've looked at other image matching posts on Stack Overflow -- they often focus on histogram matching and feature detectors. Histogram matching is not an option here, as we cannot match colors between the two image types. As a result, I've developed an application that does feature detection. On top of the standard feature matching, I've added logic which says that two keypoints cannot be a match unless they are within a certain margin of each other (a keypoint on the far left of the query image cannot match a keypoint on the far right of the candidate image) -- this happens in stage 3 of the code below.

To give you an idea of the current output, here are a valid and an invalid match that it produced -- note that the thermographic image is on the left. My objective is to improve the accuracy of the matching process.

Valid match: (image)

Invalid match: (image)

Here is the code:

    // for each candidate image specified on the command line, compare it against the query image
        Mat img1 = imread(argv[1], CV_LOAD_IMAGE_GRAYSCALE); // loading query image
        for(int candidateImage = 0; candidateImage < (argc - 2); candidateImage++) {
            Mat img2 = imread(argv[candidateImage + 2], CV_LOAD_IMAGE_GRAYSCALE); // loading candidate image
            if(img1.empty() || img2.empty())
            {
                printf("Can't read one of the images\n");
                return -1;
            }

            // detecting keypoints
            SiftFeatureDetector detector;
            vector<KeyPoint> keypoints1, keypoints2;
            detector.detect(img1, keypoints1);
            detector.detect(img2, keypoints2);

            // computing descriptors
            SiftDescriptorExtractor extractor;
            Mat descriptors1, descriptors2;
            extractor.compute(img1, keypoints1, descriptors1);
            extractor.compute(img2, keypoints2, descriptors2);

            // matching descriptors
            BFMatcher matcher(NORM_L1);
            vector< vector<DMatch> > matches_stage1;
            matcher.knnMatch(descriptors1, descriptors2, matches_stage1, 2);

            // use nndr to eliminate weak matches
            float nndrRatio = 0.80f;
            vector< DMatch > matches_stage2;
            for (size_t i = 0; i < matches_stage1.size(); ++i)
            {
                if (matches_stage1[i].size() < 2)
                    continue;
                const DMatch &m1 = matches_stage1[i][0];
                const DMatch &m2 = matches_stage1[i][1];
                if(m1.distance <= nndrRatio * m2.distance)
                    matches_stage2.push_back(m1);
            }

            // eliminate points which are too far away from each other
            vector<DMatch> matches_stage3;
            for(size_t i = 0; i < matches_stage2.size(); i++) {
                Point queryPt = keypoints1.at(matches_stage2.at(i).queryIdx).pt;
                Point trainPt = keypoints2.at(matches_stage2.at(i).trainIdx).pt;

                // determine the lowest number here
                int lowestXAxis;
                int greaterXAxis;
                if(queryPt.x <= trainPt.x) { lowestXAxis = queryPt.x; greaterXAxis = trainPt.x; }
                else { lowestXAxis = trainPt.x; greaterXAxis = queryPt.x; }

                int lowestYAxis;
                int greaterYAxis;
                if(queryPt.y <= trainPt.y) { lowestYAxis = queryPt.y; greaterYAxis = trainPt.y; }
                else { lowestYAxis = trainPt.y; greaterYAxis = queryPt.y; }

                // determine if these points are acceptable
                // (MARGIN is a pixel tolerance constant defined elsewhere in the program)
                bool acceptable = true;
                if( (lowestXAxis + MARGIN) < greaterXAxis) { acceptable = false; }
                if( (lowestYAxis + MARGIN) < greaterYAxis) { acceptable = false; }
                if(acceptable == false) { continue; }

                //// it's acceptable -- provide details, perform input
                matches_stage3.push_back(matches_stage2.at(i));
            }

            // output how many individual matches were found for this training image
            cout << "good matches found for candidate image # " << (candidateImage+1) << " = " << matches_stage3.size() << endl;

I used this site's code as an example. The problem I’m having is that the feature detection is not reliable, and I seem to be missing the purpose of the NNDR (nearest neighbour distance ratio) test. I understand that I am finding K possible matches for each point within the query image and that I have K = 2. But I don’t understand the purpose of this part of the example code:

    vector< DMatch > matches_stage2;
    for (size_t i = 0; i < matches_stage1.size(); ++i)
    {
        if (matches_stage1[i].size() < 2)
            continue;
        const DMatch &m1 = matches_stage1[i][0];
        const DMatch &m2 = matches_stage1[i][1];
        if(m1.distance <= nndrRatio * m2.distance)
            matches_stage2.push_back(m1);
    }

Any ideas on how I can improve this further? Any advice would be appreciated as always.

asked May 16 '13 by BSchlinker


1 Answer

The validation you currently use

First stage

First of all, let's talk about the part of the code that you don't understand. The idea is to keep only "strong matches". Actually, your call to knnMatch finds, for each descriptor, the two best correspondences with respect to the matcher's distance (*) (note that you construct your BFMatcher with NORM_L1, although the Euclidean norm L2 is the usual choice for SIFT descriptors). This does not mean at all that these are good matches in reality, but only that those feature points are quite similar.

Let me try to explain your validation now, considering only one feature point in image A (it generalizes to all of them):

  • You match the descriptor of this point against image B
  • You get the two best correspondences with respect to the Euclidean distance (i.e. you get the two most similar points in image B)
  • If the distance from your point to the best correspondence is much smaller than the one from your point to the second-best correspondence, then you assume that it is a good match. In other words, there was only one point in image B that was really similar (i.e. had a small Euclidean distance) to the point in image A.
  • If both matches are too similar (i.e. !(m1.distance <= nndrRatio * m2.distance)), then you cannot really discriminate between them and you don't consider the match.
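To make the ratio test concrete, here is a tiny standalone illustration with made-up distances (the numbers are not from your data, they only show how the comparison behaves):

    // Hypothetical distances for one query keypoint (illustration only)
    float best = 100.0f;       // distance to the most similar descriptor in B
    float secondBest = 300.0f; // distance to the second most similar descriptor in B
    float nndrRatio = 0.80f;

    if (best <= nndrRatio * secondBest)
    {
        // 100 <= 0.8 * 300, so the best candidate is clearly better than the
        // runner-up and the match is kept. With best = 100 and secondBest = 110,
        // the test would fail because the two candidates are too close to tell
        // apart. Note that the test says nothing about how small "best" is in
        // absolute terms.
    }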

This validation has some major weaknesses, as you have probably observed:

  • Firstly, if the best matches you get from knnMatch are both terribly bad, then the best of those might be accepted anyway.
  • It does not take the geometry into account. Therefore, a point far on the left of the image might be similar to a point far on the right, even though in reality they clearly don't match.

* EDIT: Using SIFT, you describe each feature point in your image using a floating-point vector. By computing the Euclidean distance between two vectors, you know how similar they are. If both vectors are exactly the same, then the distance is zero. The smaller the distance, the more similar the points. But this is not geometric: a point on the left-hand side of your image might look similar to a point on the right-hand side. So you first find the pairs of points that look similar (i.e. "This point in A looks similar to this point in B because the Euclidean distance between their feature vectors is small") and then you need to verify that this match is coherent (i.e. "It is possible that those similar points are actually the same because they are both on the left-hand side of my image" or "They look similar, but that is incoherent because I know that they must lie on the same side of the image and they don't").
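To make that concrete, this is how you could compute such a distance yourself between two descriptor rows (a minimal sketch; the indices i and j are arbitrary, and descriptors1/descriptors2 are the matrices from your code):

    int i = 0, j = 0; // arbitrary row indices, purely for illustration
    // Each row of the descriptor matrices is the 128-dimensional SIFT vector of
    // one keypoint. The smaller this value, the more similar the two keypoints
    // look -- but it says nothing about where they are located in the image.
    double dist = norm(descriptors1.row(i), descriptors2.row(j), NORM_L2);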

Second stage

What you do in your second stage is interesting since it considers the geometry: knowing that both images were taken from the same point (or almost the same point?), you eliminate matches that are not in the same region in both images.

The problem I see with this is that it won't work unless both images were taken from the exact same position and at the very same angle.

Proposition to improve your validation further

I would personally work on the second stage. Even though both images aren't necessarily exactly the same, they describe the same scene. And you can take advantage of the geometry of it.

The idea is that you should be able to find a transformation from the first image to the second one (i.e. the way a point moves from image A to image B is actually linked to the way all of the points move). And in your situation, I would bet that a simple homography would be well suited.

Here is my proposition:

  • Compute the matches using knnMatch, and keep stage 1 (you might want to try removing it later and observe the consequences)
  • Compute the best possible homography transform between those matches using cv::findHomography (choose the RANSAC algorithm).
  • findHomography has a mask output that will give you the inliers (i.e. the matches that were used to compute the homography transform).

The inliers will most probably be good matches, since they will be geometrically coherent (see the sketch below).
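Here is a minimal sketch of that idea, reusing the variable names from your code (the reprojection threshold of 3.0 pixels, and the fact that this replaces your margin-based stage 3, are my assumptions to adapt):

    // Geometric validation with RANSAC, as a replacement for the margin check.
    // keypoints1/keypoints2 and matches_stage2 come from the question's code.
    vector<Point2f> queryPts, trainPts;
    for (size_t i = 0; i < matches_stage2.size(); i++)
    {
        queryPts.push_back(keypoints1[matches_stage2[i].queryIdx].pt);
        trainPts.push_back(keypoints2[matches_stage2[i].trainIdx].pt);
    }

    vector<DMatch> matches_stage3;
    if (queryPts.size() >= 4) // findHomography needs at least 4 point pairs
    {
        vector<uchar> inlierMask;
        Mat H = findHomography(queryPts, trainPts, CV_RANSAC, 3.0, inlierMask);
        for (size_t i = 0; i < inlierMask.size(); i++)
        {
            if (inlierMask[i]) // keep only the matches consistent with H
                matches_stage3.push_back(matches_stage2[i]);
        }
    }

Once you have H, you can also feed points of image A to perspectiveTransform to see where they should land in image B, which is a handy sanity check, and drawMatches lets you inspect the surviving inliers visually.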

EDIT: I just found an example using findHomography here.

answered Oct 06 '22 by JonasVautherin