I’m currently working on building software which can match infrared and non-infrared images taken from a fixed point using a thermographic camera.
The use case is the following: a picture of a fixed point is taken from a tripod with both an infrared thermographic camera and a standard camera. Afterwards, the photographer wants to match the images from each camera. In some scenarios an image is taken with only one camera, because the other image type is unnecessary. Yes, it may be possible to match the images using timestamps, but the end-user demands they be matched using computer vision.
I've looked at other image matching posts on StackOverflow -- they have often focused on using histogram matching and feature detectors. Histogram matching is not an option here, as we cannot match colors between the two image types. As a result, I've developed an application which does feature detection. In addition to standard feature detection, I’ve also added some logic which says that two keypoints cannot be matching if they are not within a certain margin of each other (a keypoint on the far left of the query image cannot match a keypoint on the far right of the candidate image) -- this process occurs in stage 3 of the code below.
To give you an idea of the current output, here is a valid and invalid match produced -- note the thermographic image is on the left. My objective is to improve the accuracy of the matching process.
Valid match:
Invalid match:
Here is the code:
// for each candidate image specified on the command line, compare it against the query image
Mat img1 = imread(argv[1], CV_LOAD_IMAGE_GRAYSCALE); // loading query image
for(int candidateImage = 0; candidateImage < (argc - 2); candidateImage++) {
    Mat img2 = imread(argv[candidateImage + 2], CV_LOAD_IMAGE_GRAYSCALE); // loading candidate image
    if(img1.empty() || img2.empty())
    {
        printf("Can't read one of the images\n");
        return -1;
    }
    // detecting keypoints
    SiftFeatureDetector detector;
    vector<KeyPoint> keypoints1, keypoints2;
    detector.detect(img1, keypoints1);
    detector.detect(img2, keypoints2);
    // computing descriptors
    SiftDescriptorExtractor extractor;
    Mat descriptors1, descriptors2;
    extractor.compute(img1, keypoints1, descriptors1);
    extractor.compute(img2, keypoints2, descriptors2);
    // matching descriptors (Euclidean distance between SIFT descriptors)
    BFMatcher matcher(NORM_L2);
    vector< vector<DMatch> > matches_stage1;
    matcher.knnMatch(descriptors1, descriptors2, matches_stage1, 2);
    // use nndr to eliminate weak matches
    float nndrRatio = 0.80f;
    vector< DMatch > matches_stage2;
    for (size_t i = 0; i < matches_stage1.size(); ++i)
    {
        if (matches_stage1[i].size() < 2)
            continue;
        const DMatch &m1 = matches_stage1[i][0]; // best match
        const DMatch &m2 = matches_stage1[i][1]; // second-best match (k = 2, so index 1 is the last valid one)
        if(m1.distance <= nndrRatio * m2.distance)
            matches_stage2.push_back(m1);
    }
    // eliminate points which are too far away from each other
    vector<DMatch> matches_stage3;
    for(size_t i = 0; i < matches_stage2.size(); i++) {
        Point queryPt = keypoints1.at(matches_stage2.at(i).queryIdx).pt;
        Point trainPt = keypoints2.at(matches_stage2.at(i).trainIdx).pt;
        // determine the lowest number here
        int lowestXAxis;
        int greaterXAxis;
        if(queryPt.x <= trainPt.x) { lowestXAxis = queryPt.x; greaterXAxis = trainPt.x; }
        else { lowestXAxis = trainPt.x; greaterXAxis = queryPt.x; }
        int lowestYAxis;
        int greaterYAxis;
        if(queryPt.y <= trainPt.y) { lowestYAxis = queryPt.y; greaterYAxis = trainPt.y; }
        else { lowestYAxis = trainPt.y; greaterYAxis = queryPt.y; }
        // determine if these points are acceptable
        bool acceptable = true;
        if( (lowestXAxis + MARGIN) < greaterXAxis) { acceptable = false; }
        if( (lowestYAxis + MARGIN) < greaterYAxis) { acceptable = false; }
        if(acceptable == false) { continue; }
        // it's acceptable -- keep this match
        matches_stage3.push_back(matches_stage2.at(i));
    }
    // output how many individual matches were found for this candidate image
    cout << "good matches found for candidate image # " << (candidateImage+1) << " = " << matches_stage3.size() << endl;
}
I used this site's code as an example. The problem I’m having is that the feature detection is not reliable, and I seem to be missing the purpose of the NNDR ratio. I understand that I am finding K possible matches for each point within the query image and that I have K = 2. But I don’t understand the purpose of this part within the example code:
vector< DMatch > matches_stage2;
for (size_t i = 0; i < matches_stage1.size(); ++i)
{
    if (matches_stage1[i].size() < 2)
        continue;
    const DMatch &m1 = matches_stage1[i][0];
    const DMatch &m2 = matches_stage1[i][1];
    if(m1.distance <= nndrRatio * m2.distance)
        matches_stage2.push_back(m1);
}
Any ideas on how I can improve this further? Any advice would be appreciated as always.
First of all, let's talk about the part of the code that you don't understand. The idea is to keep only "strong matches". Your call to knnMatch finds, for each descriptor, the two best correspondences with respect to the Euclidean distance "L2"(*). This does not mean at all that these are good matches in reality, only that those feature points are quite similar.
Let me try to explain your validation now, considering only one feature point in image A (it generalizes to all of them):
- If the distances to the best and second-best correspondences are too close to each other (i.e. !(m1.distance <= nndrRatio * m2.distance)), then you cannot really discriminate between them and you don't consider the match.
This validation has some major weaknesses, as you have probably observed:
- If the two best correspondences returned by knnMatch are both terribly bad, then the best of those might be accepted anyway.
(*) EDIT: Using SIFT, you describe each feature point in your image using a floating-point vector. By computing the Euclidean distance between two vectors, you know how similar they are. If both vectors are exactly the same, then the distance is zero. The smaller the distance, the more similar the points. But this is not geometric: a point on the left-hand side of your image might look similar to a point on the right-hand side. So you first find the pairs of points that look similar (i.e. "This point in A looks similar to this point in B because the Euclidean distance between their feature vectors is small") and then you need to verify that this match is coherent (i.e. "It is possible that those similar points are actually the same because they both are on the left-hand side of my image" or "They look similar, but that is incoherent because I know that they must lie on the same side of the image and they don't").
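To make the distance and the ratio test concrete, here is a minimal sketch that does not depend on OpenCV. The names l2Distance and ratioTestMatch are hypothetical helpers invented for this illustration, and descriptors are plain float vectors rather than rows of a Mat; it only mirrors the logic of the loop in the question, not the BFMatcher implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Euclidean (L2) distance between two descriptors of equal length.
float l2Distance(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return std::sqrt(sum);
}

// Hypothetical helper: returns the index of the closest candidate if it
// passes the ratio test, or -1 if the match is ambiguous or there are
// fewer than two candidates to compare.
int ratioTestMatch(const std::vector<float>& query,
                   const std::vector<std::vector<float>>& candidates,
                   float nndrRatio) {
    if (candidates.size() < 2) return -1;
    int best = -1;
    float bestDist = std::numeric_limits<float>::infinity();
    float secondDist = std::numeric_limits<float>::infinity();
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        const float d = l2Distance(query, candidates[i]);
        if (d < bestDist) {
            secondDist = bestDist;   // old best becomes second best
            bestDist = d;
            best = static_cast<int>(i);
        } else if (d < secondDist) {
            secondDist = d;
        }
    }
    // Same condition as in the question: keep the match only if the best
    // distance is clearly smaller than the second-best distance.
    return (bestDist <= nndrRatio * secondDist) ? best : -1;
}
```

With nndrRatio = 0.8, a query whose two nearest candidates are at distances 1 and 10 is kept, while one whose nearest candidates are at distances 1 and 1.1 is rejected as ambiguous. Note that the test only compares the two distances with each other, which is why two equally poor candidates are not filtered out by it.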
What you do in your next stage (stage 3 in your code) is interesting since it considers the geometry: knowing that both images were taken from the same point (or almost the same point?), you eliminate matches that are not in the same region in both images.
The problem I see with this is that if both images weren't taken at the exact same position with the very same angle, then it won't work.
I would personally work on this geometric stage. Even though both images aren't necessarily exactly the same, they describe the same scene. And you can take advantage of its geometry.
The idea is that you should be able to find a transformation from the first image to the second one (i.e. the way in which a point moved from image A to image B is actually linked to the way all of the points moved). And in your situation, I would bet that a simple homography would be adequate.
Here is my proposition:
- Compute a homography between the two images from your matches using cv::findHomography (choose the RANSAC algorithm).
- findHomography has a mask output that will give you the inliers (i.e. the matches that were used to compute the homography transform).
The inliers will most probably be good matches since they will be geometrically coherent.
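To illustrate what an inlier mask means, here is a small self-contained sketch of the consensus idea. It is deliberately not cv::findHomography: it swaps the homography for a much simpler translation-only model, and it tries every match as a hypothesis instead of sampling randomly as RANSAC does. Point2, PointPair, largestConsensus and tolerance are hypothetical names made up for this example.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Point2 { float x, y; };
struct PointPair { Point2 a, b; };  // a: point in image A, b: its match in image B

// For every match, hypothesise "all points moved by this match's offset",
// mark the matches that agree with that offset within `tolerance` pixels,
// and return the mask of the hypothesis with the most agreeing matches.
std::vector<bool> largestConsensus(const std::vector<PointPair>& matches,
                                   float tolerance) {
    std::vector<bool> bestMask(matches.size(), false);
    std::size_t bestCount = 0;
    for (std::size_t h = 0; h < matches.size(); ++h) {
        const float dx = matches[h].b.x - matches[h].a.x;
        const float dy = matches[h].b.y - matches[h].a.y;
        std::vector<bool> mask(matches.size(), false);
        std::size_t count = 0;
        for (std::size_t i = 0; i < matches.size(); ++i) {
            // Distance between where the model predicts the point in B
            // and where the match actually places it.
            const float ex = matches[i].a.x + dx - matches[i].b.x;
            const float ey = matches[i].a.y + dy - matches[i].b.y;
            if (std::sqrt(ex * ex + ey * ey) <= tolerance) {
                mask[i] = true;
                ++count;
            }
        }
        if (count > bestCount) {
            bestCount = count;
            bestMask = mask;
        }
    }
    return bestMask;
}
```

Matches whose mask entry is false disagree with the dominant motion between the two images and can be dropped, which is the same role the mask output plays for findHomography. A homography additionally handles rotation, scale and perspective, so the translation model here is only a stand-in to show the mechanism.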
EDIT: I just found an example using findHomography here.