3d reconstruction from 2 images without info about the camera

I'm new to this field and I'm trying to model a simple scene in 3D out of 2D images, and I don't have any info about the cameras. I know that there are 3 options:

  • I have two images and I know the model of my camera (intrinsics), which I loaded from an XML file, for instance: loadXMLFromFile() => stereoRectify() => reprojectImageTo3D() (a minimal sketch of this path is shown after this list)

  • I don't have them but I can calibrate my camera => stereoCalibrate() => stereoRectify() => reprojectImageTo3D()

  • I can't calibrate the camera (this is my case, because I don't have the camera that took the two images). In that case I need to find pairs of keypoints in both images with SURF or SIFT for instance (I can use any blob detector actually), then compute the descriptors of these keypoints, then match the keypoints from the right image and the left image according to their descriptors, and then find the fundamental matrix from them. The processing is much harder and would be like this:

    1. detect keypoints (SURF, SIFT) =>
    2. extract descriptors (SURF, SIFT) =>
    3. compare and match descriptors (BruteForce, Flann based approaches) =>
    4. find fundamental mat (findFundamentalMat()) from these pairs =>
    5. stereoRectifyUncalibrated() =>
    6. reprojectImageTo3D()
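
For reference, here is a minimal sketch of the calibrated path from the first option, using the same OpenCV 2.x API as the code further below. The file name and the node names (M1, D1, M2, D2, R, T) are assumptions about how the calibration was stored; the point is that stereoRectify() is what produces the Q matrix needed later by reprojectImageTo3D():

// Minimal sketch of the calibrated path (option 1). The storage file and its
// node names (M1, D1, M2, D2, R, T) are assumptions, not something from my setup.
cv::FileStorage fs("stereo_calibration.xml", cv::FileStorage::READ);
cv::Mat M1, D1, M2, D2, R, T;   // camera matrices, distortion coefficients, rotation, translation
fs["M1"] >> M1; fs["D1"] >> D1;
fs["M2"] >> M2; fs["D2"] >> D2;
fs["R"]  >> R;  fs["T"]  >> T;

cv::Mat R1, R2, P1, P2, Q;
cv::stereoRectify(M1, D1, M2, D2, imgLeft.size(), R, T,   // imgLeft: one of the two input images
                  R1, R2, P1, P2, Q);                     // Q is the 4x4 reprojection matrix

// ... rectify both images, compute a disparity map, then:
// cv::reprojectImageTo3D(disparity, xyz, Q, true);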

I'm using the last approach and my questions are:

1) Is this approach right?

2) If it is OK, I have a doubt about the last step, stereoRectifyUncalibrated() => reprojectImageTo3D(). The signature of the reprojectImageTo3D() function is:

void reprojectImageTo3D(InputArray disparity, OutputArray _3dImage, InputArray Q, bool handleMissingValues=false, int ddepth=-1 )

and in my code I call it as:

cv::reprojectImageTo3D(imgDisparity8U, xyz, Q, true);

Parameters:

  • disparity – Input single-channel 8-bit unsigned, 16-bit signed, 32-bit signed or 32-bit floating-point disparity image.
  • _3dImage – Output 3-channel floating-point image of the same size as disparity. Each element of _3dImage(x,y) contains 3D coordinates of the point (x,y) computed from the disparity map.
  • Q – 4x4 perspective transformation matrix that can be obtained with stereoRectify().
  • handleMissingValues – Indicates, whether the function should handle missing values (i.e. points where the disparity was not computed). If handleMissingValues=true, then pixels with the minimal disparity that corresponds to the outliers (see StereoBM::operator()) are transformed to 3D points with a very large Z value (currently set to 10000).
  • ddepth – The optional output array depth. If it is -1, the output image will have CV_32F depth. ddepth can also be set to CV_16S, CV_32S or CV_32F.
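
For illustration, a minimal usage sketch, assuming Q is already available and that the disparity map comes from StereoBM as CV_16S (StereoBM stores fixed-point disparities scaled by 16, so it is better to convert them to float and divide by 16 than to pass the 8-bit image that was normalized for display):

// Hypothetical usage; Q is assumed to be known, imgDisparity16S comes from the StereoBM call below.
cv::Mat disparity32F, xyz;
imgDisparity16S.convertTo(disparity32F, CV_32F, 1.0 / 16.0); // undo StereoBM's x16 fixed-point scaling
cv::reprojectImageTo3D(disparity32F, xyz, Q, true);          // xyz: CV_32FC3, one 3D point per pixel
int x = 100, y = 100;                                        // some pixel of interest
cv::Vec3f p = xyz.at<cv::Vec3f>(y, x);                       // 3D coordinates of pixel (x, y)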

How can I get the Q matrix? Is it possible to obtain the Q matrix with F, H1 and H2, or in another way?

3) Is there another way to obtain the xyz coordinates without calibrating the cameras?

My code is:

#include <opencv2/core/core.hpp>
#include <opencv2/calib3d/calib3d.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/contrib/contrib.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <stdio.h>
#include <iostream>
#include <vector>
#include <conio.h>
#include <opencv/cv.h>
#include <opencv/cxcore.h>
#include <opencv/cvaux.h>

using namespace cv;
using namespace std;

int main(int argc, char *argv[]){

    // Read the images
    Mat imgLeft = imread( argv[1], CV_LOAD_IMAGE_GRAYSCALE );
    Mat imgRight = imread( argv[2], CV_LOAD_IMAGE_GRAYSCALE );

    // check
    if (!imgLeft.data || !imgRight.data)
        return 0;

    // 1] find pair keypoints on both images (SURF, SIFT) :::::::::::::::::::::::::::::

    // vectors of keypoints
    std::vector<cv::KeyPoint> keypointsLeft;
    std::vector<cv::KeyPoint> keypointsRight;

    // Construct the SIFT feature detector object
    cv::SiftFeatureDetector sift(
            0.01, // feature threshold
            10);  // threshold to reduce sensitivity to lines

    // Detection of the SIFT features
    sift.detect(imgLeft,keypointsLeft);
    sift.detect(imgRight,keypointsRight);

    std::cout << "Number of SIFT points (1): " << keypointsLeft.size() << std::endl;
    std::cout << "Number of SIFT points (2): " << keypointsRight.size() << std::endl;

    // 2] compute descriptors of these keypoints (SURF, SIFT) ::::::::::::::::::::::::::

    // Construction of the SURF descriptor extractor
    cv::SurfDescriptorExtractor surfDesc;

    // Extraction of the SURF descriptors
    cv::Mat descriptorsLeft, descriptorsRight;
    surfDesc.compute(imgLeft,keypointsLeft,descriptorsLeft);
    surfDesc.compute(imgRight,keypointsRight,descriptorsRight);

    std::cout << "descriptor matrix size: " << descriptorsLeft.rows << " by " << descriptorsLeft.cols << std::endl;

    // 3] match keypoints from the right and left images according to their descriptors (BruteForce, Flann based approaches)

    // Construction of the matcher
    cv::BruteForceMatcher<cv::L2<float> > matcher;

    // Match the two image descriptors
    std::vector<cv::DMatch> matches;
    matcher.match(descriptorsLeft,descriptorsRight, matches);

    std::cout << "Number of matched points: " << matches.size() << std::endl;

    // 4] find the fundamental matrix ::::::::::::::::::::::::::::::::::::::::::::::::::

    // Convert 1 vector of keypoints into
    // 2 vectors of Point2f for cv::findFundamentalMat()
    std::vector<int> pointIndexesLeft;
    std::vector<int> pointIndexesRight;
    for (std::vector<cv::DMatch>::const_iterator it= matches.begin(); it!= matches.end(); ++it) {
         // Get the indexes of the selected matched keypoints
         pointIndexesLeft.push_back(it->queryIdx);
         pointIndexesRight.push_back(it->trainIdx);
    }

    // Convert keypoints into Point2f
    std::vector<cv::Point2f> selPointsLeft, selPointsRight;
    cv::KeyPoint::convert(keypointsLeft,selPointsLeft,pointIndexesLeft);
    cv::KeyPoint::convert(keypointsRight,selPointsRight,pointIndexesRight);

    /* check by drawing the points
    std::vector<cv::Point2f>::const_iterator it= selPointsLeft.begin();
    while (it!=selPointsLeft.end()) {
            // draw a circle at each corner location
            cv::circle(imgLeft,*it,3,cv::Scalar(255,255,255),2);
            ++it;
    }

    it= selPointsRight.begin();
    while (it!=selPointsRight.end()) {
            // draw a circle at each corner location
            cv::circle(imgRight,*it,3,cv::Scalar(255,255,255),2);
            ++it;
    } */

    // Compute F matrix from n>=8 matches
    cv::Mat fundemental= cv::findFundamentalMat(
            cv::Mat(selPointsLeft),  // points in first image
            cv::Mat(selPointsRight), // points in second image
            CV_FM_RANSAC);           // RANSAC method

    std::cout << "F-Matrix size= " << fundemental.rows << "," << fundemental.cols << std::endl;

    /* draw the left points' corresponding epipolar lines in the right image
    std::vector<cv::Vec3f> linesLeft;
    cv::computeCorrespondEpilines(
            cv::Mat(selPointsLeft), // image points
            1,                      // in image 1 (can also be 2)
            fundemental,            // F matrix
            linesLeft);             // vector of epipolar lines

    // for all epipolar lines
    for (vector<cv::Vec3f>::const_iterator it= linesLeft.begin(); it!=linesLeft.end(); ++it) {
        // draw the epipolar line between first and last column
        cv::line(imgRight,cv::Point(0,-(*it)[2]/(*it)[1]),cv::Point(imgRight.cols,-((*it)[2]+(*it)[0]*imgRight.cols)/(*it)[1]),cv::Scalar(255,255,255));
    }

    // draw the right points' corresponding epipolar lines in the left image
    std::vector<cv::Vec3f> linesRight;
    cv::computeCorrespondEpilines(cv::Mat(selPointsRight),2,fundemental,linesRight);
    for (vector<cv::Vec3f>::const_iterator it= linesRight.begin(); it!=linesRight.end(); ++it) {
        // draw the epipolar line between first and last column
        cv::line(imgLeft,cv::Point(0,-(*it)[2]/(*it)[1]), cv::Point(imgLeft.cols,-((*it)[2]+(*it)[0]*imgLeft.cols)/(*it)[1]), cv::Scalar(255,255,255));
    }

    // Display the images with points and epipolar lines
    cv::namedWindow("Right Image Epilines");
    cv::imshow("Right Image Epilines",imgRight);
    cv::namedWindow("Left Image Epilines");
    cv::imshow("Left Image Epilines",imgLeft);
    */

    // 5] stereoRectifyUncalibrated() ::::::::::::::::::::::::::::::::::::::::::::::::::

    // H1, H2 – the output rectification homography matrices for the first and second images
    cv::Mat H1(4,4, imgRight.type());
    cv::Mat H2(4,4, imgRight.type());
    cv::stereoRectifyUncalibrated(selPointsRight, selPointsLeft, fundemental, imgRight.size(), H1, H2);

    // create the images in which we will save our disparities
    Mat imgDisparity16S = Mat( imgLeft.rows, imgLeft.cols, CV_16S );
    Mat imgDisparity8U = Mat( imgLeft.rows, imgLeft.cols, CV_8UC1 );

    // Call the constructor for StereoBM
    int ndisparities = 16*5;  // Range of disparity
    int SADWindowSize = 5;    // Size of the block window; must be odd. It is the size of the
                              // averaging window used to match pixel blocks (larger values mean
                              // better robustness to noise, but yield blurry disparity maps)

    StereoBM sbm( StereoBM::BASIC_PRESET,
        ndisparities,
        SADWindowSize );

    // Calculate the disparity image
    sbm( imgLeft, imgRight, imgDisparity16S, CV_16S );

    // Check its extreme values
    double minVal; double maxVal;

    minMaxLoc( imgDisparity16S, &minVal, &maxVal );

    printf("Min disp: %f Max value: %f \n", minVal, maxVal);

    // Display it as a CV_8UC1 image
    imgDisparity16S.convertTo( imgDisparity8U, CV_8UC1, 255/(maxVal - minVal));

    namedWindow( "windowDisparity", CV_WINDOW_NORMAL );
    imshow( "windowDisparity", imgDisparity8U );

    // 6] reprojectImageTo3D() :::::::::::::::::::::::::::::::::::::::::::::::::::::::::

    //Mat xyz;
    //cv::reprojectImageTo3D(imgDisparity8U, xyz, Q, true);

    //How can I get the Q matrix? Is it possible to obtain the Q matrix with
    //F, H1 and H2 or in another way?
    //Is there another way to obtain the xyz coordinates?

    cv::waitKey();
    return 0;
}
Fobi asked Jan 26 '12



2 Answers

stereoRectifyUncalibrated() simply computes a planar perspective transformation, not a rectification transformation in object space. It would be necessary to convert this planar transformation into an object-space transformation to extract the Q matrix, and I think some of the camera calibration parameters are required for that (like the camera intrinsics). There may be some ongoing research topics on this subject.

You may have to add some steps for estimating the camera intrinsics and extracting the relative orientation of the cameras to make your pipeline work correctly. I think the camera calibration parameters are vital for extracting a proper 3D structure of the scene if no active lighting method is used.

Also, bundle block adjustment based solutions are needed to refine all the estimated values to more accurate values.
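
To illustrate the structure of Q (it has the layout documented for stereoRectify()), here is a rough sketch that simply fills it with guessed values. Every number below is an assumption rather than a calibrated quantity, so the reconstruction is only correct up to the guessed intrinsics and an unknown scale:

// Rough sketch only: an approximate Q built from guessed intrinsics (all values are assumptions).
double f  = 0.8 * imgLeft.cols;   // guessed focal length in pixels
double cx = imgLeft.cols / 2.0;   // principal point assumed at the image centre
double cy = imgLeft.rows / 2.0;
double Tx = 1.0;                  // unknown baseline, set to 1 => depth only up to scale

cv::Mat Q = (cv::Mat_<double>(4, 4) <<
    1, 0, 0,        -cx,
    0, 1, 0,        -cy,
    0, 0, 0,          f,
    0, 0, -1.0 / Tx,   0);        // last entry, (cx - cx')/Tx, assumed to be 0

cv::Mat disparity32F, xyz;
imgDisparity16S.convertTo(disparity32F, CV_32F, 1.0 / 16.0); // StereoBM's CV_16S disparities are scaled by 16
cv::reprojectImageTo3D(disparity32F, xyz, Q, true);          // 3D points, up to the unknown scale

If the images carry EXIF data, a better focal length guess can be derived from it, and a proper estimate of the relative pose (e.g. from the essential matrix once the intrinsics are known) would make the result far less arbitrary.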

AGP answered Oct 02 '22


  1. The procedure looks OK to me.

  2. As far as I know, regarding image-based 3D modelling, cameras are either explicitly calibrated or implicitly calibrated. You don't want to explicitly calibrate the camera, but you will make use of those parameters anyway. Matching corresponding point pairs is definitely a heavily used approach.

zinking answered Oct 02 '22