How to calibrate camera focal length, translation and rotation given four points?

I'm trying to find the focal length, position and orientation of a camera in world space.

Because I need this to be resolution-independent, I normalized my image coordinates to be in the range [-1, 1] for x, and a somewhat smaller range for y (depending on aspect ratio). So (0, 0) is the center of the image. I've already corrected for lens distortion (using k1 and k2 coefficients), so this does not enter the picture, except sometimes throwing x or y slightly out of the [-1, 1] range.
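
For illustration, this is roughly how the normalization works (a minimal sketch; the helper name is made up, and the exact convention is the one described above, not anything OpenCV prescribes):

#include <opencv2/core.hpp>

// Shift the origin to the image center and divide both axes by half the
// image width, so x spans [-1, 1] and y spans a smaller range for
// landscape images.
cv::Point2f normalizePixel(const cv::Point2f& px, const cv::Size& imageSize)
{
    const float halfW = 0.5f * imageSize.width;
    return cv::Point2f((px.x - 0.5f * imageSize.width)  / halfW,
                       (px.y - 0.5f * imageSize.height) / halfW);
}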

As a given, I have a planar, fixed rectangle in world space of known dimensions (in millimeters). The four corners of the rectangle are guaranteed to be visible, and are manually marked in the image. For example:

// Corners of the rectangle in world space, in millimeters:
std::vector<cv::Point3f> worldPoints = {
    cv::Point3f(0, 0, 0),
    cv::Point3f(2000, 0, 0),
    cv::Point3f(0, 3000, 0),
    cv::Point3f(2000, 3000, 0),
};
// The same corners, manually marked in normalized image coordinates:
std::vector<cv::Point2f> imagePoints = {
    cv::Point2f(-0.958707, -0.219624),
    cv::Point2f(-1.22234, 0.577061),
    cv::Point2f(0.0837469, -0.1783),
    cv::Point2f(0.205473, 0.428184),
};

Effectively, the equation I think I'm trying to solve is (see the equivalent in the OpenCV documentation):

  / xi \     / fx  0  0 \ /          tx \ / Xi \
s | yi |  =  |  0 fy  0 | |   Rxyz   ty | | Yi |
  \ 1  /     \  0  0  1 / \          tz / | Zi |
                                          \ 1  /

where:

  • i is 1, 2, 3, 4
  • xi, yi is the location of point i in the image (between -1 and 1)
  • fx, fy are the focal lengths of the camera in the x and y directions
  • Rxyz is the 3x3 rotation matrix of the camera (has only 3 degrees of freedom)
  • tx, ty, tz is the translation of the camera
  • Xi, Yi, Zi is the location of point i in world space (millimeters)
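
To make the model concrete, here is a sketch of projecting a single world point under this equation (an illustrative helper, with the principal point taken as (0, 0) to match the normalized coordinates):

#include <opencv2/core.hpp>

// Project one world point (mm) into normalized image coordinates:
// s * [xi, yi, 1]^T = K [R | t] [Xi, Yi, Zi, 1]^T with cx = cy = 0.
cv::Point2f project(const cv::Point3f& Xw, double fx, double fy,
                    const cv::Matx33d& R, const cv::Vec3d& t)
{
    const cv::Vec3d Xc = R * cv::Vec3d(Xw.x, Xw.y, Xw.z) + t; // camera frame
    return cv::Point2f(float(fx * Xc[0] / Xc[2]),             // perspective divide
                       float(fy * Xc[1] / Xc[2]));
}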

So I have 8 equations (4 points of 2 coordinates each), and I have 8 unknowns (fx, fy, Rxyz, tx, ty, tz). Therefore, I conclude (barring pathological cases) that a unique solution must exist.

However, I can't seem to figure out how to compute this solution using OpenCV.

I have looked at the imgproc module:

  • getPerspectiveTransform works, but gives me a 3x3 matrix only (from 2D points to 2D points). If I could somehow extract the needed parameters from this matrix, that would be great.

I have also looked at the calib3d module, which contains a few promising functions that do almost, but not quite, what I need:

  • initCameraMatrix2D sounds almost perfect, but when I pass it my four points like this:

    cv::Mat cameraMatrix = cv::initCameraMatrix2D(
                std::vector<std::vector<cv::Point3f>>({worldPoints}),
                std::vector<std::vector<cv::Point2f>>({imagePoints}),
                cv::Size2f(2, 2), -1);
    

    it returns a camera matrix with fx, fy set to -inf, inf.

  • calibrateCamera seems to use a complicated solver to deal with overdetermined systems and outliers. I tried it anyway, but all I can get from it are assertion failures like this:

    OpenCV(3.4.1) Error: Assertion failed (0 <= i && i < (int)vv.size()) in getMat_, file /build/opencv/src/opencv-3.4.1/modules/core/src/matrix_wrap.cpp, line 79
    

Is there a way to entice OpenCV to do what I need? And if not, how could I do it by hand?

asked Jun 01 '18 by Thomas


1 Answer

3x3 rotation matrices have 9 elements but, as you said, only 3 degrees of freedom. One subtlety is that exploiting this property makes the equation non-linear in the angles you want to estimate, and non-linear equations are harder to solve than linear ones.

Equations of this kind are usually solved by:

  1. treating the 12 entries of the P = K.[R | t] matrix as free unknowns and solving the resulting linear equation with an SVD (see section 7.1 of 'Multiple View Geometry' by Hartley & Zisserman for more details); a sketch of steps 1 and 2 is given after this list

  2. decomposing this intermediate result into an initial approximate solution to your non-linear equation (see for example cv::decomposeProjectionMatrix)

  3. refining the approximate solution using an iterative solver that can handle non-linear equations and the reduced degrees of freedom of the rotation matrix (e.g. the Levenberg-Marquardt algorithm). I am not sure whether there is a generic implementation of this in OpenCV; however, it is not too complicated to implement one yourself using the Ceres Solver library.
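
To make steps 1 and 2 concrete, here is a rough sketch (a hand-rolled helper, not an OpenCV API) of the DLT followed by cv::decomposeProjectionMatrix. Note that it needs at least 6 matches, which is exactly the limitation discussed below:

#include <opencv2/calib3d.hpp>
#include <opencv2/core.hpp>
#include <vector>

// Estimate the 3x4 projection matrix P with a DLT, then decompose it
// into intrinsics K, rotation R and translation t.
void estimateProjection(const std::vector<cv::Point3f>& world,
                        const std::vector<cv::Point2f>& image,
                        cv::Mat& K, cv::Mat& R, cv::Mat& t)
{
    CV_Assert(world.size() == image.size() && world.size() >= 6);

    // Step 1: stack two rows per match into the 2n x 12 system A.p = 0
    // (Hartley & Zisserman, section 7.1).
    cv::Mat A = cv::Mat::zeros(2 * (int)world.size(), 12, CV_64F);
    for (int i = 0; i < (int)world.size(); i++) {
        const double X = world[i].x, Y = world[i].y, Z = world[i].z;
        const double u = image[i].x, v = image[i].y;
        double* r0 = A.ptr<double>(2 * i);
        double* r1 = A.ptr<double>(2 * i + 1);
        r0[0] = X;      r0[1] = Y;      r0[2]  = Z;      r0[3]  = 1;
        r0[8] = -u * X; r0[9] = -u * Y; r0[10] = -u * Z; r0[11] = -u;
        r1[4] = X;      r1[5] = Y;      r1[6]  = Z;      r1[7]  = 1;
        r1[8] = -v * X; r1[9] = -v * Y; r1[10] = -v * Z; r1[11] = -v;
    }

    // The unit null vector of A (in the least-squares sense) holds the
    // 12 entries of P.
    cv::Mat p;
    cv::SVD::solveZ(A, p);
    cv::Mat P = p.reshape(1, 3); // 3x4 projection matrix

    // Step 2: split P into intrinsics K, rotation R and camera centre C.
    cv::Mat C; // homogeneous camera centre, 4x1
    cv::decomposeProjectionMatrix(P, K, R, C);
    K /= K.at<double>(2, 2);                // normalize so K(2,2) = 1
    C = C.rowRange(0, 3) / C.at<double>(3); // de-homogenize the centre
    t = -R * C;                             // translation in the camera frame

    // Step 3 (not shown): use K, R, t to initialize a non-linear
    // refinement, e.g. with the Ceres solver.
}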

However, your case is a bit particular because you do not have enough point matches to solve the linear formulation (i.e. step 1) reliably. This means that, with the problem as you have stated it, you have no way to initialize an iterative refining algorithm, and therefore no way to get an accurate solution to your problem.

Here are a few work-arounds that you can try:

  • somehow get 2 additional point matches, for a total of 6 matches and hence 12 constraints on your linear equation, allowing you to solve the problem using steps 1, 2 and 3 above.

  • somehow guess manually an initial estimate for your 8 parameters (2 focal lengths, 3 angles & 3 translations), and directly refine them using an iterative solver. Be aware that the iterative process might converge to a wrong solution if your initial estimate is too far off.

  • reduce the number of unknowns in your model. For instance, if you manage to fix two of the three angles (e.g. roll & pitch) the equations might simplify a lot. Also, the two focal lengths are probably related via the aspect ratio, so if you know it and if your pixels are square, then you actually have a single unknown there.

  • if all else fails, there might be a way to extract approximate values from the rectifying homography estimated by cv::getPerspectiveTransform.


Regarding the last bullet point, the opposite of what you want is clearly possible. Indeed, the rectifying homography can be expressed analytically knowing the parameters you want to estimate. See for instance this post and this post. There is also a full chapter on this in the Hartley & Zisserman book (chapter 13).

In your case, you want to go the other way around, i.e. to extract the intrinsic & extrinsic parameters from the homography. There is a somewhat related function in OpenCV (cv::decomposeHomographyMat), but it assumes the K matrix is known and it outputs 4 candidate solutions.
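
For illustration, here is a sketch of that decomposition under a guessed K (the function and parameter names are made up; the translations are recovered only up to the scale of the plane):

#include <opencv2/calib3d.hpp>
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <vector>

// Estimate the plane-to-image homography from the 4 matches and decompose
// it under a guessed focal length. Returns the number of candidate
// (R, t) pairs (up to 4).
int decomposeWithGuessedK(const std::vector<cv::Point2f>& planePoints,
                          const std::vector<cv::Point2f>& imagePoints,
                          double fGuess,
                          std::vector<cv::Mat>& Rs, std::vector<cv::Mat>& ts)
{
    // Exactly 4 correspondences between the Z = 0 plane and the image.
    cv::Mat H = cv::getPerspectiveTransform(planePoints, imagePoints);
    cv::Mat K = (cv::Mat_<double>(3, 3) << fGuess, 0, 0,
                                           0, fGuess, 0,
                                           0,      0, 1);
    std::vector<cv::Mat> normals; // candidate plane normals, unused here
    // Afterwards, use the point correspondences (cheirality, reprojection
    // error) to pick the physically valid candidate among those returned.
    return cv::decomposeHomographyMat(H, K, Rs, ts, normals);
}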

In the general case, this would be tricky. But maybe in your case you can guess a reasonable estimate for the focal length, hence for K, and use the point correspondences to select the good solution to your problem. You might also implement a custom optimization algorithm, testing many focal length values and keeping the solution leading to the lowest reprojection error.
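
Along those lines, here is a minimal sketch of such a brute-force search, assuming square pixels (fx = fy = f) and a principal point at (0, 0) to match the normalized coordinates in the question; the search range and step are arbitrary guesses:

#include <opencv2/calib3d.hpp>
#include <opencv2/core.hpp>
#include <limits>
#include <vector>

// For each candidate focal length f, solve the 4-point planar pose with
// cv::solvePnP and keep the candidate with the smallest reprojection error.
void searchFocalLength(const std::vector<cv::Point3f>& world,
                       const std::vector<cv::Point2f>& image,
                       cv::Mat& bestK, cv::Mat& bestRvec, cv::Mat& bestTvec)
{
    double bestErr = std::numeric_limits<double>::max();
    // With image coordinates normalized to [-1, 1], plausible focal
    // lengths are of order 1.
    for (double f = 0.2; f <= 5.0; f += 0.01) {
        cv::Mat K = (cv::Mat_<double>(3, 3) << f, 0, 0,
                                               0, f, 0,
                                               0, 0, 1);
        cv::Mat rvec, tvec;
        // The default iterative solver handles 4 coplanar points by
        // initializing from a homography internally.
        if (!cv::solvePnP(world, image, K, cv::noArray(), rvec, tvec))
            continue;

        std::vector<cv::Point2f> reproj;
        cv::projectPoints(world, rvec, tvec, K, cv::noArray(), reproj);
        const double err = cv::norm(image, reproj, cv::NORM_L2);
        if (err < bestErr) {
            bestErr = err;
            bestK = K.clone(); bestRvec = rvec; bestTvec = tvec;
        }
    }
}

This trades the missing initialization of step 1 for an exhaustive scan over the single remaining intrinsic unknown.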

answered Nov 02 '22 by BConic