
What do I do with the fundamental matrix?

I am trying to reconstruct a 3d shape from multiple 2d images. I have calculated a fundamental matrix, but now I don't know what to do with it.

I am finding multiple conflicting answers on Stack Overflow and in academic papers. For example, one source says you need to compute the rotation and translation matrices from the fundamental matrix. Another says you need to find the camera matrices. A third says you need to find the homographies. A fourth says you need to find the epipolar lines.

Which is it?? (And how do I do it? I have read the H&Z book but I do not understand it. It says I can 'easily' use the 'direct formula' in result 9.14, but result 9.14 is neither easy nor direct to understand.)

Stack Overflow wants code, so here's what I have so far:

    import numpy as np
    import cv2

    # let's create some sample data

    Wpts = np.array([[1, 1, 1, 1],  # A Cube in world points
                     [1, 2, 1, 1],
                     [2, 1, 1, 1],
                     [2, 2, 1, 1],
                     [1, 1, 2, 1],
                     [1, 2, 2, 1],
                     [2, 1, 2, 1],
                     [2, 2, 2, 1]])


    Cpts = np.array([[0, 4, 0, 1],  # slightly up
                     [4, 0, 0, 1],
                     [-4, 0, 0, 1],
                     [0, -4, 0, 1]])
    Cangles = np.array([[0, -1, 0],  # slightly looking down
                        [-1, 0, 0],
                        [1, 0, 0],
                        [0, 1, 0]])



    views = []
    transforms = []
    clen = len(Cpts)
    for i in range(clen):
        cangle = Cangles[i]
        cpt = Cpts[i]

        transform = cameraTransformMatrix(cangle, cpt)
        transforms.append(transform)
        newpts = np.dot(Wpts, transform.T)
        view = cameraView(newpts)
        views.append(view)



    F = cv2.findFundamentalMat(views[0], views[1])[0]  # 3x3 fundamental matrix
    ## now what???  How do I recover the cube shape?

Edit: I do not know the camera parameters.


Fundamental Matrix

First, listen to the Fundamental Matrix Song ;).

The Fundamental Matrix only describes the mathematical relationship between your point correspondences in two images (x' in image 2, x in image 1). "That means, for all pairs of corresponding points holds $x'^{T}Fx = 0$" (Wikipedia). This also means that outliers or incorrect point correspondences directly degrade the quality of your Fundamental Matrix.
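To make the constraint concrete, here is a minimal NumPy sketch that builds a ground-truth F from a made-up camera pair and evaluates $x'^{T}Fx$ for a few synthetic correspondences. The intrinsics `K`, rotation `R`, translation `t` and the 3D points are assumptions chosen only for illustration; they do not come from the question.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x, so that skew(v) @ w == np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

# Hypothetical intrinsics and relative pose (illustration only).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
angle = np.deg2rad(10.0)
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
t = np.array([1.0, 0.0, 0.0])

# Ground-truth F for the cameras P = K[I|0] and P' = K[R|t].
K_inv = np.linalg.inv(K)
F = K_inv.T @ skew(t) @ R @ K_inv

# Project a few 3D points into both images as homogeneous pixels (u, v, 1).
X = np.array([[0.0, 0.0, 5.0], [1.0, -1.0, 6.0], [-1.0, 0.5, 4.0]])
x1 = (K @ X.T).T
x2 = (K @ (R @ X.T + t[:, None])).T
x1 = x1 / x1[:, 2:3]
x2 = x2 / x2[:, 2:3]

# Epipolar residual x'^T F x for every correspondence (zero for exact matches).
residuals = np.einsum("ij,jk,ik->i", x2, F, x1)
print(residuals)
```

With real matches the residuals are never exactly zero, and a correspondence with a large residual is a candidate outlier.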

Additionally, a similar structure, called the Trifocal Tensor, exists for the relationship between point correspondences across three images.

A metric 3D reconstruction using exclusively the properties of the Fundamental Matrix is not possible because "The epipolar geometry is the intrinsic projective geometry between two views. It is independent of scene structure, and only depends on the cameras’ internal parameters and relative pose." (HZ, p.239). From F alone, the cameras, and therefore the scene, can only be recovered up to a projective transformation.
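For reference, the "direct formula" from HZ that the question mentions (Result 9.14, as I read it) gives one camera pair compatible with F: $P = [I \mid 0]$ and $P' = [[e']_\times F \mid e']$, where $e'$ is the epipole in the second image with $e'^{T}F = 0$. The sketch below is a NumPy illustration under that reading; the function name `cameras_from_F` and the test matrix are my own assumptions. These cameras only support a reconstruction up to a projective transformation, which is why the calibrated route described next is needed for a metric shape.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x, so that skew(v) @ w == np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def cameras_from_F(F):
    """One canonical (projective) camera pair compatible with F."""
    # The epipole e' is the left null vector of F: the last column of U.
    U, _, _ = np.linalg.svd(F)
    e2 = U[:, 2]
    P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = np.hstack([skew(e2) @ F, e2.reshape(3, 1)])
    return P1, P2

# Hypothetical rank-2 matrix standing in for an estimated F.
angle = np.deg2rad(15.0)
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
F = skew(np.array([1.0, 0.2, 0.1])) @ R

P1, P2 = cameras_from_F(F)

# A pair (P, P') is compatible with F when P'^T F P is skew-symmetric.
M = P2.T @ F @ P1
print(np.abs(M + M.T).max())
```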

Camera matrix

Regarding your question of how to reconstruct the shape from multiple images: you need to know the camera matrices of your images (K', K). The camera matrix is a 3x3 matrix composed of the camera focal lengths or principal distance (fx, fy) as well as the optical center or principal point (cx, cy).

$K = \begin{pmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{pmatrix}$

You can derive your camera matrix using camera calibration.
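As a small illustration of what K does, the following sketch builds K from made-up values and maps a 3D point in the camera frame to pixel coordinates. The helper names `make_K` and `project` and all numbers are hypothetical; real values come from calibration.

```python
import numpy as np

def make_K(fx, fy, cx, cy):
    """Assemble the 3x3 camera matrix from focal lengths and principal point."""
    return np.array([[fx, 0.0, cx],
                     [0.0, fy, cy],
                     [0.0, 0.0, 1.0]])

def project(K, X_cam):
    """Project a 3D point given in the camera frame to pixel coordinates (u, v)."""
    x = K @ X_cam
    return x[:2] / x[2]

# Hypothetical intrinsics: 800 px focal length, principal point at (320, 240).
K = make_K(800.0, 800.0, 320.0, 240.0)

# u = fx * X/Z + cx and v = fy * Y/Z + cy, so this point lands at (360, 160)
# up to floating-point rounding.
pixel = project(K, np.array([0.1, -0.2, 2.0]))
print(pixel)
```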

Essential matrix

When you know your camera matrices, you can extend your Fundamental Matrix to an Essential Matrix E.

$E = (K')^TFK$

Loosely speaking, your Fundamental Matrix is now "calibrated".
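A minimal sketch of this step, assuming two made-up camera matrices `K1` (first image) and `K2` (second image) and a synthetic F built from a known pose. It applies $E = (K')^{T}FK$ and checks the characteristic property of an essential matrix: two equal singular values and one zero.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x, so that skew(v) @ w == np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

# Hypothetical intrinsics for image 1 (K1) and image 2 (K2).
K1 = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
K2 = np.array([[750.0, 0.0, 300.0], [0.0, 760.0, 250.0], [0.0, 0.0, 1.0]])

# Hypothetical relative pose with a unit-length translation.
angle = np.deg2rad(10.0)
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
t = np.array([1.0, 0.0, 0.0])

# Synthetic F for this setup, standing in for an estimated one.
E_true = skew(t) @ R
F = np.linalg.inv(K2).T @ E_true @ np.linalg.inv(K1)

# The step from the text: E = K'^T F K.
E = K2.T @ F @ K1

singular_values = np.linalg.svd(E, compute_uv=False)
print(singular_values)
```

An F estimated from real matches has an arbitrary overall scale and some noise, so the two non-zero singular values of the resulting E will only be approximately equal.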

The Essential Matrix can be used to get the rotation (rotation matrix R) and translation (vector t) of your second camera relative to your first, with the translation known only up to an unknown scale, so t will be a unit vector. For this purpose you can use the OpenCV functions decomposeEssentialMat or recoverPose (which uses the cheirality check), or read the more detailed explanations in HZ.
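To show what such a decomposition involves, here is a NumPy sketch of the standard SVD-based method, which yields four (R, t) candidates. It is written for illustration and is not OpenCV's implementation; the function name `decompose_essential` and the test pose are assumptions.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x, so that skew(v) @ w == np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def decompose_essential(E):
    """Return the four (R, t) candidates encoded by an essential matrix."""
    U, _, Vt = np.linalg.svd(E)
    # Force proper rotations; E is only defined up to sign, so this is harmless.
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0]])
    R1 = U @ W @ Vt
    R2 = U @ W.T @ Vt
    t = U[:, 2]  # unit vector; the true scale of the translation is lost
    return [(R1, t), (R1, -t), (R2, t), (R2, -t)]

# Hypothetical ground-truth pose used to build a test E.
angle = np.deg2rad(20.0)
R_true = np.array([[np.cos(angle), 0.0, np.sin(angle)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(angle), 0.0, np.cos(angle)]])
t_true = np.array([1.0, 0.2, 0.1])
t_true = t_true / np.linalg.norm(t_true)

candidates = decompose_essential(skew(t_true) @ R_true)
found = any(np.allclose(Rc, R_true, atol=1e-8) and np.allclose(tc, t_true, atol=1e-8)
            for Rc, tc in candidates)
print(found)
```

The cheirality check then selects the single candidate for which triangulated points lie in front of both cameras.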

Projection matrix

Knowing your translation and rotation, you can build the projection matrices for your images. The projection matrix is defined as $P = K [R|t]$; for the first image, R is the identity and t is zero. Finally, you can use triangulation (triangulatePoints) to derive the 3D coordinates of your image points. I recommend a subsequent bundle adjustment to refine the configuration. There is also an sfm module in OpenCV.
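The triangulation step can be sketched with a plain linear (DLT) solver in NumPy. This is an illustrative stand-in for triangulatePoints, not its implementation. The intrinsics, the pose and the cube are made-up values, and the true translation is used here, so the cube comes back at its true size; with a pose recovered from E the shape is only recovered up to scale.

```python
import numpy as np

def project(P, X):
    """Project a 3D point X with a 3x4 projection matrix P to pixels (u, v)."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one correspondence x1 <-> x2."""
    A = np.vstack([x1[0] * P1[2] - P1[0],
                   x1[1] * P1[2] - P1[1],
                   x2[0] * P2[2] - P2[0],
                   x2[1] * P2[2] - P2[1]])
    # The homogeneous 3D point is the right singular vector of A
    # belonging to the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

# Hypothetical intrinsics and relative pose (illustration only).
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
angle = np.deg2rad(10.0)
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
t = np.array([1.0, 0.0, 0.0])

# Projection matrices P = K[I|0] and P' = K[R|t].
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([R, t.reshape(3, 1)])

# Unit cube placed 4 units in front of the first camera.
cube = np.array([[x, y, z + 4.0]
                 for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)])

recovered = np.array([triangulate(P1, P2, project(P1, X), project(P2, X))
                      for X in cube])
print(np.abs(recovered - cube).max())
```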

Since knowledge of homographies or epipolar lines is not strictly necessary for the 3D reconstruction, I did not explain these concepts.


Tags: python, opencv, matrix, 3d, fundamental-matrix

john ktejik
