Transforming 2D image coordinates to 3D world coordinates with z = 0

Tags: c++, visual-studio, opencv, computer-vision, homography

OpenCV => 3.2
Operating System / Platform => Windows 64 Bit
Compiler => Visual Studio 2015

I am working on a project that involves detecting and tracking vehicles, then estimating and optimizing a cuboid around each vehicle. Detection and tracking already work. Next I need to find the 3D world coordinates of the image points at the edges of the vehicles' bounding boxes, estimate the world coordinates of the cuboid's edges, and project the cuboid back onto the image to display it.

I am new to computer vision and OpenCV, but as far as I understand, I need 4 points in the image whose world coordinates I know, and I can pass them to solvePnP in OpenCV to get the rotation and translation vectors (I already have the camera matrix and distortion coefficients). Then I use Rodrigues to convert the rotation vector into a rotation matrix, concatenate it with the translation vector to get the extrinsic matrix, and multiply the camera matrix by the extrinsic matrix to get the projection matrix. Since my z coordinate is zero, I drop the third column of the projection matrix, which gives the homography that maps world points on the z = 0 plane to 2D image points. The inverse of this homography maps 2D image points back to world points on that plane. I then multiply the image point [x, y, 1]t by the inverse homography to get [wX, wY, w]t and divide the whole vector by the scalar w to get [X, Y, 1], which gives the X and Y world coordinates.
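Written out, the plane homography follows from the pinhole model once Z = 0 is substituted. The symbols K (camera matrix), r_1, r_2, r_3 (columns of the rotation matrix), t (translation vector), s and w (scale factors) and H are notation introduced here for illustration, and lens distortion is ignored in this sketch:

```latex
s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
= K \begin{bmatrix} r_1 & r_2 & r_3 & t \end{bmatrix}
  \begin{bmatrix} X \\ Y \\ 0 \\ 1 \end{bmatrix}
= \underbrace{K \begin{bmatrix} r_1 & r_2 & t \end{bmatrix}}_{H}
  \begin{bmatrix} X \\ Y \\ 1 \end{bmatrix},
\qquad
\begin{bmatrix} wX \\ wY \\ w \end{bmatrix}
= H^{-1} \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
```

So H takes points on the world plane to pixels, H^{-1} takes pixels back to the plane, and dividing by w recovers X and Y.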

My code looks like this:

#include "opencv2/opencv.hpp"
#include <stdio.h>
#include <iostream>
#include <sstream>
#include <math.h> 
#include <conio.h>

using namespace cv;
using namespace std;

Mat cameraMatrix, distCoeffs, rotationVector, rotationMatrix,
    translationVector, extrinsicMatrix, projectionMatrix, homographyMatrix,
    inverseHomographyMatrix;

Point point;
vector<Point2d> image_points;
vector<Point3d> world_points;

int main()
{
    // Load the camera intrinsics from the calibration file.
    FileStorage fs1("intrinsics.yml", FileStorage::READ);

    fs1["camera_matrix"] >> cameraMatrix;
    cout << "Camera Matrix: " << cameraMatrix << endl << endl;

    fs1["distortion_coefficients"] >> distCoeffs;
    cout << "Distortion Coefficients: " << distCoeffs << endl << endl;

    // Four image points and their known world coordinates on the z = 0 plane.
    image_points.push_back(Point2d(275, 204));
    image_points.push_back(Point2d(331, 204));
    image_points.push_back(Point2d(331, 308));
    image_points.push_back(Point2d(275, 308));

    cout << "Image Points: " << image_points << endl << endl;

    world_points.push_back(Point3d(0.0, 0.0, 0.0));
    world_points.push_back(Point3d(1.775, 0.0, 0.0));
    world_points.push_back(Point3d(1.775, 4.620, 0.0));
    world_points.push_back(Point3d(0.0, 4.620, 0.0));

    cout << "World Points: " << world_points << endl << endl;

    // Estimate the camera pose from the 2D-3D correspondences.
    solvePnP(world_points, image_points, cameraMatrix, distCoeffs, rotationVector, translationVector);
    cout << "Rotation Vector: " << endl << rotationVector << endl << endl;
    cout << "Translation Vector: " << endl << translationVector << endl << endl;

    Rodrigues(rotationVector, rotationMatrix);
    cout << "Rotation Matrix: " << endl << rotationMatrix << endl << endl;

    // Build [R | t] and the 3x4 projection matrix K * [R | t].
    hconcat(rotationMatrix, translationVector, extrinsicMatrix);
    cout << "Extrinsic Matrix: " << endl << extrinsicMatrix << endl << endl;

    projectionMatrix = cameraMatrix * extrinsicMatrix;
    cout << "Projection Matrix: " << endl << projectionMatrix << endl << endl;

    // Drop the third column (z = 0) to get the 3x3 homography.
    double p11 = projectionMatrix.at<double>(0, 0),
        p12 = projectionMatrix.at<double>(0, 1),
        p14 = projectionMatrix.at<double>(0, 3),
        p21 = projectionMatrix.at<double>(1, 0),
        p22 = projectionMatrix.at<double>(1, 1),
        p24 = projectionMatrix.at<double>(1, 3),
        p31 = projectionMatrix.at<double>(2, 0),
        p32 = projectionMatrix.at<double>(2, 1),
        p34 = projectionMatrix.at<double>(2, 3);

    homographyMatrix = (Mat_<double>(3, 3) << p11, p12, p14, p21, p22, p24, p31, p32, p34);
    cout << "Homography Matrix: " << endl << homographyMatrix << endl << endl;

    inverseHomographyMatrix = homographyMatrix.inv();
    cout << "Inverse Homography Matrix: " << endl << inverseHomographyMatrix << endl << endl;

    // Map the first image point back to the world plane.
    Mat point2D = (Mat_<double>(3, 1) << image_points[0].x, image_points[0].y, 1);
    cout << "First Image Point" << point2D << endl << endl;

    Mat point3Dw = inverseHomographyMatrix*point2D;
    cout << "Point 3D-W : " << point3Dw << endl << endl;

    double w = point3Dw.at<double>(2, 0);
    cout << "W: " << w << endl << endl;

    Mat matPoint3D;
    divide(w, point3Dw, matPoint3D);

    cout << "Point 3D: " << matPoint3D << endl << endl;

    _getch();
    return 0;
}

I have the image coordinates of the four known world points and have hard-coded them for simplicity. image_points contains the image coordinates of the four points and world_points contains their world coordinates. I take the first world point as the origin (0, 0, 0) of the world axes and use known distances to calculate the coordinates of the other three points. After calculating the inverse homography matrix, I multiply it by [image_points[0].x, image_points[0].y, 1]t, which corresponds to the world coordinate (0, 0, 0). Then I divide the result by the third component w to get [X, Y, 1]. But when I print X and Y, they are not 0 and 0. What am I doing wrong?

The output of my code is like this:

Camera Matrix: [517.0036881709533, 0, 320;
0, 517.0036881709533, 212;
0, 0, 1]

Distortion Coefficients: [0.1128663679798094;
-1.487790079922432;
0;
0;
2.300571896761067]

Image Points: [275, 204;
331, 204;
331, 308;
275, 308]

World Points: [0, 0, 0;
1.775, 0, 0;
1.775, 4.62, 0;
0, 4.62, 0]

Rotation Vector:
[0.661476468596541;
-0.02794460022559267;
0.01206996342819649]

Translation Vector:
[-1.394495345140898;
-0.2454153722672731;
15.47126945512652]

Rotation Matrix:
[0.9995533907649279, -0.02011656447351923, -0.02209848058392758;
 0.002297501163799448, 0.7890323093017149, -0.6143474069013439;
 0.02979497438726573, 0.6140222623910194, 0.7887261380159]

Extrinsic Matrix:
[0.9995533907649279, -0.02011656447351923, -0.02209848058392758, -1.394495345140898;
 0.002297501163799448, 0.7890323093017149, -0.6143474069013439, -0.2454153722672731;
 0.02979497438726573, 0.6140222623910194, 0.7887261380159, 15.47126945512652]

Projection Matrix:
[526.3071813531748, 186.086785938988, 240.9673682002232, 4229.846989065414;
7.504351145361707, 538.1053336219271, -150.4099339268854, 3153.028471890794;
0.02979497438726573, 0.6140222623910194, 0.7887261380159, 15.47126945512652]

Homography Matrix:
[526.3071813531748, 186.086785938988, 4229.846989065414;
7.504351145361707, 538.1053336219271, 3153.028471890794;
0.02979497438726573, 0.6140222623910194, 15.47126945512652]

Inverse Homography Matrix:
[0.001930136511648154, -8.512427241879318e-05, -0.5103513244724983;
-6.693679705844383e-06, 0.00242178892313387, -0.4917279870709287;
-3.451449134581896e-06, -9.595179260534558e-05, 0.08513443835773901]

First Image Point[275;
204;
1]

Point 3D-W : [0.003070864657310213;
0.0004761913292736786;
0.06461112415423849]

W: 0.0646111
Point 3D: [21.04004290792539;
135.683117651025;
1]
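
As a sanity check on the homography printed above, the world origin can be pushed forward through it and compared with the measured pixel (275, 204). The following is a minimal standard-library sketch, not OpenCV code; `Vec3`, `Mat3`, `multiply`, `worldToPixel` and `kHomography` are names made up for illustration, and the matrix entries are copied from the output.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;

// Multiply a 3x3 matrix by a 3-vector.
Vec3 multiply(const Mat3& m, const Vec3& v) {
    Vec3 out = {0.0, 0.0, 0.0};
    for (int r = 0; r < 3; ++r) {
        for (int c = 0; c < 3; ++c) {
            out[r] += m[r][c] * v[c];
        }
    }
    return out;
}

// Map a point (X, Y) on the Z = 0 world plane to pixel coordinates via H.
std::array<double, 2> worldToPixel(const Mat3& H, double X, double Y) {
    const Vec3 world = {X, Y, 1.0};
    const Vec3 p = multiply(H, world);
    assert(std::fabs(p[2]) > 1e-12);  // the scale factor must be non-zero
    return {p[0] / p[2], p[1] / p[2]};
}

// "Homography Matrix" as printed in the output (hypothetical constant name).
const Mat3 kHomography = {{
    {526.3071813531748, 186.086785938988, 4229.846989065414},
    {7.504351145361707, 538.1053336219271, 3153.028471890794},
    {0.02979497438726573, 0.6140222623910194, 15.47126945512652}
}};
```

With these values, `worldToPixel(kHomography, 0.0, 0.0)` comes out at roughly (273.40, 203.80), about 1.6 px away from the measured (275, 204). A small residual like this is to be expected, since solvePnP returns a best-fit pose; the homography also ignores the distortion coefficients, which may contribute.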

286

asked May 22 '17 04:05

Indy

1 Answer

Your reasoning is sound, but the last division goes wrong, unless I am missing something.

Your result before W division is:

Point 3D-W : 
[0.003070864657310213;
0.0004761913292736786;
0.06461112415423849]

Now we need to normalize this by dividing all the coordinates by w (the third element of the vector), as you described in your question:

Point 3D-W Normalized = 
[0.003070864657310213 / 0.06461112415423849;
0.0004761913292736786 / 0.06461112415423849;
0.06461112415423849 / 0.06461112415423849]

Which results in:

Point 3D-W Normalized = 
[0.047528420183179314;
 0.007370113668614144;
 1.0]

That is very close to [0, 0], the expected world origin.
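
The numbers in the question's output are consistent with the argument order of `cv::divide`: the overload `divide(scale, src, dst)` computes `scale / src` element by element, so `divide(w, point3Dw, matPoint3D)` yields [w/x, w/y, 1] rather than [x/w, y/w, 1]. In OpenCV, writing `point3Dw / w` should give the intended result. The sketch below uses only the standard library to contrast the two; `normalizeByW`, `scaleOverElements` and `kPoint3Dw` are names made up for illustration.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;

// Intended normalization: divide every component by w = v[2].
Vec3 normalizeByW(const Vec3& v) {
    assert(std::fabs(v[2]) > 1e-12);  // w must be non-zero
    return {v[0] / v[2], v[1] / v[2], 1.0};
}

// What "scale / element" computes: w divided by each component.
Vec3 scaleOverElements(const Vec3& v) {
    return {v[2] / v[0], v[2] / v[1], v[2] / v[2]};
}

// "Point 3D-W" as printed in the question's output.
const Vec3 kPoint3Dw = {0.003070864657310213,
                        0.0004761913292736786,
                        0.06461112415423849};
```

Here `normalizeByW(kPoint3Dw)` gives approximately [0.0475, 0.0074, 1], while `scaleOverElements(kPoint3Dw)` gives approximately [21.04, 135.68, 1], which matches the "Point 3D" line in the question's output.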

109

answered Oct 19 '22 19:10

Pedro Batista
