Camera position in world coordinate from cv::solvePnP

Tags:

c++

opencv

computer-vision

opengl

pose-estimation

I have a calibrated camera (intrinsic matrix and distortion coefficients) and I want to know the camera position knowing some 3d points and their corresponding points in the image (2d points).

I know that cv::solvePnP could help me, and after reading this and this I understand that the outputs of solvePnP, rvec and tvec, are the rotation and translation of the object in the camera coordinate system.

So I need to find out the camera rotation/translation in the world coordinate system.

From the links above it seems that the code is straightforward, in python:

    found, rvec, tvec = cv2.solvePnP(object_3d_points, object_2d_points, camera_matrix, dist_coefs)
    rotM = cv2.Rodrigues(rvec)[0]
    cameraPosition = -np.matrix(rotM).T * np.matrix(tvec)

I don't know Python/NumPy (I'm using C++), but this does not make a lot of sense to me:

rvec, tvec output from solvePnP are 3x1 matrices (3-element vectors)
cv2.Rodrigues(rvec) is a 3x3 matrix
cv2.Rodrigues(rvec)[0] is a 3x1 matrix (a 3-element vector)
cameraPosition is a 3x1 * 1x3 matrix multiplication, which is a... 3x3 matrix. How can I use this in OpenGL with simple glTranslatef and glRotate calls?

741

asked Sep 05 '13 13:09

nkint

1 Answer

If by "world coordinates" you mean "object coordinates", you have to invert the transformation returned by the PnP algorithm.

There is a trick for inverting rigid transformation matrices that avoids a general matrix inversion, which is usually expensive, and it explains the Python code. Given a transformation [R|t], we have inv([R|t]) = [R'|-R'*t], where R' is the transpose of R. So you can write (not tested):

    cv::Mat rvec, tvec;
    solvePnP(..., rvec, tvec, ...);
    // rvec is 3x1, tvec is 3x1

    cv::Mat R;
    cv::Rodrigues(rvec, R); // R is 3x3

    R = R.t();         // rotation of inverse
    tvec = -R * tvec;  // translation of inverse

    cv::Mat T = cv::Mat::eye(4, 4, R.type()); // T is 4x4
    T( cv::Range(0,3), cv::Range(0,3) ) = R * 1;    // copies R into T
    T( cv::Range(0,3), cv::Range(3,4) ) = tvec * 1; // copies tvec into T

    // T is a 4x4 matrix with the pose of the camera in the object frame

Update: Later, to use T with OpenGL you have to keep in mind that the axes of the camera frame differ between OpenCV and OpenGL.

OpenCV uses the reference frame usually used in computer vision: X points to the right, Y down, Z to the front (as in this image). The frame of the camera in OpenGL is: X points to the right, Y up, Z to the back (as in the left hand side of this image). So, you need to apply a rotation of 180 degrees around the X axis. The general formula for this rotation matrix is on Wikipedia; for 180 degrees it reduces to diag(1, -1, -1), i.e. it flips the signs of Y and Z.

    // T is your 4x4 matrix in the OpenCV frame
    cv::Mat RotX = ...; // 4x4 matrix with a 180 deg rotation around X
    cv::Mat Tgl = T * RotX; // OpenGL camera in the object frame
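As a rough illustration of what that product does, here is a sketch in plain C++ using `std::array` instead of `cv::Mat`. The names `rotX180`, `multiply` and `toOpenGLCamera` are hypothetical helpers, not OpenCV or OpenGL functions. Since cos(180°) = -1 and sin(180°) = 0, the rotation is just diag(1, -1, -1, 1) in homogeneous form.

```cpp
#include <array>

using Mat4 = std::array<std::array<double, 4>, 4>;

// 180-degree rotation about X in homogeneous coordinates: diag(1, -1, -1, 1)
Mat4 rotX180() {
    Mat4 m{};  // zero-initialized
    m[0][0] = 1.0;
    m[1][1] = -1.0;
    m[2][2] = -1.0;
    m[3][3] = 1.0;
    return m;
}

// Plain 4x4 matrix product c = a * b
Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 c{};
    for (int i = 0; i < 4; ++i) {
        for (int j = 0; j < 4; ++j) {
            for (int k = 0; k < 4; ++k) {
                c[i][j] += a[i][k] * b[k][j];
            }
        }
    }
    return c;
}

// Right-multiplying by RotX negates the Y and Z axis columns of T
// and leaves the translation column untouched.
Mat4 toOpenGLCamera(const Mat4& T) {
    return multiply(T, rotX180());
}
```

Because the rotation is applied on the right, the camera position (last column of T) stays the same; only the directions of the camera's Y and Z axes are flipped.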

These transformations are always confusing and I may be wrong at some step, so take this with a grain of salt.

Finally, keep in mind that OpenCV stores matrices in row-major order in memory, while OpenGL expects them in column-major order.
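The storage-order difference amounts to a transpose when flattening the matrix. A minimal sketch in plain C++, where `toColumnMajor` is a hypothetical helper name and the input is assumed to be a 4x4 matrix indexed as `m[row][col]`:

```cpp
#include <array>
#include <cstddef>

using Mat4 = std::array<std::array<double, 4>, 4>;

// Flatten a row-indexed 4x4 matrix into the column-major float layout
// that OpenGL expects: element (r, c) goes to index c * 4 + r.
std::array<float, 16> toColumnMajor(const Mat4& m) {
    std::array<float, 16> out{};
    for (std::size_t r = 0; r < 4; ++r) {
        for (std::size_t c = 0; c < 4; ++c) {
            out[c * 4 + r] = static_cast<float>(m[r][c]);
        }
    }
    return out;
}
```

In legacy fixed-function OpenGL, the resulting array's `data()` pointer is the kind of argument `glLoadMatrixf` or `glMultMatrixf` takes. With a column-major layout the translation ends up in elements 12, 13 and 14.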

114

answered Sep 19 '22 00:09

svick
