
Camera position in world coordinate from cv::solvePnP

I have a calibrated camera (intrinsic matrix and distortion coefficients) and I want to know the camera position knowing some 3d points and their corresponding points in the image (2d points).

I know that cv::solvePnP could help me, and after reading this and this I understand that the outputs of solvePnP, rvec and tvec, are the rotation and translation of the object in the camera coordinate system.

So I need to find out the camera rotation/translation in the world coordinate system.

From the links above it seems that the code is straightforward, in python:

    found, rvec, tvec = cv2.solvePnP(object_3d_points, object_2d_points, camera_matrix, dist_coefs)
    rotM = cv2.Rodrigues(rvec)[0]
    cameraPosition = -np.matrix(rotM).T * np.matrix(tvec)

I don't know Python/NumPy (I'm using C++), but this does not make a lot of sense to me:

  • rvec and tvec, the outputs of solvePnP, are 3x1 matrices (3-element vectors)
  • cv2.Rodrigues(rvec) is a 3x3 matrix
  • cv2.Rodrigues(rvec)[0] is a 3x1 matrix (a 3-element vector)
  • cameraPosition is a 3x1 * 1x3 matrix multiplication, which gives a 3x3 matrix. How can I use this in OpenGL with simple glTranslatef and glRotate calls?
nkint asked Sep 05 '13 13:09




1 Answer

If by "world coordinates" you mean "object coordinates", you need the inverse of the transformation returned by the PnP algorithm.

There is a trick for inverting rigid transformation matrices that avoids a general matrix inversion, which is usually expensive, and it explains the Python code. Given a transformation [R|t], we have that inv([R|t]) = [R'|-R'*t], where R' is the transpose of R (for a rotation matrix, the transpose is also the inverse). So, you can code (not tested):

    cv::Mat rvec, tvec;
    solvePnP(..., rvec, tvec, ...);
    // rvec is 3x1, tvec is 3x1

    cv::Mat R;
    cv::Rodrigues(rvec, R); // R is 3x3

    R = R.t();        // rotation of inverse
    tvec = -R * tvec; // translation of inverse

    cv::Mat T = cv::Mat::eye(4, 4, R.type()); // T is 4x4
    T( cv::Range(0,3), cv::Range(0,3) ) = R * 1;    // copies R into T
    T( cv::Range(0,3), cv::Range(3,4) ) = tvec * 1; // copies tvec into T

    // T is a 4x4 matrix with the pose of the camera in the object frame
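To see why the inverse gives the camera position, here is a minimal sketch of the same trick without OpenCV. It assumes the pose maps object points to camera points as x_cam = R * x_obj + t, so the camera center (x_cam = 0) sits at -R' * t in the object frame. The names Pose, apply and invert are made up for this illustration, and plain std::array is used instead of cv::Mat only to keep it dependency-free.

```cpp
#include <array>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<std::array<double, 3>, 3>;

// Rigid transform x_cam = R * x_obj + t (what solvePnP describes
// once rvec has been turned into a 3x3 matrix with cv::Rodrigues).
struct Pose {
    Mat3 R;
    Vec3 t;
};

// Hypothetical helper: apply the pose to a point, y = R * x + t.
Vec3 apply(const Pose& p, const Vec3& x) {
    Vec3 y{};
    for (int i = 0; i < 3; ++i) {
        y[i] = p.t[i];
        for (int j = 0; j < 3; ++j) y[i] += p.R[i][j] * x[j];
    }
    return y;
}

// Hypothetical helper: invert the pose with inv([R|t]) = [R' | -R' * t],
// avoiding a general matrix inversion.
Pose invert(const Pose& p) {
    Pose inv{}; // zero-initialized
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 3; ++j) {
            inv.R[i][j] = p.R[j][i];        // R' (transpose)
            inv.t[i] -= p.R[j][i] * p.t[j]; // accumulates -R' * t
        }
    }
    return inv;
}
```

With this, invert(pose).t would be the camera position in the object frame, and applying the original pose to that point should give the origin of the camera frame, which is a quick sanity check.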

Update: Later, to use T with OpenGL you have to keep in mind that the axes of the camera frame differ between OpenCV and OpenGL.

OpenCV uses the reference frame usually used in computer vision: X points to the right, Y down, Z to the front (as in this image). The frame of the camera in OpenGL is: X points to the right, Y up, Z to the back (as in the left hand side of this image). So, you need to apply a rotation of 180 degrees around the X axis. The formula of this rotation matrix is in Wikipedia.

    // T is your 4x4 matrix in the OpenCV frame
    cv::Mat RotX = ...; // 4x4 matrix with a 180 deg rotation around X
    cv::Mat Tgl = T * RotX; // OpenGL camera in the object frame
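As a rough sketch of that step: for a rotation of 180 degrees around X, cos is -1 and sin is 0, so the matrix reduces to diag(1, -1, -1, 1). The code below uses plain std::array instead of cv::Mat to stay self-contained, and the names rotX180, multiply and toOpenGLCamera are invented for this example.

```cpp
#include <array>

using Mat4 = std::array<std::array<double, 4>, 4>; // m[row][col]

// 180 degree rotation about the X axis: X is kept, Y and Z are negated.
Mat4 rotX180() {
    Mat4 m{}; // all zeros
    m[0][0] = 1.0;
    m[1][1] = -1.0;
    m[2][2] = -1.0;
    m[3][3] = 1.0;
    return m;
}

// Plain 4x4 matrix product c = a * b.
Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 c{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                c[i][j] += a[i][k] * b[k][j];
    return c;
}

// Tgl = T * RotX: flips the camera's Y and Z axes, keeps its position.
Mat4 toOpenGLCamera(const Mat4& T) {
    return multiply(T, rotX180());
}
```

Multiplying on the right only negates the second and third columns of T (the camera's Y and Z axes expressed in the object frame), so the translation column, i.e. the camera position, is left unchanged.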

These transformations are always confusing and I may be wrong at some step, so take this with a grain of salt.

Finally, keep in mind that OpenCV stores matrices in row-major order in memory, whereas OpenGL expects them in column-major order.
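The storage-order difference amounts to a transpose when the 16 values are laid out flat. A minimal sketch, assuming the matrix is indexed as m[row][col] (as a row-major cv::Mat would be) and using a made-up helper name toColumnMajor:

```cpp
#include <array>
#include <cstddef>

using Mat4 = std::array<std::array<double, 4>, 4>; // m[row][col], row-major

// Flatten a row-major 4x4 matrix into the column-major layout
// that OpenGL expects: element (r, c) goes to index c * 4 + r.
std::array<double, 16> toColumnMajor(const Mat4& m) {
    std::array<double, 16> out{};
    for (std::size_t r = 0; r < 4; ++r)
        for (std::size_t c = 0; c < 4; ++c)
            out[c * 4 + r] = m[r][c];
    return out;
}
```

In the legacy fixed-function pipeline, the data() pointer of the resulting array is the kind of buffer that glLoadMatrixd or glMultMatrixd takes. In column-major layout the translation ends up at indices 12, 13 and 14.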

svick answered Sep 19 '22 00:09