
Camera pose estimation: How do I interpret rotation and translation matrices?

Assume I have good correspondences between two images and attempt to recover the camera motion between them. I can use OpenCV 3's new facilities for this, like this:

 // Estimate the essential matrix from the matched point sets
 Mat R, t, mask;
 Mat E = findEssentialMat(imgpts1, imgpts2, focal, principalPoint, RANSAC, 0.999, 1.0, mask);

 // Recover the relative rotation R and translation t;
 // the return value is the number of points passing the cheirality check
 int inliers = recoverPose(E, imgpts1, imgpts2, R, t, focal, principalPoint, mask);

 // Decompose R into Euler angles (in degrees)
 Mat mtxR, mtxQ;
 Mat Qx, Qy, Qz;
 Vec3d angles = RQDecomp3x3(R, mtxR, mtxQ, Qx, Qy, Qz);

 cout << "Translation: " << t.t() << endl;
 cout << "Euler angles [x y z] in degrees: " << angles.t() << endl;

Now, I have trouble wrapping my head around what R and t actually mean. Are they the transform needed to map coordinates from camera space 1 to camera space 2, as in p_2 = R * p_1 + t?

Consider this example, with manually labeled ground-truth correspondences:

[figure: two views of the scene with manually labeled correspondences]

The output I get is this:

Translation: [-0.9661243151855488, -0.04921320381132761, 0.253341406362796]
Euler angles [x y z] in degrees: [9.780449804801876, 46.49315494782735, 15.66510133665445]

I try to match this to what I see in the image and come up with the following interpretation: [-0.96, -0.05, 0.25] tells me I have moved to the right, as the coordinates have moved along the negative x-axis; but it would also tell me I have moved farther away, as the coordinates have moved along the positive z-axis.

I have also rotated the camera around the y-axis (to the left, which I think is a counter-clockwise rotation around the negative y-axis, because in OpenCV the y-axis points downwards, does it not?).

Question: Is my interpretation correct, and if not, what is the correct one?

oarfish · asked Jul 16 '15


1 Answer

Actually, your interpretation is correct.

First of all, you are correct about the orientation of the y axis. For an illustration of OpenCV's camera coordinate system, see here.

Your code will return the R and t from the second camera to the first. This means that if x1 is a point in the first image and x2 is the corresponding point in the second image, the equation x1 = R*x2 + t holds. Now, in your case, the right image (front view) is from camera 1 and the left image (side view of the car) is from camera 2.

Looking at this equation, we see that the rotation is applied first. So imagine your camera currently pictures the left frame. Your R specifies a rotation of about 46 degrees around the y axis. Since rotating points by an angle alpha is the same as counter-rotating the coordinate axes by that angle, your R tells you to rotate left. As you yourself point out, this seems correct when looking at the pictures. As the rotations around the other axes are small and hard to visualize, let's omit them here. So after applying the rotation, you are still standing at the position the left frame was taken from, but your camera now more or less points at the back of the car, or the space directly behind it.

Now let us look at the translation vector. Your interpretation about moving to the right and farther away is correct as well. Let me try to explain why. Imagine that, from your current position with the new camera direction, you only moved to the right: you would bump directly into the car, or would have to hold the camera above its hood. So after moving to the right, you also need to move farther away to reach the position the right picture was taken from.

I hope this explanation helped you imagine the movement your R and t describe.

Ann · answered Oct 11 '22