I am trying to get a global pose estimate from an image of four fiducials with known global positions using my webcam.
I have checked many Stack Exchange questions and a few papers and I cannot seem to get a correct solution. The position numbers I do get out are repeatable, but in no way linearly proportional to camera movement. FYI, I am using C++ OpenCV 2.1.
At this link is pictured my coordinate systems and the test data used below.
% Input to solvePnP():
imagePoints = [ 481, 831; % [x, y] format
520, 504;
1114, 828;
1106, 507]
objectPoints = [0.11, 1.15, 0; % [x, y, z] format
0.11, 1.37, 0;
0.40, 1.15, 0;
0.40, 1.37, 0]
% camera intrinsics for Logitech C910
cameraMat = [1913.71011, 0.00000, 1311.03556;
0.00000, 1909.60756, 953.81594;
0.00000, 0.00000, 1.00000]
distCoeffs = [0, 0, 0, 0, 0]
% output of solvePnP():
tVec = [-0.3515;
0.8928;
0.1997]
rVec = [2.5279;
-0.09793;
0.2050]
% using Rodrigues to convert back to rotation matrix:
rMat = [0.9853, -0.1159, 0.1248;
-0.0242, -0.8206, -0.5708;
0.1686, 0.5594, -0.8114]
So far, can anyone see anything wrong with these numbers? I would appreciate it if someone would check them, for example in MATLAB (the code above is m-file friendly).
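One way to sanity-check the numbers above without OpenCV is to re-project the objectPoints through the reported pose and intrinsics and compare against the imagePoints. Below is a minimal sketch in Python/NumPy (a hand-rolled Rodrigues formula stands in for cv2.Rodrigues); because the rVec and tVec printed above are rounded, the reprojection only matches to within roughly ten pixels:

```python
import numpy as np

def rodrigues(rvec):
    """Rotation vector -> rotation matrix via Rodrigues' formula."""
    theta = np.linalg.norm(rvec)
    k = rvec / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

objectPoints = np.array([[0.11, 1.15, 0.0],
                         [0.11, 1.37, 0.0],
                         [0.40, 1.15, 0.0],
                         [0.40, 1.37, 0.0]])
imagePoints = np.array([[481.0, 831.0], [520.0, 504.0],
                        [1114.0, 828.0], [1106.0, 507.0]])
cameraMat = np.array([[1913.71011, 0.0, 1311.03556],
                      [0.0, 1909.60756, 953.81594],
                      [0.0, 0.0, 1.0]])
rVec = np.array([2.5279, -0.09793, 0.2050])
tVec = np.array([-0.3515, 0.8928, 0.1997])

R = rodrigues(rVec)
camPts = objectPoints @ R.T + tVec   # world frame -> camera frame
proj = camPts @ cameraMat.T          # apply intrinsics
proj = proj[:, :2] / proj[:, 2:3]    # perspective divide
err = np.abs(proj - imagePoints).max()
print(err)                           # ~10 px, from the rounded rVec/tVec
```

The residual of a few pixels here is consistent with the rounding of the printed vectors, so the solvePnP output itself looks plausible; the question is what to do with it.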
From this point, I am unsure of how to get the global pose from rMat and tVec. From what I have read in this question, getting the pose from rMat and tVec is simply:
position = transpose(rMat) * tVec % matrix multiplication
However I suspect from other sources that I have read it is not that simple.
To get the position of the camera in real-world coordinates, what do I need to do? I am unsure whether this is an implementation problem (though it is most likely a theory problem), so I would prefer an answer from someone who has used the solvePnP function successfully in OpenCV, although any ideas are welcome!
Thank you very much for your time.
What is pose estimation? The problem of determining the position and orientation of the camera relative to the object (or vice-versa). We use the correspondences between 2D image pixels (and thus camera rays) and 3D object points (from the world) to compute the pose.
A common use of OpenCV is identifying and recognizing objects in real-time images and video, which involves estimating the orientation and position of the object with respect to a coordinate system. The problem solved by OpenCV's solvePnP() is exactly such a pose estimation problem.
The Perspective-n-Point (PnP) problem is the problem of estimating the relative pose (six degrees of freedom) between an object and the camera, given a set of correspondences between 3D points and their projections on the image plane.
I solved this a while ago, apologies for the year delay.
Using the Python bindings of OpenCV 2.1, and also the newer version 3.0.0-dev, I have verified that to get the pose of the camera in the global frame you must:
_, rVec, tVec = cv2.solvePnP(objectPoints, imagePoints, cameraMatrix, distCoeffs)
Rt, _ = cv2.Rodrigues(rVec)  # Rodrigues returns (rotation matrix, jacobian)
R = Rt.transpose()
pos = -R.dot(tVec)           # matrix product, not elementwise multiplication
Now pos is the position of the camera expressed in the global frame (the same frame the objectPoints are expressed in). R is a direction cosine matrix (DCM), which is a good form in which to store the attitude. If you require Euler angles, you can convert the DCM to Euler angles for an XYZ rotation sequence using:
roll = atan2(-R[2][1], R[2][2])
pitch = asin(R[2][0])
yaw = atan2(-R[1][0], R[0][0])
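Applying these steps to the rounded rMat and tVec from the question gives a concrete check (a NumPy sketch; because the inputs are rounded, the outputs are approximate):

```python
import numpy as np

# rounded solvePnP output from the question
rMat = np.array([[0.9853, -0.1159, 0.1248],
                 [-0.0242, -0.8206, -0.5708],
                 [0.1686, 0.5594, -0.8114]])
tVec = np.array([-0.3515, 0.8928, 0.1997])

R = rMat.T          # camera-to-world rotation (DCM)
pos = -R @ tVec     # camera position in the global frame

# XYZ Euler angles extracted from the DCM
roll = np.arctan2(-R[2][1], R[2][2])
pitch = np.arcsin(R[2][0])
yaw = np.arctan2(-R[1][0], R[0][0])

print(pos)                               # ~[0.334, 0.580, 0.716]
print(np.degrees([roll, pitch, yaw]))    # ~[144.9, 7.2, 6.7]
```

A camera position of roughly (0.33, 0.58, 0.72) in the objectPoints frame is the kind of physically sensible, metre-scale result you should expect; if your numbers jump around non-linearly as the camera moves, the sign and transpose above are the usual culprits.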