I'm trying to estimate the relative camera pose using OpenCV. The cameras in my case are calibrated (I know the intrinsic parameters of the camera).
Given images captured at two positions, I need to find the relative rotation and translation between the two cameras. The typical translation is about 5 to 15 meters, and the yaw rotation between the cameras ranges between 0 and 20 degrees.
To achieve this, I follow these steps:

1. Compute the essential matrix from the fundamental matrix and the intrinsics: E = K^T F K.
2. Modify E to enforce the singularity constraint (singular values 1, 1, 0).
3. Decompose E by SVD and recover the rotation as R = U W V^T or R = U W^T V^T (U and V^T are obtained from the SVD of E); the translation is the last column of U, up to sign and scale.

For the real-data experiment, I captured images by mounting a camera on a tripod. I captured images at Position 1, then moved the tripod to another aligned position, changed the yaw angle in steps of 5 degrees, and captured images at Position 2.
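For reference, here is a minimal sketch of that pipeline in Python/OpenCV, assuming pts1 and pts2 are Nx2 arrays of matched image points from the two views and K is the calibrated 3x3 intrinsic matrix (the names are illustrative, not from any particular codebase):

```python
import cv2
import numpy as np

def relative_pose_from_matches(pts1, pts2, K):
    # Fundamental matrix from point correspondences (RANSAC rejects outliers)
    F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC)

    # Essential matrix from the calibrated intrinsics: E = K^T F K
    E = K.T @ F @ K

    # Enforce the singularity constraint: singular values (1, 1, 0)
    U, S, Vt = np.linalg.svd(E)
    E = U @ np.diag([1.0, 1.0, 0.0]) @ Vt

    # Decompose E: R = U W V^T or R = U W^T V^T, t = +/- u3 (up to scale)
    U, _, Vt = np.linalg.svd(E)
    W = np.array([[0.0, -1.0, 0.0],
                  [1.0,  0.0, 0.0],
                  [0.0,  0.0, 1.0]])
    R1 = U @ W @ Vt
    R2 = U @ W.T @ Vt
    t = U[:, 2]  # u3; the translation is recovered only up to scale

    # Make sure both candidates are proper rotations (det = +1)
    if np.linalg.det(R1) < 0:
        R1 = -R1
    if np.linalg.det(R2) < 0:
        R2 = -R2
    return R1, R2, t
```

This returns the two rotation candidates and one translation direction; together with -t they form the four solutions discussed in the answer below.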
Problems/Issues:
The behavior is the same on simulated data as well.
Has anybody experienced similar problems? Do you have any clue how to resolve them? Any help would be highly appreciated.
(I know there are already many posts on similar problems, but going through all of them has not helped me. Hence I am posting one more time.)
Camera pose describes the position and orientation of a camera in a world coordinate system, with six degrees of freedom (6DoF), using different representations, e.g., a transformation matrix.
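As a small illustration (not from the original post), a 6DoF pose given as a rotation R and a translation t can be packed into a single 4x4 homogeneous transformation matrix:

```python
import numpy as np

def pose_matrix(R, t):
    # Stack a 3x3 rotation and a 3-vector translation into a 4x4
    # homogeneous transform representing the full 6DoF pose.
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = np.asarray(t).ravel()
    return T
```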
The Perspective-n-Point problem, usually referred to as PnP, is the problem of finding the relative pose between an object and a camera from a set of n pairings between 3D points of the object and their corresponding 2D projections on the image plane, assuming that a model of the object is available.
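In OpenCV, PnP is solved by cv2.solvePnP. A minimal sketch, where object_points, image_points, K, and dist are placeholder inputs rather than names from the original post:

```python
import cv2

# Assumption: object_points is an Nx3 float array of 3D model points,
# image_points is the corresponding Nx2 array of pixel coordinates,
# K is the 3x3 intrinsic matrix, and dist holds distortion coefficients.
ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, dist)
R, _ = cv2.Rodrigues(rvec)  # Rodrigues vector -> 3x3 rotation matrix
```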
In chapter 9.6 of Hartley and Zisserman, they point out that, for a particular essential matrix E = U diag(1, 1, 0) V^T, if one camera is held in the canonical position/orientation, there are four possible solutions for the second camera matrix: [U W V^T | u3], [U W V^T | -u3], [U W^T V^T | u3], and [U W^T V^T | -u3], where u3 is the last column of U.
The difference between the first and third (and second and fourth) solutions is that the orientation is rotated by 180 degrees about the line joining the two cameras, called a "twisted pair", which sounds like what you are describing.
The book says that, in order to choose the correct combination of rotation and translation from the four options, you need to triangulate a test point in the scene and make sure that the point lies in front of both cameras (the cheirality test).
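OpenCV automates exactly this cheirality test: cv2.recoverPose triangulates the correspondences under all four candidate (R, t) pairs and returns the combination that places the points in front of both cameras. A minimal sketch, reusing E, pts1, pts2, and K from the question:

```python
import cv2

# n_good is the number of inliers that pass the cheirality check;
# R and t are the disambiguated rotation and translation direction.
n_good, R, t, mask = cv2.recoverPose(E, pts1, pts2, K)
```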