I wrote a simple script to project 3D points into an image based on the camera intrinsics and extrinsics. But when I have a camera at the origin pointing down the negative z-axis and a 3D point further down the negative z-axis, it appears to be behind the camera instead of in front of it. Here's my script; I've checked it so many times.
import numpy as np

def project(point, P):
    Hp = P.dot(point)
    if Hp[-1] < 0:
        print('Point is behind camera')
    Hp = Hp / Hp[-1]
    print(Hp[0][0], 'x', Hp[1][0])
    return Hp[0][0], Hp[1][0]

if __name__ == '__main__':
    # Rc and C are the camera orientation and location in world coordinates.
    # Camera posed at the origin, pointed down the negative z-axis.
    Rc = np.eye(3)
    C = np.array([0, 0, 0])

    # Camera extrinsics
    R = Rc.T
    t = -R.dot(C).reshape(3, 1)

    # The camera projection matrix is then:
    #   P = K [R | t]
    # which projects 3D world points to 2D homogeneous image coordinates.

    # Example intrinsics; the exact values don't really matter ...
    K = np.array([
        [2000, 0, 2000],
        [0, 2000, 1500],
        [0, 0, 1],
    ])

    # Sample point in front of the camera, i.e. further down the negative
    # z-axis; it should project into the center of the image.
    point = np.array([[0, 0, -10, 1]]).T

    # Project the point into the camera
    P = K.dot(np.hstack((R, t)))

    # But when projecting, it appears to be behind the camera?
    project(point, P)
The only thing I can think of is that the identity rotation matrix doesn't correspond to the camera pointing down the negative z-axis with the up vector along the positive y-axis. But I can't see how this wouldn't be the case: if, for example, I had constructed Rc from a function like gluLookAt and given it a camera at the origin pointing down the negative z-axis, I would get the identity matrix.
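For example, if I build Rc with a gluLookAt-style helper (a rough sketch; look_at_rotation is just an illustrative name, not something from my actual script), a camera at the origin looking down the negative z-axis with positive y up does give the identity matrix:

import numpy as np

def look_at_rotation(eye, target, up):
    # Camera-to-world rotation for a camera that looks down its own
    # negative z-axis (gluLookAt-style convention).
    forward = target - eye
    forward = forward / np.linalg.norm(forward)
    right = np.cross(forward, up)
    right = right / np.linalg.norm(right)
    true_up = np.cross(right, forward)
    # Columns are the camera's right, up and backward axes in world coordinates.
    return np.column_stack((right, true_up, -forward))

Rc = look_at_rotation(np.array([0.0, 0.0, 0.0]),   # eye at the origin
                      np.array([0.0, 0.0, -1.0]),  # looking down the negative z-axis
                      np.array([0.0, 1.0, 0.0]))   # positive y is up
print(Rc)  # prints the 3x3 identity matrix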
In computer vision, a camera matrix or (camera) projection matrix is a 3×4 matrix which describes the mapping of a pinhole camera from 3D points in the world to 2D points in an image.
Coordinates of a point in world space are defined with respect to the world Cartesian coordinate system; camera space is the space in which points are defined with respect to the camera coordinate system. To convert points from world space to camera space, we multiply them by the inverse of the camera-to-world matrix.
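As a rough sketch of that conversion (the variable names are illustrative, and the numbers simply reuse the intrinsics from the question): build the 4×4 camera-to-world matrix from Rc and C, invert it to get the world-to-camera transform, and then P = K [R | t] maps a homogeneous world point to homogeneous image coordinates.

import numpy as np

# Illustrative camera-to-world transform: orientation Rc and centre C.
Rc = np.eye(3)
C = np.array([0.0, 0.0, 0.0])
cam_to_world = np.eye(4)
cam_to_world[:3, :3] = Rc
cam_to_world[:3, 3] = C

# World -> camera is the inverse of camera -> world.
world_to_cam = np.linalg.inv(cam_to_world)
R = world_to_cam[:3, :3]
t = world_to_cam[:3, 3].reshape(3, 1)

# The 3x4 camera matrix P = K [R | t] then maps homogeneous world points
# to homogeneous image coordinates.
K = np.array([[2000.0, 0.0, 2000.0],
              [0.0, 2000.0, 1500.0],
              [0.0, 0.0, 1.0]])
P = K @ np.hstack((R, t))
X_world = np.array([0.0, 0.0, -10.0, 1.0])
x = P @ X_world
print(x / x[-1])  # divide by the last component to get pixel coordinates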
Depth and inverse projection: when an image of a scene is captured by a camera, we lose depth information, as objects and points in 3D space are mapped onto a 2D image plane. This is also known as a projective transformation, in which points in the world are converted to pixels on a 2D plane.
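Because that depth is lost, a pixel on its own only determines a ray; with a known depth you can undo the projection. A minimal sketch (backproject is an illustrative name, and this assumes the usual computer-vision convention where the camera looks down its positive z-axis):

import numpy as np

def backproject(u, v, depth, K):
    # Back-project pixel (u, v) with known depth (distance along the
    # camera's viewing axis) to a 3D point in camera coordinates.
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])  # ray direction with z = 1
    return ray * depth

K = np.array([[2000.0, 0.0, 2000.0],
              [0.0, 2000.0, 1500.0],
              [0.0, 0.0, 1.0]])
# The principal point back-projects onto the optical axis.
print(backproject(2000.0, 1500.0, 10.0, K))  # [ 0.  0. 10.]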
I think the confusion is only in this line:
if Hp[-1] < 0:
    print('Point is behind camera')
because your setup assumes the camera looks down the negative Z-axis, i.e. the positive Z-axis comes out of the screen towards you, so a point with a positive Z value will actually be behind the camera:
if Hp[-1] > 0:
    print('Point is behind camera')
I seem to recall this choice is arbitrary and is made so that the 3D representation plays well with our 2D preconceptions: if you assume your camera is looking in the -Z direction, then negative X is to the left when positive Y points up. And in this case, only things with negative Z will be in front of the camera.
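As a quick sanity check with the numbers from your script (just a sketch using the flipped test under that -Z-forward convention): the point at z = -10 gives Hp[-1] = -10, so it is no longer reported as behind the camera, and it projects to the principal point (2000, 1500):

import numpy as np

K = np.array([[2000, 0, 2000],
              [0, 2000, 1500],
              [0, 0, 1]])
P = K @ np.hstack((np.eye(3), np.zeros((3, 1))))  # R = I, t = 0, as in your setup
point = np.array([0, 0, -10, 1])

Hp = P @ point
if Hp[-1] > 0:
    print('Point is behind camera')  # not triggered: Hp[-1] is -10
Hp = Hp / Hp[-1]
print(Hp[0], 'x', Hp[1])  # 2000.0 x 1500.0, the centre of the image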