Over the past weeks I've tried learning to rectify images and with the help of the people here I've managed to understand it better. About a week ago I set up a test example which I wanted to rectify (view the image from above). This worked fine (original: http://sitedezign.net/original.jpg and rectified: http://sitedezign.net/rectified.jpg) with the function T = cv2.getPerspectiveTransform(UV_cp, XYZ_gcp) where T becomes the Homography.
When I tried to do this with real world photos it failed because the realworld coordinates aren't perfectly on a plane (but simply around 10 control points measured in X, Y and Z coordinates in space). Therefor I decided to use solvePnP and hopefully be able to create a Homography which I can use.
I tried this on the test example but didn't get the results I expected: the image isn't rectified, and my Homography calculated using solvePnP isn't equal to the Homography calculated with getPerspectiveTransform.
My code:
# Set UV (image) and XYZ (real life)
UV_cp = np.array([[1300.0, 2544.0], # left down
[1607.0, 1000.0], # left up
[3681.0, 2516.0], # right down
[3320.0, 983.0]], np.float32) # right up
# Z is on 0 plane, so Z=0.0
XYZ_gcp = np.array([[0.0, 400.0, 0.0],
[0.0, 0.0, 0.0],
[300.0, 400.0, 0.0],
[300.0, 0.0, 0.0]], np.float32)
rvec, tvec = cv2.solvePnP(XYZ_gcp, UV_cp, K, D)
rotM_cam = cv2.Rodrigues(rvec)[0]
# calculate camera position (= translation), in mm from 0,0,0 point
cameraPosition = -np.matrix(rotM_cam).T * np.matrix(tvec)
# 3x3 Identity matrix
I = np.identity(3)
# [I|-C]
I1_extended = np.hstack((I,-cameraPosition))
# P = K*R*I
P_cam = K.dot(rotM_cam).dot(I1_extended)
# create P2 = image from above: R = 0,0,0, translation = x, y, z = 0,0,-1000 (mm)
R_rec = matr.getR(0.0,0.0,0.0)
newZ = -1000.0
new_cameraPosition = np.array([[0.0],[0.0],[newZ]])
I2_extended = np.hstack((I,new_cameraPosition))
P_rec = K.dot(R_rec).dot(I2_extended)
# correct Homography T from getPerspectiveTransform:
T = np.array([[4.70332834e-01, 9.35182514e-02, -4.24671558e+02],
[9.62104844e-03, 9.69462117e-01, -4.92461571e+02],
[3.54859924e-06, 6.80081146e-04, 1.00000000e+00]])
# Homography Matrix = H = P_rect * pinv(P) => P2 * pinv(P1)
H = P_rec.dot(np.linalg.pinv(P_cam))
The result is a warped image, which is far from equal to the one shown above (the rectified one). Also the Homography T which should be correct (from getPerspectiveTransform) is not close to equal to the homography calculated using the result from solvePnP (H).
H from solvePnP:
[[ 1.01865631e+00 2.68683332e-01 -2.04519580e+03]
[ -3.24304366e-02 6.82672680e-01 -1.15688010e+03]
[ 2.03399902e-05 1.24191993e-04 -5.41378561e-01]]
H from getPerspectiveTransform:
[[ 4.70332834e-01 9.35182514e-02 -4.24671558e+02]
[ 9.62104844e-03 9.69462117e-01 -4.92461571e+02]
[ 3.54859924e-06 6.80081146e-04 1.00000000e+00]]
Anyone has an idea what is going wrong?
PS: The code to determine the K matrix and distortion coefficients (values are taken from my camera Pentax K-5 at 33mm focal length according to Adobe Camera Raw):
# Focal length, sensor size (mm and px)
f = 33.0 # mm
pix_width = 4928.0 # sensor size has 4928px in width
pix_height = 3624.0 # sensor size has 4928px in width
sensor_width = 23.7 # mm
sensor_height = 15.7 # mm
# set center pixel
u0 = int(pix_width / 2.0)
v0 = int(pix_height / 2.0)
# determine values of camera-matrix
mu = pix_width / sensor_width # px/mm
alpha_u = f * mu # px
mv = pix_height / sensor_height # px/mm
alpha_v = f * mv # px
# Distortion coefs
D = np.array([[0.0, 0.0, 0.0, 0.0]])
# Camera matrix
K = np.array([[alpha_u, 0.0, u0],
[0.0, alpha_v, v0],
[0.0, 0.0, 1.0]])
Your K
matrix seems appropriate, but this might not be sufficient to achieve good accuracy with real images. I think instead of giving a priori reasonable values (in particular for the optical center pixel & lens distortion coefficients), you should calibrate your camera using the calibrateCamera
function (documentation link, tutorials). However, I don't think the problem you are mentionning is caused by that.
I think your problem comes from the definition of P_rec
.
First, be aware that if you use newZ = -1000.0
, you are actually translating the camera by 1000 meters (not millimeters).
Second, you have to be very careful what 3D points you are considering and where you want them to be projected in the image:
As you used XYZ_gcp
in the solvePnP
function, this means that you are using these coordinates as 3D points.
As you used XYZ_gcp
in the getPerspectiveTransform
function, this means that you are also using these as 2D coordinates. Notice that, strictly speaking, you cannot do this because getPerspectiveTransform
expects two 4x2 arrays (not a 4x2 and a 4x3), but I'll assume you dropped the third coordinates which are always 0.
Hence, your P_rec
should be defined so that [x; y; 1] = P_rec * [x; y; 0; 1]. Therefore, P_rec should be defined as follows:
P_rec = [ [1 0 0 0] [0 1 0 0] [0 0 0 1] ].
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With