I am playing with the affine transform in OpenCV and I am having trouble getting an intuitive understanding of it workings, and more specifically, just how do I specify the parameters of the map matrix so I can get a specific desired result.
To setup the question, the procedure I am using is 1st to define a warp matrix, then do the transform.
In OpenCV the 2 routines are (I am using an example in the excellent book OpenCV by Bradski & Kaehler):
cvGetAffineTransorm(srcTri, dstTri, warp_matrix);
cvWarpAffine(src, dst, warp_mat);
To define the warp matrix, srcTri
and dstTri
are defined as:
CvPoint2D32f srcTri[3], dstTri[3];
srcTri[3]
is populated as follows:
srcTri[0].x = 0;
srcTri[0].y = 0;
srcTri[1].x = src->width - 1;
srcTri[1].y = 0;
srcTri[2].x = 0;
srcTri[2].y = src->height -1;
This is essentially the top left point, top right point, and bottom left point of the image for starting point of the matrix. This part makes sense to me.
But the values for dstTri[3]
just are confusing, at least, when I vary a single point, I do not get the result I expect.
For example, if I then use the following for the dstTri[3]
:
dstTri[0].x = 0;
dstTri[0].y = 0;
dstTri[1].x = src->width - 1;
dstTri[1].y = 0;
dstTri[2].x = 0;
dstTri[2].y = 100;
It seems that the only difference between the src and the dst point is that the bottom left point is moved to the right by 100 pixels. Intuitively, I feel that the bottom part of the image should be shifted to the right by 100 pixels, but this is not so.
Also, if I use the exact same values for dstTri[3]
that I use for srcTri[3]
, I would think that the transform would produce the exact same image--but it does not.
Clearly, I do not understand what is going on here. So, what does the mapping from the srcTri[]
to the dstTri[]
represent?
Here is a mathematical explanation of affine transform: this is a matrix of size 3x3 that applies the foolowing transformations on 2D vector: Scale in X axis, scaleY, rotation, skew, translation in x axis and y. These are 6 transformations and thus you have six elements in your 3x3 matrix. The bottom row is always [0 0 1]. Why? because the bottom row represents the a perspective transformation in axis x and y, and affine transformation does not include perspective transform. (If you want to apply perspective warping use homography: also 3x3 matrix )
What is the relation between 6 values you insert into affine matrix and the 6 transformation it does? Let us look at this 3x3 matrix like
e*Zx*cos(a), -q1*sin(a) , dx,
e*q2*sin(a), Z y*cos(a), dy,
0 , 0 , 1
The above explanation is mathematical. It assumes you multiply the matrix by column vector from the right. As far as I remember, Matlab uses reverse multiplication (row vector from the left) so you will need to transpose this matrix. I am pretty sure that openCV uses regular multiplication but you need to check it. Just enter only translation matrix (x shifted by 10 pixels, y by 1).
1,0,10
0,1,1
0,0,1
If you see a normal shift than everything is OK, but If shit appears than transpose the matrix to:
1,0,0
0,1,0
10,1,1
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With