 

3D Graphics Processing - How to calculate modelview matrix

I am having trouble understanding the math for converting from object space to view space. I am doing this in hardware, and I have the ATranspose matrix below:

ATranspose =

         [rightx      upx     lookx    0]
         [righty      upy     looky    0]
         [rightz      upz     lookz    0]
         [-eyeright -eyeup -eyelook    1]

Then to find the point we would do:

  [x,y,z,1] = [x',y',z',1]*ATranspose

  xnew = xold*rightx + yold*righty + zold*rightz + 1*(-eyeright)

but I am not sure if this is correct.

It could also be

   [x,y,z,1]^T = ATranspose * [x',y',z',1]^T

Can someone please explain this to me? I can't find anything online about it that isn't directly OpenGL code related. I just want to understand the math behind transforming points from object coordinates to eye coordinates.

slimbo asked Apr 27 '11

2 Answers

This answer is probably much longer than it needs to be. Jump down to the bottom 2 paragraphs or so if you already understand most of the matrix math.

It might be easiest to start by looking at a one-dimensional problem. In 1D, we have points on a line. We can scale them or we can translate them. Consider three points i, j, k and transformation matrix M.

M = [ s t ]
    [ 0 1 ]

i = [1]   j = [-2]   k = [0]
    [1]       [ 1]       [1]

 j     k  i
─┴──┴──┴──┴──┴─
-2 -1  0  1  2

When we multiply by M, we get:

i' = Mi = [ s t ][ 1] = [ s+t ]
          [ 0 1 ][ 1]   [  1  ]

j' = Mj = [ s t ][-2] = [-2s+t]
          [ 0 1 ][ 1]   [  1  ]

k' = Mk = [ s t ][ 0] = [  t  ]
          [ 0 1 ][ 1]   [  1  ]

So if we assign values to s and t, then we get various transformations on our 1D 'triangle'. Scaling changes the distance between the 'points', while pure translation moves them around with respect to the origin while keeping the spacing constant:

   s=1 t=0           s=2 t=1           s=1 t=2
 j     k  i        j     k  i        j     k  i   
─┴──┴──┴──┴──┴─   ─┴──┴──┴──┴──┴─   ─┴──┴──┴──┴──┴─
-2 -1  0  1  2    -3 -1  1  3  5     0  1  2  3  4
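These mappings are easy to check numerically. A minimal Python sketch (plain lists, nothing beyond the standard library):

```python
# Apply the 1D homogeneous transform M = [[s, t], [0, 1]] to a point
# represented as [x, 1]; the new coordinate is the first row of M times [x, 1].
def transform_1d(s, t, x):
    return s * x + t

points = {"i": 1, "j": -2, "k": 0}

for s, t in [(1, 0), (2, 1), (1, 2)]:
    moved = {name: transform_1d(s, t, x) for name, x in points.items()}
    print(f"s={s} t={t}: {moved}")
# s=2 t=1 sends j, k, i to -3, 1, 3 -- matching the middle number line above
```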

It's important to note that order of the transformations is critical. These 1D transformations scale and then translate. If you were to translate first, then the 'point' would be a different distance from the origin and so the scaling factor would affect it differently. For this reason, the transformations are often kept in separate matrices so that the order is clear.
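To see why the order matters, compare the two compositions directly (a quick sketch; the numbers are just illustrative):

```python
# T*S: scale first, then translate (what the matrix M above encodes).
def scale_then_translate(s, t, x):
    return s * x + t

# S*T: translate first, then scale -- the translation gets scaled too.
def translate_then_scale(s, t, x):
    return s * (x + t)

print(scale_then_translate(2, 1, 1))  # 2*1 + 1 = 3
print(translate_then_scale(2, 1, 1))  # 2*(1 + 1) = 4
```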

If we move up to 2D, we get matrix N:

   [1 0 tx][ cos(a) sin(a) 0][sx  0 0] [ sx*cos(a) sy*sin(a) tx ]
N =[0 1 ty][-sin(a) cos(a) 0][ 0 sy 0]=[-sx*sin(a) sy*cos(a) ty ]
   [0 0 1 ][   0      0    1][ 0  0 1] [    0         0      1  ]

This matrix will 1) scale a point by sx,sy, 2) rotate the point around the origin by a degrees, and then 3) translate the point by tx,ty. Note that this matrix is constructed under the assumption that points are represented as column vectors and that the multiplication will take place as Np. As datenwolf said, if you want to use the row-vector representation of points but apply the same transformation, you can transpose everything and swap the order. This is a general property of matrix multiplication: (AB)^T = (B^T)(A^T).
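That transpose identity is easy to verify with a couple of small matrices (a Python sketch using plain nested lists):

```python
import math

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

a = math.radians(30)
# A rotation and a scale in the same (column-vector) convention as N above:
R = [[math.cos(a), math.sin(a), 0], [-math.sin(a), math.cos(a), 0], [0, 0, 1]]
S = [[2, 0, 0], [0, 3, 0], [0, 0, 1]]

# (R*S)^T == (S^T)*(R^T)
print(transpose(matmul(R, S)) == matmul(transpose(S), transpose(R)))  # True
```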

That said, we can talk about transformations in terms of object, world, and eye coordinates. Suppose the eye is sitting at the origin of the world, looking down the world's negative z-axis, with +x to the right and +y up, and the object, a cube, is sitting 10 units down -z (centered on the z-axis), with a width of 2 along the world's x, a depth of 3 along z, and a height of 4 along world y. Let the center of the cube be the origin of the object's local frame of reference, and let its local axes conveniently align with the world's axes. Then the vertices of the box in object coordinates are the variations on [+/-1, +/-2, +/-1.5]^T. The near, top, right (from the eye's point of view) vertex has object coordinates [1, 2, 1.5]^T; in world coordinates, the same vertex is [1, 2, -8.5]^T (1.5-10 = -8.5). Because of where the eye is, which way it's pointing, and the fact that we define our eye the same way as OpenGL does, that vertex has the same eye coordinates as world coordinates.

Now let's move and rotate the eye such that the eye's x is right (rt), the eye's y is up, the eye's -z is look (lk), and the eye is positioned at [eyeright(ex), eyeup(ey), eyelook(ez)]^T. Since we want object coordinates transformed to eye coordinates (meaning that we'll treat the eye as the origin), we'll take the inverse of these transformations and apply them to the object vertices (after they have been transformed into world coordinates). So we'll have:

ep = [WORLD_TO_EYE]*[OBJECT_TO_WORLD]*op;

More specifically, for our vertex of interest, we'll have:

[ rt.x  rt.y  rt.z 0][1 0 0 -ex][1 0 0  0 ][ 1 ]
[ up.x  up.y  up.z 0][0 1 0 -ey][0 1 0  0 ][ 2 ]
[-lk.x -lk.y -lk.z 0][0 0 1 -ez][0 0 1 -10][1.5]
[   0     0     0  1][0 0 0  1 ][0 0 0  1 ][ 1 ]

For convenience, I've separated out the translation from the rotation of the eye, so it's easier to see how each affects the result. Actually, now that I've written so much, this may be the point of confusion: the matrix that you gave will rotate and then translate. I assumed that the eye's translation was in world coordinates, but as you wrote it in your question, it's actually performing the translation in eye coordinates. I've also negated lk because we've defined the eye to be looking down the negative z-axis, but to make a standard rotation matrix, we want to use positive values.
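As a concrete check of that chain, here is a small sketch that multiplies the three matrices for the default eye (rt = +x, up = +y, lk = -z, and ex = ey = ez = 0), so the result should reproduce the world coordinates computed earlier:

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Default eye: rt = +x, up = +y, lk = -z, positioned at the world origin.
rt, up, lk = (1, 0, 0), (0, 1, 0), (0, 0, -1)
ex, ey, ez = 0, 0, 0

eye_rot = [[ rt[0],  rt[1],  rt[2], 0],
           [ up[0],  up[1],  up[2], 0],
           [-lk[0], -lk[1], -lk[2], 0],
           [     0,      0,      0, 1]]
eye_trans = [[1, 0, 0, -ex], [0, 1, 0, -ey], [0, 0, 1, -ez], [0, 0, 0, 1]]
obj_to_world = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, -10], [0, 0, 0, 1]]

M = matmul(eye_rot, matmul(eye_trans, obj_to_world))
print(matvec(M, [1, 2, 1.5, 1]))  # [1.0, 2.0, -8.5, 1.0] -- the world coords from before
```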

Anyway, I can keep going, but maybe this answers your question already.


Continuing:

Explaining the above a little further: separating the eye's transformation into two components also makes it much easier to find the inverse. It's easy to see that if translation tx moves the eye somewhere relative to the objects in the world, we can maintain the same relative positions between the eye and points in the world by moving everything in the world by -tx and keeping the eye stationary.

Likewise, consider the eye's orientation as defined by its default right, up, and look vectors:

     [1]      [0]      [ 0]
d_rt=[0] d_up=[1] d_lk=[ 0]
     [0]      [0]      [-1]

Creating a rotation matrix that points these three vectors in a new direction is easy. We just line up our three new axes rt, up, lk (as column vectors):

[rt.x up.x -lk.x 0]
[rt.y up.y -lk.y 0]
[rt.z up.z -lk.z 0]
[  0    0     0  1]

It's easy to see that if you augment d_rt, d_up, and d_lk and multiply by the above matrix, you get rt, up, and lk back, respectively. So we've applied the transformation that we wanted. To be a proper rotation, the three vectors must be orthonormal. This is really just a change of basis. Because of that fact, we can find the inverse of this matrix quite conveniently by taking its transpose. That's what I did above. If you apply that transposed matrix to all of the points in world coordinates and leave the eye still, the points will maintain the same positions relative to the eye as if the eye had rotated.
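Here is a quick numeric check of both claims: applying the matrix to a default axis returns the corresponding new axis, and the transpose is the inverse. The particular rt/up/lk values are an assumed example, taken from a rotation about the world y-axis:

```python
import math

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

a = math.radians(40)
# An orthonormal frame: the eye rotated by 40 degrees about the world y-axis.
rt = ( math.cos(a), 0.0, -math.sin(a))
up = ( 0.0,         1.0,  0.0)
lk = (-math.sin(a), 0.0, -math.cos(a))

R = [[rt[0], up[0], -lk[0]],
     [rt[1], up[1], -lk[1]],
     [rt[2], up[2], -lk[2]]]

# R sends the default axes to the new ones: R * d_rt is the first column, i.e. rt.
print([row[0] for row in R] == list(rt))  # True

# And because the columns are orthonormal, R^T * R is the identity.
I = matmul(transpose(R), R)
print(all(abs(I[r][c] - (1.0 if r == c else 0.0)) < 1e-12
          for r in range(3) for c in range(3)))  # True
```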


For Example:

Assign (in world coordinates):

   [ 0]    [0]    [-1]     [-2]     [1.5]
rt=[ 0] up=[1] lk=[ 0] eye=[ 0] obj=[ 0 ]
   [-1]    [0]    [ 0]     [ 1]     [-3 ]

(Figure: simple camera/grid example)
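Plugging those numbers in (a sketch that applies -eye first and then the transposed rotation, as described above):

```python
# Example values from above (all in world coordinates).
rt  = ( 0.0, 0.0, -1.0)
up  = ( 0.0, 1.0,  0.0)
lk  = (-1.0, 0.0,  0.0)
eye = (-2.0, 0.0,  1.0)
obj = ( 1.5, 0.0, -3.0)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Translate so the eye sits at the origin, then rotate with the
# transposed matrix (rt, up, -lk as rows).
d = [o - e for o, e in zip(obj, eye)]
eye_coords = [dot(rt, d), dot(up, d), -dot(lk, d)]
print(eye_coords)  # [4.0, 0.0, 3.5]
```

(Note that the positive z result means this particular obj point sits behind the eye under the -z-is-forward convention.)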

JCooper answered Oct 05 '22


If you transpose ATranspose in the second variant, i.e.

   [x,y,z,w]^T = ATranspose^T * [x',y',z',w']^T

then all these formulations are equally correct.

BTW, ^T means transpose, so the original author probably meant

   [x,y,z,w] = [x',y',z',w'] * A^T

which, rewritten for column vectors, is

   [x,y,z,w]^T = A^T * [x',y',z',w']^T
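That equivalence between the row-vector and column-vector formulations is easy to demonstrate with a tiny example (a Python sketch; the matrix A and the point are arbitrary):

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

A = [[1, 2], [3, 4]]

# Column-vector convention: p' = A * p
col = matmul(A, [[5], [6]])
print(col)  # [[17], [39]]

# Row-vector convention: p' = p * A^T yields the same numbers
row = matmul([[5, 6]], transpose(A))
print(row)  # [[17, 39]]
```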

datenwolf answered Oct 05 '22