
Vertex shader world transform, why do we use 4 dimensional vectors?

From this site: http://www.toymaker.info/Games/html/vertex_shaders.html

We have the following code snippet:

// transformations provided by the app, constant Uniform data
float4x4 matWorldViewProj: WORLDVIEWPROJECTION;

// the format of our vertex data
struct VS_OUTPUT
{
  float4 Pos  : POSITION;
};

// Simple Vertex Shader - carry out transformation
VS_OUTPUT VS(float4 Pos  : POSITION)
{
  VS_OUTPUT Out = (VS_OUTPUT)0;
  Out.Pos = mul(Pos,matWorldViewProj);
  return Out;
}

My question is: why does the struct VS_OUTPUT have a 4 dimensional vector as its position? Isn't position just x, y and z?

asked Dec 05 '22 04:12 by meds


2 Answers

Because you need the w coordinate for the perspective calculation. After the vertex shader outputs the position, DirectX performs a perspective divide, dividing x, y and z by w.

Essentially, if your output vertex position is (32768, -32768, 32768, 65536), then after the divide by w you get (0.5, -0.5, 0.5, 1). At this point w can be discarded as it is no longer needed. The result is then passed through the viewport transform, which maps it to usable 2D screen coordinates.
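To make the divide concrete, here is a minimal Python sketch of that step. The function name `perspective_divide` is made up for illustration; it is not a DirectX API, and in practice the hardware performs this step for you after the vertex shader runs.

```python
def perspective_divide(clip):
    """Divide a clip-space (x, y, z, w) position by its w component."""
    x, y, z, w = clip
    if w == 0:
        raise ValueError("w must be non-zero for the perspective divide")
    return (x / w, y / w, z / w, 1.0)


# The example position from the paragraph above.
print(perspective_divide((32768, -32768, 32768, 65536)))
# (0.5, -0.5, 0.5, 1.0)
```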

Edit: If you look at how the matrix multiplication is performed with the projection matrix, you can see how the values end up in the correct places.

Taking the projection matrix specified in D3DXMatrixPerspectiveLH:

2*zn/w  0       0              0
0       2*zn/h  0              0
0       0       zf/(zf-zn)     1
0       0       zn*zf/(zn-zf)  0

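And applying it to an arbitrary (x, y, z, 1) vertex input value (note that for a vertex position w is normally 1), you get the following. Inside the matrix terms, w and h are the width and height of the view volume at the near plane; the trailing "* w" in each equation is the input vertex's w component.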

x' = ((2*zn/w) * x) + (0 * y) + (0 * z) + (0 * w)
y' = (0 * x) + ((2*zn/h) * y) + (0 * z) + (0 * w)
z' = (0 * x) + (0 * y) + ((zf/(zf-zn)) * z) + ((zn*zf/(zn-zf)) * w)
w' = (0 * x) + (0 * y) + (1 * z) + (0 * w)

Instantly you can see that w' and z' are different. w' now just contains the z coordinate that was passed in to the projection matrix, while z' contains something more complicated.

So, assume we have an input position of (2, 1, 5, 1), a zn (Z-near) of 1, a zf (Z-far) of 10, a w (width) of 1 and an h (height) of 1.

Passing these values through, we get

x' = ((2 * 1)/1) * 2
y' = ((2 * 1)/1) * 1
z' = ((10/(10-1)) * 5) + (((1 * 10)/(1-10)) * 1)
w' = 5

Expanding that, we get (z' rounded to two decimal places)

x' = 4
y' = 2
z' = 4.44
w' = 5

We then perform the final perspective divide and we get

x'' = 0.8
y'' = 0.4
z'' = 0.89
w'' = 1

And now we have our final coordinate position. This assumes that x and y range from -1 to 1 and z ranges from 0 to 1. Since all three values fall within those ranges, the vertex is on-screen.
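The whole worked example can be checked with a short Python sketch. The helper names `perspective_lh` and `mul_row_vector` are assumptions made for this illustration (they only mirror the matrix layout shown above and are not D3DX calls), and the vector is treated as a row vector multiplied on the left of the matrix.

```python
def perspective_lh(w, h, zn, zf):
    """Build the projection matrix in the layout shown in the answer."""
    return [
        [2 * zn / w, 0.0, 0.0, 0.0],
        [0.0, 2 * zn / h, 0.0, 0.0],
        [0.0, 0.0, zf / (zf - zn), 1.0],
        [0.0, 0.0, zn * zf / (zn - zf), 0.0],
    ]


def mul_row_vector(v, m):
    """Row vector times matrix: out[j] = sum over i of v[i] * m[i][j]."""
    return [sum(v[i] * m[i][j] for i in range(4)) for j in range(4)]


proj = perspective_lh(w=1, h=1, zn=1, zf=10)
clip = mul_row_vector([2, 1, 5, 1], proj)   # (x', y', z', w')
ndc = [c / clip[3] for c in clip]           # perspective divide by w'

print([round(c, 2) for c in clip])  # [4.0, 2.0, 4.44, 5.0]
print([round(c, 2) for c in ndc])   # [0.8, 0.4, 0.89, 1.0]
```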

As a bonus, you can see that if |x'| or |y'| is larger than w', or if z' is less than 0 or larger than w', the vertex is off-screen. This information is used for clipping triangles to the screen.
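That off-screen test can be written directly on the pre-divide values. The sketch below is a hypothetical helper, not part of any graphics API, and it only classifies a single vertex; real triangle clipping also has to cut triangles that straddle the boundary.

```python
def inside_view_volume(clip):
    """True if a clip-space (x, y, z, w) point lies inside the view volume.

    Uses the ranges assumed in the answer: after the divide, x and y fall
    in [-1, 1] and z in [0, 1], which before the divide means
    -w <= x <= w, -w <= y <= w and 0 <= z <= w.
    """
    x, y, z, w = clip
    return -w <= x <= w and -w <= y <= w and 0 <= z <= w


print(inside_view_volume((4, 2, 4.44, 5)))   # True: the worked example
print(inside_view_volume((6, 2, 4.44, 5)))   # False: |x'| > w'
print(inside_view_volume((4, 2, -0.5, 5)))   # False: z' < 0
```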

Anyway, I think that's a pretty comprehensive answer :D

Edit2: Be warned, I am using row-major matrices. Column-major matrices are transposed.
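Reading that warning as being about the row-vector convention (v * M) versus the column-vector convention (M * v), the two give the same result once the matrix is transposed. A small Python sketch, using made-up helper names and an arbitrary translation matrix chosen only for illustration:

```python
def transpose(m):
    """Swap the rows and columns of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]


def mul_row_vector(v, m):
    """Row vector times matrix (the convention used in the answer)."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]


def mul_matrix_column(m, v):
    """Matrix times column vector (the transposed convention)."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


# Translation by (10, 20, 30) in the row-vector layout: offsets in the last row.
m = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [10, 20, 30, 1],
]
v = [2, 1, 5, 1]

print(mul_row_vector(v, m))                # [12, 21, 35, 1]
print(mul_matrix_column(transpose(m), v))  # [12, 21, 35, 1]
```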

answered Jan 25 '23 17:01 by Goz


Rotation is specified by a 3 x 3 matrix and translation by a vector. You can perform both transforms in a "single" operation by combining them into a single 3 x 4 matrix (3 rows, 4 columns):

rx1 rx2 rx3 tx1
ry1 ry2 ry3 ty1
rz1 rz2 rz3 tz1

However, as this isn't square, there are various operations that can't be performed (inversion for one). By adding an extra row (that leaves the transform itself unchanged):

0   0   0   1

all these operations become possible (if not easy).
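As a rough illustration of why the combined matrix is convenient, the Python sketch below builds a 4 x 4 matrix from a rotation about the z axis plus a translation, and applies it to a point with w = 1. The function names are hypothetical, and the layout follows the one shown above (translation in the last column, so the point is treated as a column vector).

```python
import math


def rotation_z_translation(angle, tx, ty, tz):
    """4 x 4 matrix: rotate about the z axis by `angle`, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        [c, -s, 0.0, tx],
        [s, c, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],  # the extra row
    ]


def mul_matrix_column(m, v):
    """Matrix times column vector: out[i] = sum over j of m[i][j] * v[j]."""
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]


m = rotation_z_translation(math.pi / 2, 10.0, 0.0, 0.0)
p = mul_matrix_column(m, [1.0, 0.0, 0.0, 1.0])

# (1, 0, 0) rotated 90 degrees about z is (0, 1, 0); adding (10, 0, 0) gives:
print([round(c, 6) for c in p])  # [10.0, 1.0, 0.0, 1.0]
```

Because the input's w is 1, the translation column is added exactly once, and the extra row passes w through unchanged.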

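As Goz explains in his answer, changing that extra row from 0 0 0 1 to other values turns the matrix into a perspective transformation. (Goz's matrix uses the transposed layout, so there it is the last column that changes.)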

answered Jan 25 '23 17:01 by ChrisF