All of the Stage3D examples I have seen build the model-view-projection matrix in AS3 on each render event, e.g.:
modelMatrix.identity();
// Create model matrix here, e.g. with Matrix3D.appendScale / appendRotation / appendTranslation
...
modelViewProjectionMatrix.identity();
modelViewProjectionMatrix.append( modelMatrix );
modelViewProjectionMatrix.append( viewMatrix );
modelViewProjectionMatrix.append( projectionMatrix );
// Model view projection matrix to vertex constant register 0
context3D.setProgramConstantsFromMatrix( Context3DProgramType.VERTEX, 0, modelViewProjectionMatrix, true );
...
A single line in the vertex shader then transforms the vertex into clip space:
m44 op, va0, vc0
Is there a reason for doing it this way? Aren't these kinds of calculations what the GPU was made for?
Why not instead update the view and projection matrices only when they change, and upload each to its own set of registers:
// Projection matrix to vertex constant register 0
// This could be done once on initialization or when the projection matrix changes
context3D.setProgramConstantsFromMatrix(Context3DProgramType.VERTEX, 0, projectionMatrix, true);
// View matrix to vertex constant register 4
context3D.setProgramConstantsFromMatrix(Context3DProgramType.VERTEX, 4, viewMatrix, true);
Then, on each frame and for each object:
modelMatrix.identity();
// Create model matrix here, e.g. with Matrix3D.appendScale / appendRotation / appendTranslation
...
// Model matrix to vertex constant register 8
context3D.setProgramConstantsFromMatrix(Context3DProgramType.VERTEX, 8, modelMatrix, true);
...
The shader would then look like this:
// Perform the model-view-projection transformation, using temporary register 0 (vt0) for intermediate results
// - Multiply the vertex position by the model matrix (vc8)
m44 vt0, va0, vc8
// - Multiply the result by the view matrix (vc4)
m44 vt0, vt0, vc4
// - Multiply the result by the projection matrix (vc0) and write it to the output register
m44 op, vt0, vc0
I have since found another question that may already answer this one:
DirectX world view matrix multiplications - GPU or CPU the place
This is a tough optimization problem. The first thing you should ask: Is that really a bottleneck? If yes, you have to consider this:
There is no simple answer. For speed I would let the GPU do the work. But in many cases you might want a compromise: send the model->world and the world->clip matrices, like classic OpenGL. For Molehill (Stage3D) specifically, do more of the work on the GPU in the vertex program. But always make sure that this really is a bottleneck before worrying about it too much.
tl;dr: Do it in the vertex program if you can!
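The compromise described above (one model->world matrix plus one world->clip matrix) could look roughly like the sketch below. The class name, method names and register layout are hypothetical and chosen only for illustration. The view and projection matrices are combined on the CPU only when the camera changes, and the shader does two m44 operations per vertex instead of three.

```actionscript
package
{
    import flash.display3D.Context3D;
    import flash.display3D.Context3DProgramType;
    import flash.geom.Matrix3D;

    // Hypothetical helper, for illustration only.
    public class TwoMatrixUploader
    {
        // Cached world->clip matrix (view followed by projection).
        private var viewProjectionMatrix:Matrix3D = new Matrix3D();

        // Call only when the camera or the projection changes.
        public function updateCamera(context3D:Context3D, viewMatrix:Matrix3D, projectionMatrix:Matrix3D):void
        {
            viewProjectionMatrix.copyFrom(viewMatrix);
            viewProjectionMatrix.append(projectionMatrix);
            // World->clip matrix to vertex constant registers 0-3
            context3D.setProgramConstantsFromMatrix(Context3DProgramType.VERTEX, 0, viewProjectionMatrix, true);
        }

        // Call once per object per frame, before drawing that object.
        public function uploadModel(context3D:Context3D, modelMatrix:Matrix3D):void
        {
            // Model->world matrix to vertex constant registers 4-7
            context3D.setProgramConstantsFromMatrix(Context3DProgramType.VERTEX, 4, modelMatrix, true);
        }

        // AGAL vertex program source (assumed layout: va0 = position).
        public static const VERTEX_SHADER:String =
            "m44 vt0, va0, vc4\n" + // model -> world
            "m44 op, vt0, vc0";     // world -> clip
    }
}
```

This keeps the per-object CPU work down to a single constant upload while still leaving one matrix product out of the vertex program. Whether it is actually faster than either extreme depends on vertex counts and object counts, so it is worth profiling.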
Don't forget that the vertex shader runs once per vertex, so you end up doing the multiplication hundreds of thousands of times per frame,
while the AS3 version only does the multiplication once per object per frame.
As with every performance problem:
Optimize stuff that runs often and ignore the things that run only now and then.
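As a rough worked count of the point above, assuming a hypothetical mesh of N = 100,000 vertices, row-vector convention (matching the append order model, view, projection), and counting one m44 as four dp4 operations and one naive 4x4 matrix product as 64 scalar multiplications:

```latex
\begin{align*}
v' &= ((v\,M)\,V)\,P = v\,(M\,V\,P)
   && \text{same result either way (associativity)} \\
\text{GPU-composed, per vertex} &= 3 \times 4 = 12 \ \text{dp4} \\
\text{CPU-composed, per vertex} &= 1 \times 4 = 4 \ \text{dp4} \\
\text{extra GPU work per frame} &= (12 - 4) \times 100\,000 = 800\,000 \ \text{dp4} \\
\text{CPU work saved per object per frame} &= 2 \times 64 = 128 \ \text{scalar multiplications}
\end{align*}
```

These are operation counts, not timings: the GPU runs vertices in parallel, so the extra dp4 work may or may not be measurable in practice.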