
Should the model view projection matrix be built in ActionScript 3 or on the GPU in the vertex shader?

All of the Stage3D examples I have seen build the model view projection matrix in AS3 on each render event, e.g.:

modelMatrix.identity();
// Create the model matrix here, e.g. with Matrix3D's
// appendScale(), appendRotation() and appendTranslation()
...
modelViewProjectionMatrix.identity();
modelViewProjectionMatrix.append( modelMatrix );
modelViewProjectionMatrix.append( viewMatrix );
modelViewProjectionMatrix.append( projectionMatrix );
// Model view projection matrix to vertex constant register 0
context3D.setProgramConstantsFromMatrix( Context3DProgramType.VERTEX, 0, modelViewProjectionMatrix, true );
...

And a single line in the vertex shader transforms the vertex into clip space:

m44 op, va0, vc0

Is there a reason for doing it this way? Aren't these kinds of calculations what the GPU was made for?

Why not instead update the view and projection matrices only when they change, and upload each to its own set of registers:

// Projection matrix to vertex constant register 0
// This could be done once on initialization or when the projection matrix changes
context3D.setProgramConstantsFromMatrix(Context3DProgramType.VERTEX, 0, projectionMatrix, true);
// View matrix to vertex constant register 4
context3D.setProgramConstantsFromMatrix(Context3DProgramType.VERTEX, 4, viewMatrix, true);

Then on each frame and for each object:

modelMatrix.identity();
// Create the model matrix here, e.g. with Matrix3D's
// appendScale(), appendRotation() and appendTranslation()
...
// Model matrix to vertex constant register 8
context3D.setProgramConstantsFromMatrix(Context3DProgramType.VERTEX, 8, modelMatrix, true);
...

And the shader would instead look like this:

// Perform the model view projection transformation, using temporary register 0 (vt0) for intermediate results
// - Multiply vertex position by model matrix (vc8)
m44 vt0, va0, vc8
// - Multiply the result by view matrix (vc4)
m44 vt0, vt0, vc4
// - Multiply the result by projection matrix (vc0) and write it to the output register
m44 op, vt0, vc0

UPDATE

I have now found another question that may already answer this one:
DirectX world view matrix multiplications - GPU or CPU the place

cmann asked Nov 05 '22 02:11


2 Answers

This is a tough optimization problem. The first thing you should ask: Is that really a bottleneck? If yes, you have to consider this:

  • Doing the matrix multiply in AS3 is slower than it should be.
  • Extra matrix transforms in the vertex program are practically free.
  • Setting one matrix is faster than setting multiple matrices as constants!
  • Do you need the concatenated matrix somewhere else anyway? Picking maybe?

There is no simple answer. For speed I would let the GPU do the work. But in many cases you might want a compromise: send the model->world and the world->clip matrices, like classic OpenGL. For Molehill specifically, do more work on the GPU in the vertex program. But always make sure that this issue really is a bottleneck before worrying about it too much.

tl;dr: Do it in the vertex program if you can!
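
All three variants discussed here produce the same transformed vertex; they differ only in where the multiplications happen. The following is a minimal sketch in Python (not AS3 or AGAL) that checks this. It assumes column vectors, uses made-up matrices, and the helper names mat_mul and transform are hypothetical.

```python
# Sketch only: plain-Python 4x4 math with column vectors, not Stage3D code.

def mat_mul(a, b):
    """Return the 4x4 product a * b (b is applied first, then a)."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def transform(m, v):
    """Return the 4-component product m * v."""
    return [sum(m[r][k] * v[k] for k in range(4)) for r in range(4)]

# Made-up example matrices (assumptions for illustration only).
model = [[2, 0, 0, 1], [0, 2, 0, 2], [0, 0, 2, 3], [0, 0, 0, 1]]
view = [[0, 0, 1, 0], [0, 1, 0, 0], [-1, 0, 0, -5], [0, 0, 0, 1]]
projection = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1.5, -2], [0, 0, -1, 0]]

vertex = [1.0, 2.0, 3.0, 1.0]

# Variant 1: one combined matrix built on the CPU, one transform per vertex.
mvp = mat_mul(projection, mat_mul(view, model))
combined = transform(mvp, vertex)

# Variant 2: the compromise (model->world, world->clip), two transforms per vertex.
view_projection = mat_mul(projection, view)
two_step = transform(view_projection, transform(model, vertex))

# Variant 3: three separate matrices, three transforms per vertex.
three_step = transform(projection, transform(view, transform(model, vertex)))

print(combined, two_step, three_step)
```

In variant 2 the view_projection product only has to be recomputed when the camera changes, so the per-object CPU work shrinks to building the model matrix.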

starmole answered Dec 01 '22 13:12


Don't forget that the vertex shader runs per vertex, so you end up doing the multiplication hundreds of thousands of times per frame, while the AS3 version only does the multiplication once per object per frame.

As with every performance problem:

Optimize stuff that runs often and ignore the things that run only now and then.
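
The trade-off can be put into rough numbers. The sketch below (Python, with a hypothetical helper name and made-up mesh sizes) counts scalar multiplications only: a 4x4 by 4x4 product costs 64, and one m44 on a vertex costs 16. It ignores that the GPU processes vertices in parallel and that uploading constants has its own cost, so it says nothing about actual frame time.

```python
# Rough operation count, not a benchmark. All numbers are illustrative assumptions.

MATRIX_MATRIX_MULS = 4 * 4 * 4  # 64 scalar multiplies for one 4x4 * 4x4 product
MATRIX_VECTOR_MULS = 4 * 4      # 16 scalar multiplies for one m44 on a vertex

def multiplies_per_frame(objects, vertices_per_object, matrices_in_shader):
    """Count scalar multiplies per frame and return them as (cpu, gpu).

    matrices_in_shader is 1 (pre-multiplied MVP), 2 (model + view-projection)
    or 3 (model, view and projection kept separate).
    """
    if matrices_in_shader == 1:
        # Two matrix products per object: model * view, then * projection.
        cpu = objects * 2 * MATRIX_MATRIX_MULS
    elif matrices_in_shader == 2:
        # One view * projection product, shared by every object in the frame.
        cpu = MATRIX_MATRIX_MULS
    else:
        # Nothing is combined on the CPU.
        cpu = 0
    gpu = objects * vertices_per_object * matrices_in_shader * MATRIX_VECTOR_MULS
    return cpu, gpu

for n in (1, 2, 3):
    cpu, gpu = multiplies_per_frame(objects=100, vertices_per_object=1000,
                                    matrices_in_shader=n)
    print(n, "matrices in shader: CPU", cpu, "GPU", gpu)
```

With 100 objects of 1,000 vertices each, going from one to three matrices in the shader removes 12,800 multiplies from the CPU side and adds 3.2 million on the GPU. Whether that is a win depends on how slow the AS3 multiplies are compared with the GPU's spare vertex throughput, so it has to be measured.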

Andreas answered Dec 01 '22 13:12