 

Why Do I Need to Convert Quaternion to 4x4 Matrix When Uploading to the Shaders?

I have read several tutorials about skeletal animation in OpenGL. They all seem single-minded about using quaternions for rotation and 3D vectors for translation, rather than matrices.

But when they get to the vertex skinning step, they combine all of the quaternions and 3D vectors into 4x4 matrices and upload those matrices so the rest of the calculations can be done in shaders. A 4x4 matrix has 16 elements, while a quaternion plus a 3D vector has only 7. So why are we converting these to 4x4 matrices before uploading?

deniz asked Mar 29 '13 13:03





2 Answers

Because with only two 4×4 matrices, one for each bone a vertex is assigned and weighted to, you need only two 4-vector by 4×4-matrix multiplications and a weighted sum.
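A minimal sketch of that two-matrix blend, written in plain Python rather than shader code so it can be run anywhere. The names `mat_vec` and `skin_vertex`, and the two example bone matrices, are illustrative assumptions and not taken from any tutorial or engine.

```python
def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def skin_vertex(position, bone_a, bone_b, weight_a):
    """Blend one vertex between two bone matrices (weights sum to 1)."""
    p = [position[0], position[1], position[2], 1.0]  # homogeneous point
    pa = mat_vec(bone_a, p)  # first 4-vector by 4x4-matrix multiplication
    pb = mat_vec(bone_b, p)  # second 4-vector by 4x4-matrix multiplication
    weight_b = 1.0 - weight_a
    blended = [weight_a * a + weight_b * b for a, b in zip(pa, pb)]
    return blended[:3]


# Hypothetical bones: one at rest, one shifted 2 units along x.
identity = [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]
shift_x = [[1.0, 0.0, 0.0, 2.0],
           [0.0, 1.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 0.0],
           [0.0, 0.0, 0.0, 1.0]]

skinned = skin_vertex([1.0, 0.0, 0.0], identity, shift_x, 0.5)
```

With equal weights the vertex lands halfway between the two bone results, which is all the per-vertex work this approach needs.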

In contrast, if you submitted a separate quaternion and translation for each bone, you'd have to do the equivalent of two 3-vector by 3×3-matrix multiplications plus four 3-vector additions and a weighted sum. Whether you first convert the quaternion into a rotation matrix and then do a 3-vector by 3×3-matrix multiplication, or rotate the 3-vector by the quaternion directly, the computational effort is about the same. And after that you still have to postmultiply with the modelview matrix.
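For comparison, here is a rough sketch of the per-vertex work when each bone arrives as a quaternion plus a translation. It is plain Python standing in for shader code; quaternions are assumed to be unit length and stored as `[w, x, y, z]`, and `rotate_by_quaternion` and `skin_vertex_quat` are hypothetical names.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def rotate_by_quaternion(q, v):
    """Rotate 3-vector v by unit quaternion q = [w, x, y, z]."""
    w, qv = q[0], q[1:]
    t = [2.0 * c for c in cross(qv, v)]
    u = cross(qv, t)
    return [v[i] + w * t[i] + u[i] for i in range(3)]


def skin_vertex_quat(position, bones, weights):
    """Rotate, translate, and blend per bone; all of this runs per vertex."""
    out = [0.0, 0.0, 0.0]
    for (q, translation), weight in zip(bones, weights):
        rotated = rotate_by_quaternion(q, position)
        for i in range(3):
            out[i] += weight * (rotated[i] + translation[i])
    return out


# Hypothetical bones: one at rest, one turned 90 degrees about z and moved.
half_angle = math.radians(90.0) / 2.0
rest = ([1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0])
turned = ([math.cos(half_angle), 0.0, 0.0, math.sin(half_angle)],
          [1.0, 2.0, 3.0])

skinned = skin_vertex_quat([1.0, 0.0, 0.0], [rest, turned], [0.5, 0.5])
```

The result still has to be multiplied by the modelview matrix afterwards, which this sketch leaves out.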

It's perfectly possible to use a 4-element vector uniform as a quaternion, but then you have to chain a lot of computations in the vertex shader: first rotate the vertex by the two quaternions, then translate it, and then multiply it with the modelview matrix. By simply uploading two transformation matrices, which are weighted in the shader, you save a lot of computations on the GPU. Doing the quaternion-to-matrix conversion on the CPU performs the calculation only once per bone, whereas doing it in the shader performs it for every single vertex. GPUs are great if you have to do a lot of identical computations with varying input data. But they suck if you have to calculate only a handful of values that are reused over large amounts of data. CPUs, however, love this kind of task.
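As a sketch of the "once per bone on the CPU" step, the following builds a 4×4 matrix from a unit quaternion and a translation, then reuses it for several vertices. It assumes a `(w, x, y, z)` quaternion layout and column-vector convention (translation in the last column); `bone_matrix` and `transform_point` are made-up names for illustration.

```python
import math


def bone_matrix(q, t):
    """Build a 4x4 matrix (list of rows) from unit quaternion q and translation t."""
    w, x, y, z = q
    tx, ty, tz = t
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y), tx],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x), ty],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y), tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform_point(m, p):
    """Apply the matrix to a 3D point, treating it as (x, y, z, 1)."""
    return [m[r][0] * p[0] + m[r][1] * p[1] + m[r][2] * p[2] + m[r][3]
            for r in range(3)]


# Converted once per bone: a 90 degree turn about z plus a translation.
half_angle = math.radians(90.0) / 2.0
q = (math.cos(half_angle), 0.0, 0.0, math.sin(half_angle))
matrix = bone_matrix(q, (1.0, 2.0, 3.0))

# Reused for every vertex influenced by that bone.
vertices = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
moved = [transform_point(matrix, v) for v in vertices]
```

The conversion cost is paid a single time, while each vertex only pays for the matrix multiply.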

The nice thing about homogeneous transformations represented by 4×4 matrices is that a single matrix can contain a whole transformation chain. If you separate rotations and translations, you have to perform the whole chain of operations in order. With only one rotation and one translation, that's fewer operations than a single 4×4 matrix transform. Add one more transformation and you've reached the break-even point.
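To illustrate the "whole chain in one matrix" point, this small sketch multiplies a rotation, a bone translation, and a stand-in modelview translation into one matrix and checks that it gives the same result as applying the steps in order. All matrices and helper names here are assumptions chosen for the example.

```python
def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as lists of rows."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def mat_vec(m, v):
    """Multiply a 4x4 matrix by a 4-vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def translation(tx, ty, tz):
    """Homogeneous translation matrix."""
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]


# 90 degree rotation about z, written out exactly.
rotate_z = [[0.0, -1.0, 0.0, 0.0],
            [1.0, 0.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]
bone_offset = translation(1.0, 0.0, 0.0)
model_view = translation(0.0, 0.0, -5.0)  # stand-in for a real modelview

point = [1.0, 0.0, 0.0, 1.0]

# The chain applied step by step, in order.
step_by_step = mat_vec(model_view, mat_vec(bone_offset, mat_vec(rotate_z, point)))

# The same chain folded into one matrix ahead of time.
combined = mat_mul(model_view, mat_mul(bone_offset, rotate_z))
one_step = mat_vec(combined, point)
```

However long the chain gets, the per-vertex cost of `one_step` stays at a single matrix multiply.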

The transformation matrices, even in a skeletal pose applied to a mesh, are identical for all vertices. Say the mesh has 100 vertices around a pair of bones (this is a small number, BTW); then you'd have to do the computations outlined above for each and every vertex, wasting precious GPU computation cycles. And for what? To determine some 32 scalar values (or 8 4-vectors). Now compare this: 100 4-vectors (if you only consider vertex position) vs. only 8. This is the order of magnitude of calculation overhead imposed by processing quaternion poses in the shader. Compute it once on the CPU and give it to the GPU precalculated to share among the primitives. If you code it right, the whole calculation of a single matrix column will nicely fit into the CPU's pipeline, making it vastly outperform every attempt at parallelizing it. Parallelization doesn't come for free!

datenwolf answered Oct 11 '22 00:10



On modern GPUs there is no restriction on what data format you upload to constant buffers.

Of course you need to write your vertex shader differently in order to use quaternions for skinning instead of matrices. In fact, we are using dual quaternion skinning in our engine.
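For readers unfamiliar with the term, here is a rough sketch of the idea behind dual quaternion skinning in plain Python. It is not the engine code mentioned above, just one common formulation: each bone is a pair (real, dual) of `(w, x, y, z)` quaternions, the pairs are blended linearly, and the result is normalized before transforming the point. All function names are hypothetical.

```python
import math


def quat_mul(a, b):
    """Hamilton product of two quaternions stored as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)


def quat_conj(q):
    return (q[0], -q[1], -q[2], -q[3])


def make_dual_quat(rotation, translation):
    """Pack a unit rotation quaternion and a translation into (real, dual)."""
    t = (0.0,) + tuple(translation)
    dual = tuple(0.5 * c for c in quat_mul(t, rotation))
    return tuple(rotation), dual


def blend_dual_quats(dual_quats, weights):
    """Weighted sum of dual quaternions, normalized by the real part."""
    real, dual = [0.0] * 4, [0.0] * 4
    pivot = dual_quats[0][0]
    for (r, d), w in zip(dual_quats, weights):
        # Flip the sign when a rotation lies on the opposite hemisphere.
        if sum(a * b for a, b in zip(r, pivot)) < 0.0:
            w = -w
        for i in range(4):
            real[i] += w * r[i]
            dual[i] += w * d[i]
    norm = math.sqrt(sum(c * c for c in real))
    return tuple(c / norm for c in real), tuple(c / norm for c in dual)


def transform_point(dq, p):
    """Rotate p by the real part, then add the translation from the dual part."""
    real, dual = dq
    rotated = quat_mul(quat_mul(real, (0.0,) + tuple(p)), quat_conj(real))
    trans = quat_mul(dual, quat_conj(real))
    return [rotated[i + 1] + 2.0 * trans[i + 1] for i in range(3)]


half_angle = math.radians(90.0) / 2.0
turn_z = (math.cos(half_angle), 0.0, 0.0, math.sin(half_angle))
no_turn = (1.0, 0.0, 0.0, 0.0)

rigid = make_dual_quat(turn_z, (1.0, 2.0, 3.0))
rest = make_dual_quat(no_turn, (0.0, 0.0, 0.0))
shifted = make_dual_quat(no_turn, (2.0, 0.0, 0.0))

rigid_point = transform_point(blend_dual_quats([rigid], [1.0]), (1.0, 0.0, 0.0))
halfway = transform_point(blend_dual_quats([rest, shifted], [0.5, 0.5]),
                          (1.0, 0.0, 0.0))
```

Each bone costs 8 floats here instead of 16, at the price of more arithmetic per vertex than the matrix version.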

Note that older fixed-function hardware skinning did indeed work only with matrices, but that was a long time ago.

Axel Gneiting answered Oct 10 '22 22:10
