Cache performance of vectors, matrices and quaternions

On a number of occasions I've come across C and C++ code that uses the following layout for these structures:

class Vector3
{
    float components[3];
    //etc.
};

class Matrix4x4
{
    float components[16];
    //etc.
};

class Quaternion
{
    float components[4];
    //etc.
};

My question is: will this lead to any better cache performance than, say, this:

class Quaternion
{
    float x;
    float y;
    float z;
    //etc.
};

...since I'd assume the class members and functions are in contiguous memory anyway? I currently use the latter form because I find it more convenient (though I can also see the practical sense in the array form, since it lets you treat axes as arbitrary, depending on the operation being performed).
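To illustrate what treating axes as arbitrary can look like with the array form, here is a minimal sketch. `Vec3Array` and `dominantAxis` are hypothetical names made up for this example (they are not from any library), and the code assumes a C++14 compiler because of the loop inside a `constexpr` function.

```cpp
#include <cstddef>

// Hypothetical array-backed vector, mirroring the array form in the question.
struct Vec3Array {
    float components[3];
};

// Returns the index (0 = x, 1 = y, 2 = z) of the component with the
// largest magnitude. The axis is just a loop variable, so the same code
// handles every axis without naming x, y or z.
constexpr std::size_t dominantAxis(const Vec3Array& v) {
    std::size_t best = 0;
    for (std::size_t axis = 1; axis < 3; ++axis) {
        const float a = v.components[axis] < 0 ? -v.components[axis] : v.components[axis];
        const float b = v.components[best] < 0 ? -v.components[best] : v.components[best];
        if (a > b) {
            best = axis;
        }
    }
    return best;
}
```

With named `x`, `y`, `z` members the same function would need a `switch` or three separate comparisons, which is the convenience trade-off described above.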


After taking some advice from the respondents, I tested the difference, and the array version is actually slower: I get about a 3% difference in framerate. I implemented operator[] to wrap the array access inside Vector3. I'm not sure whether that has anything to do with it, but I doubt it, since it should be inlined anyway. The only other factor I could see was that I could no longer use a constructor initializer list for Vector3(x, y, z). However, when I took the original version and changed it to no longer use constructor initializer lists, it ran only very marginally slower than before (less than 0.05%). No clue why, but at least now I know the original approach was faster.

Engineer asked Nov 07 '11 14:11



1 Answer

These declarations are not equivalent with respect to memory layout.

class Quaternion
{
    float components[4];
    //etc.
};

The above guarantees that the elements are contiguous in memory, whereas if they are individual members, as in your last example, the compiler is allowed to insert padding between them (for instance, to align the members to certain address boundaries).

Whether this results in better or worse performance depends mostly on your compiler, so you'd have to profile it.
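If you want to see what your own compiler does with the named-member layout, a check along these lines may help. It is only a sketch: `QuaternionArray`, `QuaternionNamed` and `namedMembersArePacked` are hypothetical names, the fourth member `w` is an assumption (the question hides it behind `//etc.`), and `struct` is used so the members are public and `offsetof` can reach them.

```cpp
#include <cstddef>      // offsetof, std::size_t
#include <type_traits>  // std::is_standard_layout

// Hypothetical stand-ins for the two layouts discussed above.
struct QuaternionArray { float components[4]; };
struct QuaternionNamed { float x; float y; float z; float w; };

// offsetof is only well-defined for standard-layout types.
static_assert(std::is_standard_layout<QuaternionNamed>::value,
              "offsetof requires a standard-layout type");

// True when the named members sit back to back with no padding, so that
// both layouts occupy the same number of bytes.
constexpr bool namedMembersArePacked() {
    return offsetof(QuaternionNamed, y) == 1 * sizeof(float)
        && offsetof(QuaternionNamed, z) == 2 * sizeof(float)
        && offsetof(QuaternionNamed, w) == 3 * sizeof(float)
        && sizeof(QuaternionNamed) == sizeof(QuaternionArray);
}
```

On mainstream desktop compilers I would expect this to evaluate to true for plain `float` members, but the standard does not promise it, so treat the result as a property of your particular toolchain and target.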

Björn Pollex answered Sep 21 '22 14:09
