In these slides (after slide 15) it is suggested to use
void updateAims(float* aimDir, const AimingData* aim, vec3 target, uint count)
{
    for(uint i = 0; i < count; i++)
    {
        aimDir[i] = dot3(aim->positions[i], target) * aim->mod[i];
    }
}
because it's more cache efficient.
What if I have a class like this:
class Bot
{
public:
    vec3 position;
    float mod;
    float aimDir;

    void UpdateAim(vec3 target)
    {
        aimDir = dot3(position, target) * mod;
    }
};

void updateBots(Bot* pBots, uint count, vec3 target)
{
    // pBots points to a contiguous array of Bot objects, not pointers
    for(uint i = 0; i < count; i++)
        pBots[i].UpdateAim(target);
}
and I store all objects of that class in a single linear array?
Since they're all in the same array, will there still be cache misses? Why would the first approach be better?
Modern cache architectures are usually structured as lines of data, each large enough to hold several words; 64 bytes is a typical line size. When you try to read data that's not in the cache, a whole line is fetched, not just the word that you need. When writing, data in the cache is updated if it's there, but typically does not need to be fetched if it's not there.
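To put a number on "a whole line is fetched", here is a minimal sketch that counts how many lines a contiguous buffer spans. The 64-byte line size and the name `cacheLinesTouched` are assumptions for illustration, and the buffer is assumed to start on a line boundary.

```cpp
#include <cassert>
#include <cstddef>

// Assumed line size; 64 bytes is typical but not universal.
constexpr std::size_t kCacheLineBytes = 64;

// Number of cache lines spanned by `bytes` contiguous bytes,
// assuming the buffer starts exactly on a line boundary.
inline std::size_t cacheLinesTouched(std::size_t bytes,
                                     std::size_t lineBytes = kCacheLineBytes)
{
    assert(lineBytes > 0);
    return (bytes + lineBytes - 1) / lineBytes;  // round up
}
```

Under these assumptions, reading a single 4-byte float still costs one full line, while a tightly packed array of 16 floats costs that same single line.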
In the first case, for every cache line of input data that's fetched, you'll use every single word of it. In the second, you'll only use some of the structure fields; fetching the others has wasted some bandwidth.
Specifically, you're fetching the old value of aimDir each time, which isn't needed for the calculation. In general, the "object" is likely to contain more fields, which you don't want for this particular calculation, wasting even more bandwidth as they are fetched into the cache and then ignored.
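The two layouts can be put side by side as a rough sketch. The definitions of `vec3`, `dot3` and `AimingData` below are assumed stand-ins (the question does not show them), and `std::size_t` replaces `uint` to keep the example standard C++.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the types used in the question.
struct vec3 { float x, y, z; };

inline float dot3(const vec3& a, const vec3& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Array-of-structures: inputs and output are interleaved per bot.
struct Bot
{
    vec3  position;  // read by the loop
    float mod;       // read by the loop
    float aimDir;    // only written; its old value is never used
};

// Structure-of-arrays: each input field lives in its own packed array.
struct AimingData
{
    const vec3*  positions;
    const float* mod;
};

// Every byte pulled in from positions[] and mod[] is an input.
void updateAims(float* aimDir, const AimingData* aim, vec3 target, std::size_t count)
{
    assert(aimDir && aim);
    for (std::size_t i = 0; i < count; i++)
        aimDir[i] = dot3(aim->positions[i], target) * aim->mod[i];
}

// Each Bot drags its old aimDir (and any other fields) through the cache.
void updateBots(Bot* bots, std::size_t count, vec3 target)
{
    for (std::size_t i = 0; i < count; i++)
        bots[i].aimDir = dot3(bots[i].position, target) * bots[i].mod;
}

// Input bytes the loop needs per bot versus bytes each Bot occupies.
constexpr std::size_t kInputBytesPerBot = sizeof(vec3) + sizeof(float);
constexpr std::size_t kBytesPerBot      = sizeof(Bot);
```

With 4-byte floats and no padding, `kInputBytesPerBot` is 16 and `kBytesPerBot` is 20, so the loss for this exact class is modest. The gap widens with every extra field a real `Bot` carries that this loop does not read.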