For a program to be cache efficient the data used should be stored linearly right?
So instead of dynamic allocation I put my data in a blob using a linear allocator. Is this enought to improve performace? what should I do to improve cache efficiency even more?
I know that this questions arent specific but I don't know how to explain it...
Which programs can help me profile cache hits/misses?
If your looking for a profiler for windows, you can try AMD's CodeAnalyst or VerySleepy, both of these are free, AMDs is the more powerful of the two however( and works on intel hardware, but iirc you can't use the hardware based profiling stuff), it includes monitoring of things like branch prediction misses and cache utilization. Profiling is great, as it tells you what to optimize, but you don't always know how, for that, you should have a look at Agner Fog's optimization manuals combined with Intel's optimization manual (which contains a lot on locality and cachability optimizations)
If you're on Linux you could use Valgrind(specifically cachegrind tool).
If you're on Windows then VS2010(2008) Professional edition has a builtin profiler but I don't know any details about it's cache profiling facilities. There is also the Intel VTune Analyzer(Amplifier). Both of them are commercial products, although I think you can get 30 days evaluation copies.
Some other questions on SO that might be of help:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With