I know that I can use gprof to benchmark my code.
However, I have this problem -- I have a smart pointer that has an extra level of indirection (think of it as a proxy object).
As a result, I have this extra layer that affects pretty much all functions and screws with caching.
Is there a way to measure the time my CPU wastes due to cache misses?
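You could try cachegrind (run your program under valgrind --tool=cachegrind) and its front-end, KCachegrind. Note that cachegrind simulates the cache rather than reading hardware counters, so its numbers are an estimate.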
Linux supports this with perf from kernel 2.6.31 on. This allows you to do the following:
perf record -e LLC-loads,LLC-load-misses yourExecutable
perf report
In the report, select the LLC-load-misses line, then annotate. You should see the lines (in assembly code, surrounded by the original source code) and a number indicating what fraction of the last-level cache misses each line accounts for. For the source lines to appear, the executable has to be built with debug information (for example, with -g).