I'm using perf to get an idea of the overhead each function of my program imposes on the total execution time. For that, I use the cpu-cycles event:
perf record -e cpu-cycles -c 10000 <binary-with-arguments>
When I look at the output, I see some percentages associated with each function. But what doesn't make sense to me is a case like this: function A is called within function B and nowhere else. But the overhead percentage I get for function A is higher than B. If B calls A, that means B should include A's overhead. Or am I missing something here?
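The perf command you are using only samples your program; it does not record any call-stack information. With perf report you therefore get the number of samples that fall inside each function, independent of their calling relations. Samples taken while A is executing are attributed to A alone and are not added to B, so A can show a higher percentage than B.
You can use the --call-graph option of perf record to capture call stacks, so that perf report can show a call tree: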
perf record -e cpu-cycles --call-graph dwarf -c 10000 <binary-with-arguments>
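To make the difference between the two kinds of attribution concrete, here is a small Python sketch. It is not perf output and does not use any perf API: the sample list, the function names and the helpers self_percent and inclusive_percent are all made up for illustration. It only shows how the same set of samples gives different percentages depending on whether call stacks are available.

```python
from collections import Counter

# Hypothetical samples: each entry is the call stack at the moment a sample
# was taken, outermost caller first. The counts are invented for illustration.
samples = (
    [("main", "B", "A")] * 6   # sampled while executing A, called from B
    + [("main", "B")] * 3      # sampled while executing B's own code
    + [("main",)] * 1          # sampled in main itself
)

def self_percent(samples):
    """Attribute each sample only to the innermost function (no call stacks)."""
    counts = Counter(stack[-1] for stack in samples)
    return {fn: 100.0 * n / len(samples) for fn, n in counts.items()}

def inclusive_percent(samples):
    """Attribute each sample to every function on its stack (needs call stacks)."""
    counts = Counter()
    for stack in samples:
        counts.update(set(stack))  # count a function once per sample, even if recursive
    return {fn: 100.0 * n / len(samples) for fn, n in counts.items()}

self_pct = self_percent(samples)
incl_pct = inclusive_percent(samples)
for fn in ("main", "B", "A"):
    print(f"{fn:5s} self={self_pct[fn]:5.1f}%  inclusive={incl_pct[fn]:5.1f}%")
```

With these invented samples, A has a higher self percentage than B, which is the situation described in the question, while B's inclusive percentage is at least as high as A's. When call stacks are recorded, perf report shows both views side by side in its Children and Self columns.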