When I use perf record on my code, I find three choices for the --call-graph option: lbr (last branch record), dwarf and fp.
What is the difference between these?
perf record is used to sample events. perf report displays a report that was previously created with perf record. perf annotate displays a report file and an annotated version of the executed code; if debug symbols are installed, you will also see the source code.
perf report can auto-detect whether a perf.data file contains branch stacks, and it will automatically switch to the branch view mode unless --no-branch-stack is used. The --branch-history option adds the addresses of sampled taken branches to the call stack, which makes it possible to examine the path the program took to each sample.
The perf command is used as a primary interface to the Linux kernel performance monitoring capabilities and can record CPU performance counters and trace points.
The --call-graph option refers to the collection of call graphs (call chains), i.e. the function stack for each sample.
The default, fp, uses frame pointers. This is very efficient but can be unreliable, particularly for optimized code, where compilers often omit the frame pointer. By explicitly compiling with -fno-omit-frame-pointer, you can ensure that frame pointers are available for your code. Nevertheless, the results for libraries may vary, since they may have been built without frame pointers.
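As a rough sketch of the frame-pointer workflow, the helper below compiles a C source file with frame pointers kept and then samples it with frame-pointer unwinding. The function name build_and_record_fp and the file names in the usage line are placeholders, and gcc is assumed as the compiler; only the -fno-omit-frame-pointer flag and the perf record --call-graph fp option come from the discussion above.

```shell
# Hypothetical helper (name is an assumption, not part of perf).
# $1 = C source file, $2 = output binary.
build_and_record_fp() {
    # Keep frame pointers even at -O2 so that fp unwinding can follow the stack.
    gcc -O2 -g -fno-omit-frame-pointer -o "$2" "$1" &&
    # Sample the binary, collecting call chains via frame pointers.
    perf record --call-graph fp -- "$2"
}

# Example usage (placeholder file names):
#   build_and_record_fp app.c ./app
#   perf report
```

Libraries that were built without frame pointers are not affected by this flag, so call chains that pass through them may still be truncated.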
With dwarf, perf actually collects and stores part of the stack memory itself and unwinds it in post-processing. This can be very resource-consuming and may limit the stack depth. The default stack memory chunk is 8 KiB, but this can be configured.
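The stack dump size can be passed after a comma in the --call-graph argument. The sketch below wraps that in a small helper; the name record_dwarf and the 16384-byte value in the usage line are illustrative assumptions, not recommendations.

```shell
# Hypothetical helper (name is an assumption, not part of perf).
# $1 = stack dump size in bytes, remaining arguments = command to profile.
record_dwarf() {
    size="$1"
    shift
    # A larger dump captures deeper stacks but makes perf.data grow faster.
    perf record --call-graph "dwarf,$size" -- "$@"
}

# Example usage (placeholder binary name):
#   record_dwarf 16384 ./app
```

A larger size trades recording overhead and file size for deeper call chains, so it is worth increasing only when the reported stacks look cut off.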
lbr stands for last branch records. This is a hardware mechanism supported by Intel CPUs. It will probably offer the best performance, at the cost of portability. lbr is also limited to userspace functions.
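Because lbr is not available everywhere, one possible approach is to probe for it first and fall back to fp. This is only a sketch: the helper name record_lbr_or_fp is made up, and using a trivial "true" workload as the probe is an assumption about how to test for support, not an official detection method.

```shell
# Hypothetical helper (name is an assumption, not part of perf).
# Arguments = command to profile.
record_lbr_or_fp() {
    # Probe in a scratch directory so an existing perf.data is not touched.
    dir="$(mktemp -d)"
    if perf record -q --call-graph lbr -o "$dir/probe.data" -- true >/dev/null 2>&1; then
        mode=lbr
    else
        mode=fp   # lbr unsupported here (e.g. non-Intel CPU or some VMs)
    fi
    rm -rf "$dir"
    perf record --call-graph "$mode" -- "$@"
}

# Example usage (placeholder binary name):
#   record_lbr_or_fp ./app
```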