Linux's perf utility is famously used by Brendan Gregg to generate flamegraphs for C/C++, JVM code, Node.js code, etc.
Does the Linux kernel natively understand stack traces? Where can I read more about how a tool is able to introspect into stack traces of processes, even if processes are written in completely different languages?
There is a short introduction to stack traces in perf by Gregg: http://www.brendangregg.com/perf.html
4.4 Stack Traces
Always compile with frame pointers. Omitting frame pointers is an evil compiler optimization that breaks debuggers, and sadly, is often the default. Without them, you may see incomplete stacks from perf_events ... There are two ways to fix this: either using dwarf data to unwind the stack, or returning the frame pointers.
Dwarf
Since about the 3.9 kernel, perf_events has supported a workaround for missing frame pointers in user-level stacks: libunwind, which uses dwarf. This can be enabled using "-g dwarf". ... compiler optimizations (-O2), which in this case has omitted the frame pointer. ... recompiling .. with -fno-omit-frame-pointer
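To make the frame-pointer point concrete, here is a minimal sketch in C of how a frame-pointer unwinder follows the chain. It does not touch the real machine stack: the stack is simulated as an array of words, and the layout (saved caller frame pointer at `stack[fp]`, return address at `stack[fp + 1]`, index 0 as the end marker) is an assumption chosen to resemble x86-64 with frame pointers. The function name `walk_frame_pointers` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Walk a chain of frame records in a simulated stack.
 * Assumed layout: stack[fp] holds the caller's frame pointer (0 ends the
 * chain) and stack[fp + 1] holds the return address of that frame.
 * Returns the number of return addresses written to out. */
size_t walk_frame_pointers(const uint64_t *stack, size_t stack_words,
                           size_t fp, uint64_t *out, size_t max_depth)
{
    assert(stack != NULL && out != NULL);

    size_t depth = 0;
    while (depth < max_depth && fp != 0 && fp + 1 < stack_words) {
        out[depth++] = stack[fp + 1];     /* return address of this frame */

        size_t next = (size_t)stack[fp];  /* saved frame pointer of the caller */
        /* Caller frames must live at higher addresses; anything else means
         * the chain is corrupt (or frame pointers were omitted), so stop. */
        if (next != 0 && next <= fp)
            break;
        fp = next;
    }
    return depth;
}
```

When the compiler omits frame pointers, the word at `stack[fp]` is just some unrelated value, which is why the sanity check cuts the walk short and the profiler shows an incomplete stack.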
Non C-style languages may have different frame format, or may omit frame pointers too:
4.3. JIT Symbols (Java, Node.js)
Programs that have virtual machines (VMs), like Java's JVM and node's v8, execute their own virtual processor, which has its own way of executing functions and managing stacks. If you profile these using perf_events, you'll see symbols for the VM engine .. perf_events has JIT support to solve this, which requires the VM to maintain a /tmp/perf-PID.map file for symbol translation. Note that Java may not show full stacks to begin with, due to hotspot on x86 omitting the frame pointer (just like gcc). On newer versions (JDK 8u60+), you can use the -XX:+PreserveFramePointer option to fix this behavior, ...
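The symbol translation through the map file can be sketched as follows. Each line of a perf map file has the form `START SIZE symbolname`, with START and SIZE in hexadecimal. The C function below resolves an address against map text held in memory; the function name `perf_map_lookup` and the fixed buffer sizes are assumptions for illustration, and this is not how perf itself is implemented.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Resolve addr against perf map text ("START SIZE symbolname" per line,
 * START and SIZE in hex). Returns 1 and copies the symbol into name on a
 * hit, 0 otherwise. Lines longer than the local buffer are skipped. */
int perf_map_lookup(const char *map_text, uint64_t addr,
                    char *name, size_t name_len)
{
    assert(map_text != NULL && name != NULL && name_len > 0);

    const char *line = map_text;
    while (*line != '\0') {
        size_t len = strcspn(line, "\n");
        char buf[512];

        if (len < sizeof buf) {
            unsigned long long start, size;
            char sym[256];

            memcpy(buf, line, len);
            buf[len] = '\0';
            /* %255[^\n] keeps symbol names that contain spaces. */
            if (sscanf(buf, "%llx %llx %255[^\n]", &start, &size, sym) == 3 &&
                addr >= start && addr - start < size) {
                snprintf(name, name_len, "%s", sym);
                return 1;
            }
        }
        line += len;
        if (*line == '\n')
            line++;
    }
    return 0;
}
```

A real consumer would read the text from /tmp/perf-PID.map for the profiled process and would probably sort the entries for binary search rather than scanning linearly.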
Gregg's blog post about Java and stack traces: http://techblog.netflix.com/2015/07/java-in-flames.html (see "Fixing Frame Pointers": fixed in some JDK8 versions and in JDK9 by adding an option at program start)
Now, your questions:
How does linux's perf utility understand stack traces?
The perf utility basically (in early versions) just parses data returned from the Linux kernel's "perf_events" subsystem (sometimes just called "events"), accessed with the perf_event_open syscall. For call stack traces there are the sample_type options PERF_SAMPLE_CALLCHAIN / PERF_SAMPLE_STACK_USER:
sample_type
PERF_SAMPLE_CALLCHAIN
Records the callchain (stack backtrace).
PERF_SAMPLE_STACK_USER (since Linux 3.7)
Records the user level stack, allowing stack unwinding.
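As a rough sketch of how a tool requests callchains, the C code below fills a perf_event_attr with PERF_SAMPLE_CALLCHAIN and passes it to perf_event_open. The helper names `make_callchain_attr` and `open_callchain_sampler`, the choice of the software CPU clock event, and the user-space-only setting are my assumptions, not what perf does internally. Reading the samples back (mmap of the ring buffer and parsing the records) is not shown.

```c
#include <assert.h>
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Build an attr that asks the kernel to record a callchain with each sample. */
struct perf_event_attr make_callchain_attr(unsigned long long sample_period)
{
    assert(sample_period > 0);

    struct perf_event_attr attr;
    memset(&attr, 0, sizeof attr);
    attr.size = sizeof attr;
    attr.type = PERF_TYPE_SOFTWARE;          /* assumption: software clock event */
    attr.config = PERF_COUNT_SW_CPU_CLOCK;
    attr.sample_period = sample_period;
    attr.sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_TID | PERF_SAMPLE_CALLCHAIN;
    attr.disabled = 1;                       /* enable later via ioctl */
    attr.exclude_kernel = 1;                 /* user-space frames only */
    return attr;
}

/* Open the event for one process on any CPU.
 * Returns a file descriptor, or -1 with errno set (for example when
 * /proc/sys/kernel/perf_event_paranoid forbids it). */
int open_callchain_sampler(pid_t pid, unsigned long long sample_period)
{
    struct perf_event_attr attr = make_callchain_attr(sample_period);

    /* glibc provides no wrapper for perf_event_open, so use syscall(2). */
    return (int)syscall(SYS_perf_event_open, &attr, pid,
                        -1 /* any CPU */, -1 /* no group */, 0UL);
}
```

Whether the returned callchain is complete still depends on the per-architecture unwinder in the kernel and on the profiled program keeping frame pointers.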
Does the Linux kernel natively understand stack traces?
It may or may not, depending on your CPU architecture and on whether support has been implemented for it. The callchain sampling function (which reads the call stack of a live process) is defined in the architecture-independent part of the kernel as __weak with an empty body:
http://lxr.free-electrons.com/source/kernel/events/callchain.c?v=4.4#L26
__weak void perf_callchain_kernel(struct perf_callchain_entry *entry,
                                  struct pt_regs *regs)
{
}

__weak void perf_callchain_user(struct perf_callchain_entry *entry,
                                struct pt_regs *regs)
{
}
In the 4.4 kernel, the user-space callchain sampler is overridden in the architecture-dependent part of the kernel for x86/x86_64, ARC, SPARC, ARM/ARM64, Xtensa, Tilera TILE, PowerPC and Imagination Meta:
http://lxr.free-electrons.com/ident?v=4.4;i=perf_callchain_user
arch/x86/kernel/cpu/perf_event.c, line 2279
arch/arc/kernel/perf_event.c, line 72
arch/sparc/kernel/perf_event.c, line 1829
arch/arm/kernel/perf_callchain.c, line 62
arch/xtensa/kernel/perf_event.c, line 339
arch/tile/kernel/perf_event.c, line 995
arch/arm64/kernel/perf_callchain.c, line 109
arch/powerpc/perf/callchain.c, line 490
arch/metag/kernel/perf_callchain.c, line 59
Reading the call chain from the user stack may be non-trivial for some architectures and/or for some modes.
What CPU architecture do you use? What languages and VMs are used?
Where can I read more about how a tool is able to introspect into stack traces of processes, even if processes are written in completely different languages?
You may try gdb and/or debuggers for the language, the backtrace function of libc, or the read-only unwinding support in libunwind (there is a local backtrace example in libunwind, show_backtrace()).
They may have better support for frame parsing, or better integration with the language's virtual machine or with unwind info. If gdb (with its backtrace command) or other debuggers can't get stack traces from a running program, there may be no way of getting a stack trace at all.
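For the libc route, a minimal sketch using glibc's backtrace() from <execinfo.h> could look like this. The wrapper names `capture_backtrace` and `print_backtrace` and the 32-frame limit are assumptions for illustration. It assumes glibc; other C libraries may not provide <execinfo.h>.

```c
#include <assert.h>
#include <execinfo.h>
#include <unistd.h>

/* Capture the return addresses of the current call stack.
 * Returns the number of frames stored in frames (at most max_frames). */
int capture_backtrace(void **frames, int max_frames)
{
    assert(frames != NULL && max_frames > 0);
    return backtrace(frames, max_frames);
}

/* Print the current call stack to stderr without allocating memory. */
void print_backtrace(void)
{
    void *frames[32];
    int n = capture_backtrace(frames, 32);

    /* Function names show up only for exported symbols; linking with
     * -rdynamic usually improves the output. */
    backtrace_symbols_fd(frames, n, STDERR_FILENO);
}
```

If print_backtrace() shows a full stack for your program but perf does not, the unwind information is there and the gap is on the perf side.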
If they can get a call trace but perf can't (even after recompiling with -fno-omit-frame-pointer for C/C++), it may be possible to add support for that combination of architecture and frame format to perf_events and perf.
There are several blogs with some info about generic backtracing problems and solutions:
__builtin_return_address(N) vs glibc's backtrace() vs libunwind's local backtrace
Dwarf support for perf_events/perf