I want to learn linux ebpf vm, if I write a ebpf program test.c, used llvm:
clang -O2 -target bpf -o test.o test.c. How to get the ebpf assembly like tcpdump -d in classic bpf, thanks.
This depends on what you mean exactly by “learn[ing] linux ebpf vm”.
If you mean learning about the instructions of eBPF, the assembly-like language itself, you can have a look at the documentation from the kernel (quite dense) or at this summarized version of the syntax from bcc project.
If you prefer to see how the internals of the eBPF virtual machine work, you can either have a look at various presentations (I recommend those from D. Borkmann), I have a list here in this blog post; or you can directly read at the kernel sources, under linux/kernel/bpf
(in particular file core.c
). Alternatively, there is a simpler userspace implementation available.
Now if you want to see the code that has been compiled from C to eBPF, here are a couple of solutions.
For my part I compile with the command presented in the tc-bpf
man page:
__bcc() {
clang -O2 -emit-llvm -c $1 -o - | \
llc -march=bpf -filetype=obj -o "`basename $1 .c`.o"
}
alias bcc=__bcc
The code is translated into eBPF and stored in one of the sections of the ELF file produced. Then I can examine my program with tools such as objdump
or readelf
. For example, if my program is in the classifier
section:
$ bcc return_zero.c
$ readelf -x classifier return_zero.o
Hex dump of section 'classifier':
0x00000000 b7000000 02000000 95000000 00000000 ................
In the above output, two instructions are displayed (little endian — the first field starting with 0x
is the offset inside the section). We could parse this to put in shape the instructions and to obtain:
b7 0 0 0000 00000002 // Load 0x02 in register r0
95 0 0 0000 00000000 // Exit and return value in r0
It is possible to dump the instructions of programs loaded (and then possibly attached to one of the available BPF hooks) in the kernel, either as eBPF assembly instructions, or as machine instructions if the program has been JIT-compiled. bpftool, relying on libbpf, is the go-to utility for doing such things. For example, one can see what programs are currently loaded, and note their ids, with:
# bpftool prog show
Then dumping the instructions for a program of a given id is as simple as:
# bpftool prog dump xlated id <id>
# bpftool prog dump jited id <id>
for eBPF or JITed (if available) instructions respectively. Output can also be formatted as JSON if necessary.
Depending on the tools you use to inject BPF into the kernel, you can generally dump the output of the in-kernel verifier, that contains most of the instructions formatted in a human-friendly way.
With the bcc set of tools (not directly related to the previous command, and not related at all with the old 16-bit compiler), you can get this by using the relevant flags for the BPF object instance, while with tc filter add dev eth0 bpf obj … verbose
this is done with the verbose
keyword.
The aforementioned userspace implementation (uBPF) has its own assembler and disassembler that might be of interest to you: it takes the “human-friendly” (add32 r0, r1
and the likes) instructions as input and converts into object files, or the other way round, respectively.
But probably more interesting, there is the support for debug info, coming along with a BPF disassembler, in LLVM itself: as of today it has recently been merged, and its author (A. Starovoitov) has sent an email about it on the netdev mailing list. This means that with clang/LLVM 4.0+, you should be able to use llvm-objdump -S -no-show-raw-insn my_file.o
to obtain a nicely formatted output.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With