
How to read, understand, analyze, and debug a Linux kernel panic?

Consider the following Linux kernel stack trace (you can trigger a panic yourself from the kernel source code by calling panic("debugging a Linux kernel panic");):

[<001360ac>] (unwind_backtrace+0x0/0xf8) from [<00147b7c>] (warn_slowpath_common+0x50/0x60)
[<00147b7c>] (warn_slowpath_common+0x50/0x60) from [<00147c40>] (warn_slowpath_null+0x1c/0x24)
[<00147c40>] (warn_slowpath_null+0x1c/0x24) from [<0014de44>] (local_bh_enable_ip+0xa0/0xac)
[<0014de44>] (local_bh_enable_ip+0xa0/0xac) from [<0019594c>] (bdi_register+0xec/0x150)
  • In unwind_backtrace+0x0/0xf8 what does +0x0/0xf8 stand for?
  • How can I see the C code of unwind_backtrace+0x0/0xf8?
  • How to interpret the panic's content?
0x90 asked Nov 20 '12 07:11




2 Answers

Here are two alternatives to addr2line. Assuming you have the proper toolchain for your target, you can do one of the following:

Use objdump:

  1. Locate your vmlinux or the .ko file under the kernel root directory, then disassemble the object file (-S interleaves source lines only if the image was built with debug info):

    objdump -dS vmlinux > /tmp/kernel.s
  2. Open the generated assembly file, /tmp/kernel.s, with a text editor such as vim. Go to unwind_backtrace+0x0/0xf8, i.e. search for the address of unwind_backtrace plus the offset. That is the problematic part of your source code.
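
The "address of the function plus the offset" step can be worked through for one frame of the trace in the question. This is only a sketch: the search patterns assume objdump's usual layout for a 32-bit image (function labels zero-padded to 8 hex digits, instruction lines unpadded and followed by a colon), so check them against your own /tmp/kernel.s.

```python
# One frame from the trace: [<0019594c>] (bdi_register+0xec/0x150)
address = 0x0019594C  # value printed inside [<...>]
offset = 0xEC         # bytes past the start of bdi_register
length = 0x150        # total size of bdi_register in bytes

# The function starts "offset" bytes before the reported address.
function_start = address - offset
function_end = function_start + length  # first byte past the function

# Assumed objdump formatting (verify against your own disassembly).
label_pattern = f"{function_start:08x} <bdi_register>:"
instruction_pattern = f"{address:x}:"

print(label_pattern)        # text to search for to find the function
print(instruction_pattern)  # text to search for to find the instruction
```

Searching the disassembly for the second pattern should land inside the block that starts at the first one.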

Use gdb:

IMO, an even more elegant option is to use the one and only gdb. Assuming you have a suitable toolchain on your host machine:

  1. Run gdb <path-to-vmlinux>.
  2. At the gdb prompt, execute: list *(unwind_backtrace+0x10).
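
The same lookup can be run non-interactively. The helper below is a hypothetical convenience, not part of gdb: it turns a frame string from the trace into a gdb command line using gdb's -batch and -ex options. The binary name gdb and the path vmlinux are assumptions; with a cross toolchain the debugger usually has a target prefix.

```python
import shlex


def gdb_list_command(vmlinux_path, frame, gdb="gdb"):
    """Build a gdb invocation that lists the source around a trace frame.

    frame looks like 'bdi_register+0xec/0x150'; the '/0x150' part is the
    function length and is not part of the gdb expression.
    """
    location, _, _length = frame.partition("/")
    return [gdb, "-batch", "-ex", f"list *({location})", vmlinux_path]


argv = gdb_list_command("vmlinux", "bdi_register+0xec/0x150")
print(shlex.join(argv))  # shell-quoted form, ready to paste into a terminal
```

The list can be passed to subprocess.run as-is, which avoids quoting the parentheses by hand.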

For additional information, you may check out the following resources:

  1. Kernel Debugging Tricks.
  2. Debugging The Linux Kernel Using Gdb
0x90 answered Sep 19 '22 17:09


It's just an ordinary backtrace. The functions are listed in reverse call order: the most recent call comes first, and each function was called by the one listed after it:

unwind_backtrace+0x0/0xf8
warn_slowpath_common+0x50/0x60
warn_slowpath_null+0x1c/0x24
local_bh_enable_ip+0xa0/0xac
bdi_register+0xec/0x150
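
To see the calls in the order they actually happened, reverse the printed list. A minimal sketch using the function names from this trace:

```python
# Frames in the order the kernel printed them (most recent call first).
printed = [
    "unwind_backtrace",
    "warn_slowpath_common",
    "warn_slowpath_null",
    "local_bh_enable_ip",
    "bdi_register",
]

# Reversing gives the order in which the calls were made.
call_order = list(reversed(printed))
print(" -> ".join(call_order))
```

Read left to right, the result starts at bdi_register and ends at unwind_backtrace, the function that produced the dump.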

bdi_register+0xec/0x150 is the symbol plus the offset/length: the address lies 0xec bytes past the start of bdi_register, which is 0x150 bytes long. There's more information about that in Understanding a Kernel Oops and in how you can debug a kernel oops. There's also this excellent tutorial on Debugging the Kernel.
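
As an illustration of the symbol+offset/length notation, here is a small sketch that splits one entry into its three parts. The Frame type and parse_frame helper are hypothetical names introduced for this example only.

```python
import re
from typing import NamedTuple


class Frame(NamedTuple):
    symbol: str  # function name
    offset: int  # bytes from the start of the function
    length: int  # total size of the function in bytes


FRAME_RE = re.compile(
    r"^(?P<symbol>[A-Za-z_][\w.]*)"
    r"\+(?P<offset>0x[0-9a-fA-F]+)"
    r"/(?P<length>0x[0-9a-fA-F]+)$"
)


def parse_frame(text):
    """Parse an entry such as 'bdi_register+0xec/0x150'."""
    match = FRAME_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a symbol+offset/length entry: {text!r}")
    return Frame(
        match["symbol"],
        int(match["offset"], 16),
        int(match["length"], 16),
    )


frame = parse_frame("bdi_register+0xec/0x150")
print(frame.symbol, hex(frame.offset), hex(frame.length))
```

For unwind_backtrace+0x0/0xf8 the offset is 0, i.e. the address is the very first instruction of a function that is 0xf8 bytes long.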

Note: as suggested below by Eugene, you may want to try addr2line first. It still needs an image with debugging symbols, though. For example:

addr2line -e vmlinux_with_debug_info 0019594c(+offset)
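
To resolve every frame at once, the bracketed addresses can be pulled out of the trace and handed to a single addr2line call. This is a sketch under assumptions: vmlinux_with_debug_info is the placeholder image name used above, and -f (also print function names) is added here for convenience.

```python
import re

TRACE = (
    "[<001360ac>] (unwind_backtrace+0x0/0xf8) from "
    "[<00147b7c>] (warn_slowpath_common+0x50/0x60) "
    "[<00147b7c>] (warn_slowpath_common+0x50/0x60) from "
    "[<00147c40>] (warn_slowpath_null+0x1c/0x24) "
    "[<00147c40>] (warn_slowpath_null+0x1c/0x24) from "
    "[<0014de44>] (local_bh_enable_ip+0xa0/0xac) "
    "[<0014de44>] (local_bh_enable_ip+0xa0/0xac) from "
    "[<0019594c>] (bdi_register+0xec/0x150)"
)


def trace_addresses(trace):
    """Return the [<...>] addresses in order of appearance, without repeats."""
    found = re.findall(r"\[<([0-9a-fA-F]+)>\]", trace)
    return list(dict.fromkeys(found))  # dict keeps first-seen order


addresses = trace_addresses(TRACE)
argv = ["addr2line", "-f", "-e", "vmlinux_with_debug_info", *addresses]
print(" ".join(argv))
```

Each address appears twice in the raw trace (once as a callee, once as a caller), which is why the helper drops repeats.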

iabdalkader answered Sep 23 '22 17:09