Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do you read a segfault kernel log message

This can be a very simple question, I'm am attempting to debug an application which generates the following segfault error in the kern.log

kernel: myapp[15514]: segfault at 794ef0 ip 080513b sp 794ef0 error 6 in myapp[8048000+24000]

Here are my questions:

  1. Is there any documentation as to what are the diff error numbers on segfault, in this instance it is error 6, but i've seen error 4, 5

  2. What is the meaning of the information at bf794ef0 ip 0805130b sp bf794ef0 and myapp[8048000+24000]?

So far i was able to compile with symbols, and when i do a x 0x8048000+24000 it returns a symbol, is that the correct way of doing it? My assumptions thus far are the following:

  • sp = stack pointer?
  • ip = instruction pointer
  • at = ????
  • myapp[8048000+24000] = address of symbol?
like image 596
Sullenx Avatar asked Feb 01 '10 19:02

Sullenx


People also ask

How do you identify a segfault?

Use debuggers to diagnose segfaults Start your debugger with the command gdb core , and then use the backtrace command to see where the program was when it crashed. This simple trick will allow you to focus on that part of the code.

What is segfault in Linux?

On a Unix operating system such as Linux, a "segmentation violation" (also known as "signal 11", "SIGSEGV", "segmentation fault" or, abbreviated, "sig11" or "segfault") is a signal sent by the kernel to a process when the system has detected that the process was attempting to access a memory address that does not ...

Can the kernel segfault?

(Technically, the behavior is undefined when you try to write to memory thats not yours but one of the ways a OS can handle such a situation is by throwing a segfault). For user space code that attempts an illegal memory access, the kernel is the one that detects the illegal memory access and throws the segfault.


1 Answers

When the report points to a program, not a shared library

Run addr2line -e myapp 080513b (and repeat for the other instruction pointer values given) to see where the error is happening. Better, get a debug-instrumented build, and reproduce the problem under a debugger such as gdb.

If it's a shared library

In the libfoo.so[NNNNNN+YYYY] part, the NNNNNN is where the library was loaded. Subtract this from the instruction pointer (ip) and you'll get the offset into the .so of the offending instruction. Then you can use objdump -DCgl libfoo.so and search for the instruction at that offset. You should easily be able to figure out which function it is from the asm labels. If the .so doesn't have optimizations you can also try using addr2line -e libfoo.so <offset>.

What the error means

Here's the breakdown of the fields:

  • address - the location in memory the code is trying to access (it's likely that 10 and 11 are offsets from a pointer we expect to be set to a valid value but which is instead pointing to 0)
  • ip - instruction pointer, ie. where the code which is trying to do this lives
  • sp - stack pointer
  • error - Architecture-specific flags; see arch/*/mm/fault.c for your platform.
like image 103
Charles Duffy Avatar answered Sep 21 '22 23:09

Charles Duffy