Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

ftrace: system crash when changing current_tracer from function_graph via echo

I have been playing with ftrace recently to monitor some behavior characteristics of my system. I've been handling switching the trace on/off via a small script. After running the script, my system would crash and reboot itself. Initially, I believed that there might be an error with the script itself, but I have since determined that the crash and reboot is a result of echoing some tracer to /sys/kernel/debug/tracing/current_tracer when current_tracer is set to function_graph.

That is, the following sequence of commands will produce the crash/reboot:

echo "function_graph" > /sys/kernel/debug/tracing/current_tracer
echo "function" > /sys/kernel/debug/tracing/current_tracer

Durning the reboot after the crash caused by the above echo statements, I see a lot of output that reads:

clearing orphaned inode <inode>

I tried to reproduce this problem by replacing the current_tracer value from function_graph to something else in a C program:

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>

int openCurrentTracer()
{
        int fd = open("/sys/kernel/debug/tracing/current_tracer", O_WRONLY);
        if(fd < 0)
                exit(1);

        return fd;
}

int writeTracer(int fd, char* tracer)
{
        if(write(fd, tracer, strlen(tracer)) != strlen(tracer)) {
                printf("Failure writing %s\n", tracer);
                return 0;
        }

        return 1;
}

int main(int argc, char* argv[])
{
        int fd = openCurrentTracer();

        char* blockTracer = "blk";
        if(!writeTracer(fd, blockTracer))
                return 1;
        close(fd);

        fd = openCurrentTracer();
        char* graphTracer = "function_graph";
        if(!writeTracer(fd, graphTracer))
                return 1;
        close(fd);

        printf("Preparing to fail!\n");

        fd = openCurrentTracer();
        if(!writeTracer(fd, blockTracer))
                return 1;
        close(fd);

        return 0;
}

Oddly enough, the C program does not crash my system.

I originally encountered this problem while using Ubuntu (Unity environment) 16.04 LTS and confirmed it to be an issue on the 4.4.0 and 4.5.5 kernels. I have also tested this issue on a machine running Ubuntu (Mate environment) 15.10, on the 4.2.0 and 4.5.5 kernels, but was unable to reproduce the issue. This has only confused me further.

Can anyone give me insight on what is happening? Specifically, why would I be able to write() but not echo to /sys/kernel/debug/tracing/current_tracer?

Update

As vielmetti pointed out, others have had a similar issue (as seen here).

The ftrace_disable_ftrace_graph_caller() modifies jmp instruction at ftrace_graph_call assuming it's a 5 bytes near jmp (e9 ). However it's a short jmp consisting of 2 bytes only (eb ). And ftrace_stub() is located just below the ftrace_graph_caller so modification above breaks the instruction resulting in kernel oops on the ftrace_stub() with the invalid opcode like below:

The patch (shown below) solved the echo issue, but I still do not understand why echo was breaking previously when write() was not.

diff --git a/arch/x86/kernel/mcount_64.S b/arch/x86/kernel/mcount_64.S
index ed48a9f465f8..e13a695c3084 100644
--- a/arch/x86/kernel/mcount_64.S
+++ b/arch/x86/kernel/mcount_64.S
@@ -182,7 +182,8 @@ GLOBAL(ftrace_graph_call)
    jmp ftrace_stub
  #endif

 -GLOBAL(ftrace_stub)
 +/* This is weak to keep gas from relaxing the jumps */
 +WEAK(ftrace_stub)
    retq
  END(ftrace_caller)

via https://lkml.org/lkml/2016/5/16/493

like image 947
buratino Avatar asked May 31 '16 11:05

buratino


People also ask

Which are commonly available tracers under ftrace?

One of the most common uses of ftrace is the event tracing. Throughout the kernel is hundreds of static event points that can be enabled via the tracefs file system to see what is going on in certain parts of the kernel. See events. txt for more information.

How ftrace works?

how does ftrace work? ftrace is a dynamic tracing system. This means that when I start ftracing a kernel function, the function's code gets changed. So – let's suppose that I'm tracing that do_page_fault function from before.

How do you turn off tracing also search for general kernel function?

The kernel function tracing_off() can be used within the kernel to disable writing to the ring buffer, which will set this file to "0". User space can re-enable tracing by echoing "1" into the file. Note, the function and event trigger "traceoff" will also set this file to zero and stop tracing.


1 Answers

Looks like you are not the only person to notice this behavior. I see

  • https://lkml.org/lkml/2016/5/13/327

as a report of the problem, and

  • https://lkml.org/lkml/2016/5/16/493

as a patch to the kernel that addresses it. Reading through that whole thread it appears that the issue is some compiler optimizations.

like image 145
vielmetti Avatar answered Oct 17 '22 14:10

vielmetti