Linux perf events: cpu-clock and task-clock - what is the difference

Linux perf tools (formerly named perf_events) provide several built-in generic software events. The two most basic are task-clock and cpu-clock (internally called PERF_COUNT_SW_TASK_CLOCK and PERF_COUNT_SW_CPU_CLOCK). The problem is that they are poorly documented.

User ysdx reports that man perf_event_open has only a short description:

    PERF_COUNT_SW_CPU_CLOCK
          This reports the CPU clock, a high-resolution per-
          CPU timer.

    PERF_COUNT_SW_TASK_CLOCK
          This reports a clock count specific to the task
          that is running.

But the description is hard to understand.

Can somebody give an authoritative answer about how and when the task-clock and cpu-clock events are accounted? How do they relate to the Linux kernel scheduler?

When will task-clock and cpu-clock give different values? Which one should I use?

osgx asked May 31 '14 00:05

2 Answers

1) By default, perf stat shows task-clock and does not show cpu-clock. This suggests that task-clock was expected to be the more useful of the two.

2) cpu-clock was simply broken, and has not been fixed for many years. It is best to ignore it.

The intent was that cpu-clock for sleep 1 would show about 1 second, while task-clock would show close to zero. It would then have made sense to use cpu-clock to read wall-clock time, and to look at the ratio between cpu-clock and task-clock.
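The intended distinction can be illustrated without perf at all. The following is a minimal sketch, assuming Python's time.perf_counter() as a stand-in for wall-clock time (what cpu-clock was meant to report) and time.process_time() as a stand-in for on-CPU time (what task-clock reports). The helper names measure, sleeper and spinner are made up for illustration, and nothing here calls perf_event_open.

```python
import time


def measure(fn):
    """Return (wall_seconds, cpu_seconds) spent while running fn()."""
    wall_start = time.perf_counter()   # wall-clock style timer
    cpu_start = time.process_time()    # CPU time consumed by this process
    fn()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


def sleeper():
    # Off-CPU wait: wall time advances, CPU time barely moves.
    time.sleep(0.2)


def spinner():
    # Busy loop: burn roughly 0.2 s of CPU time.
    deadline = time.process_time() + 0.2
    while time.process_time() < deadline:
        pass


if __name__ == "__main__":
    for name, fn in (("sleeper", sleeper), ("spinner", spinner)):
        wall, cpu = measure(fn)
        print(f"{name}: wall={wall:.3f}s cpu={cpu:.3f}s")
```

For the sleeping workload the two numbers diverge, which is the behaviour the paragraph above describes for sleep 1. For the busy loop they stay close together.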

But in the current implementation, cpu-clock is equivalent to task-clock. It is even possible that "fixing" the existing counter might break some userspace program. If there is such a program, Linux might not be able to "fix" this counter. Linux might need to define a new counter instead.

Exception: starting with v4.7-rc1, when profiling a CPU or CPUs rather than a specific task (e.g. perf stat -a), perf stat shows cpu-clock instead of task-clock. In this specific case, the two counters were intended to be equivalent, and the original intention for cpu-clock makes more sense. So for perf stat -a, you can ignore the difference and interpret the value as task-clock.

If you write your own code which profiles a CPU or CPUs - as opposed to a specific task - perhaps it would be clearest to follow the implementation of perf stat -a. But you might link to this question, to explain what your code is doing :-).
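To check the behaviour on your own kernel, you can ask perf stat for both events at once, per task and system-wide. A minimal sketch, assuming perf is installed and on PATH; the helper names perf_stat_cmd and run_comparison are made up for illustration, and only the standard perf stat options -e and -a are used.

```python
import shutil
import subprocess

# Both software clock events, requested explicitly.
EVENTS = "task-clock,cpu-clock"


def perf_stat_cmd(workload, system_wide=False):
    """Build a perf stat command line that counts both clock events."""
    cmd = ["perf", "stat", "-e", EVENTS]
    if system_wide:
        cmd.append("-a")  # count on all CPUs instead of only the workload's tasks
    return cmd + ["--"] + list(workload)


def run_comparison(workload, system_wide=False):
    """Run perf stat if it is available and return its report, else None."""
    if shutil.which("perf") is None:
        return None
    result = subprocess.run(
        perf_stat_cmd(workload, system_wide),
        capture_output=True,
        text=True,
    )
    return result.stderr  # perf stat writes its summary to stderr by default


if __name__ == "__main__":
    # Only print the commands here; call run_comparison() to execute them.
    print(" ".join(perf_stat_cmd(["sleep", "1"])))
    print(" ".join(perf_stat_cmd(["sleep", "1"], system_wide=True)))
```

Calling run_comparison(["sleep", "1"]) and comparing the two reported values shows whether cpu-clock and task-clock differ on your system. System-wide counting may require elevated privileges depending on the perf_event_paranoid setting.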

Subject: Re: perf: some questions about perf software events
From: Peter Zijlstra

On Sat, 2010-11-27 at 14:28 +0100, Franck Bui-Huu wrote:
> Peter Zijlstra writes:
>
> > On Wed, 2010-11-24 at 12:35 +0100, Franck Bui-Huu wrote:
>
> [...]
>
> > > Also I'm currently not seeing any real differences between cpu-clock and task-clock events. They both seem to count the time elapsed when the task is running on a CPU. Am I wrong ?
> >
> > No, Francis already noticed that, I probably wrecked it when I added the multi-pmu stuff, its on my todo list to look at (Francis also handed me a little patchlet), but I keep getting distracted with other stuff :/
>
> OK.
>
> Does it make sense to adjust the period for both of them ?
>
> Also, when creating a task clock event, passing 'pid=-1' to sys_perf_event_open() doesn't really make sense, does it ?
>
> Same with cpu clock and 'pid=n': whatever value, the event measure the cpu wall time clock.
>
> Perhaps proposing only one clock in the API and internally bind this clock to the cpu or task clock depending on pid or cpu parameters would have been better ?

No, it actually makes sense to count both cpu and task clock on a task (cpu clock basically being wall-time).

On a more superficial level, the perf stat output for cpu-clock can differ slightly from that for task-clock in perf versions earlier than v4.7-rc1. For example, it may print "CPUs utilized" for task-clock but not for cpu-clock.

sourcejedi answered Oct 19 '22 11:10

Generally speaking: The cpu-clock event measures the passage of time. It uses the Linux CPU clock as the timing source.

Here is an in-depth article on finding execution hot spots with perf: http://sandsoftwaresound.net/perf/perf-tutorial-hot-spots/

The task-clock event tells you how parallel your job has been, i.e. how many CPUs were used. This compendium contains detailed information on the output generated by perf: https://doc.zih.tu-dresden.de/hpc-wiki/bin/view/Compendium/PerfTools
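The "how many CPUs were used" reading corresponds to the "CPUs utilized" figure that perf stat prints next to task-clock: the task-clock time divided by the elapsed wall-clock time. A minimal sketch of that arithmetic follows. The function name cpus_utilized and the input numbers are made up for illustration and do not come from a real run.

```python
def cpus_utilized(task_clock_msec, elapsed_sec):
    """Average number of CPUs kept busy: on-CPU time / wall-clock time."""
    if elapsed_sec <= 0:
        raise ValueError("elapsed_sec must be positive")
    return (task_clock_msec / 1000.0) / elapsed_sec


if __name__ == "__main__":
    # Hypothetical job: 4000 msec of task-clock accumulated over 2 s of wall time.
    print(cpus_utilized(4000.0, 2.0))
    # Hypothetical mostly-sleeping job: 1 msec of task-clock over 1 s of wall time.
    print(cpus_utilized(1.0, 1.0))
```

A value near 1 means the job kept about one CPU busy, a value well above 1 means it ran in parallel on several CPUs, and a value near 0 means it mostly waited.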

There is also a whole lot of information here: https://stackoverflow.com/a/20378648/8223204

Patrick Di Martino answered Oct 19 '22 11:10