Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What does "perf stat" output mean?

I use "perf stat" command to do a statistic of some events:

[root@root test]# perf stat -a -e "r81d0","r82d0" -v ./a
r81d0: 71800964 1269047979 1269006431
r82d0: 26655201 1284214869 1284214869

 Performance counter stats for './a':

        71,800,964 r81d0                                                        [100.00%]
        26,655,201 r82d0

       0.036892349 seconds time elapsed

(1) I know 71800964 is the count of "r81d0", but what is the meaning of 1269047979 and 1269006431?
(2) What is the meaning of "[100.00%]"?

I have tried to "perf stat --help", but can't get the explanations of these values.

like image 244
Nan Xiao Avatar asked Mar 10 '15 07:03

Nan Xiao


People also ask

What is perf stat?

perf stat: obtain event counts. perf record: record events for later reporting. perf report: break down events by process, function, etc. perf annotate: annotate assembly or source code with event counts. perf top: see live event count.

What is perf report?

perf report is able to auto-detect whether a perf. data file contains branch stacks and it will automatically switch to the branch view mode, unless --no-branch-stack is used. --branch-history Add the addresses of sampled taken branches to the callstack. This allows to examine the path the program took to each sample.

What is perf used for?

The perf command is used as a primary interface to the Linux kernel performance monitoring capabilities and can record CPU performance counters and trace points.

How does perf work in Linux?

Perf is a facility comprised of kernel infrastructure for gathering various events and userspace tool to get gathered data from the kernel and analyze it. It is like a gprof, but it is non-invasive, low-overhead and profile the whole stack, including your app, libraries, system calls AND kernel with CPU!


1 Answers

[root@root test]# perf stat -a -e "r81d0","r82d0" -v ./a
r81d0: 71800964 1269047979 1269006431
r82d0: 26655201 1284214869 1284214869

This is output from verbose option, as defined in tools/perf/builtin-stat.c file of the kernel:

391 /*
392  * Read out the results of a single counter:
393  * aggregate counts across CPUs in system-wide mode
394  */
395 static int read_counter_aggr(struct perf_evsel *counter)

408         if (verbose) {
409                 fprintf(output, "%s: %" PRIu64 " %" PRIu64 " %" PRIu64 "\n",
410                         perf_evsel__name(counter), count[0], count[1], count[2]);
411         }

count is from struct perf_counts_values, defined as http://lxr.free-electrons.com/source/tools/perf/util/evsel.h?v=3.18#L12 with array of three uint64_t values, named as val, ena, run

Three count values are filled by kernel and are read from fd, opened with perf_event_open() syscall. There is related part of man perf_event_open: http://man7.org/linux/man-pages/man2/perf_event_open.2.html

   read_format
          This field specifies the format of the data returned by
          read(2) on a perf_event_open() file descriptor.

          PERF_FORMAT_TOTAL_TIME_ENABLED
                 Adds the 64-bit time_enabled field.  This can be used
                 to calculate estimated totals if the PMU is
                 overcommitted and multiplexing is happening.

          PERF_FORMAT_TOTAL_TIME_RUNNING
                 Adds the 64-bit time_running field.  This can be used
                 to calculate estimated totals if the PMU is
                 overcommitted and multiplexing is happening.  ...

perf stat enables all TIME flags if scale is true -

298         if (scale)
299                 attr->read_format = PERF_FORMAT_TOTAL_TIME_ENABLED |
300                                     PERF_FORMAT_TOTAL_TIME_RUNNING;

So, first counter is raw event count; second is proportional to the time when this event was collected and last is proportional to the total running time. This is needed when you asks perf to stat on high number of events, which can't be monitored at once (hardware usually has up to 5-7 performance monitors). In such case in-kernel perf will run subsets of required event for some parts of execution; and subsets will be changed several times. With ena and run counts, perf can estimate how inaccurate event monitoring was in case of multiplexing.

 Performance counter stats for './a':

    71,800,964 r81d0                                            [100.00%]
    26,655,201 r82d0

And in your case two events were mapped in the same time without needs of multiplexing; your ena and run counters are close. And print_aggr function prints their ratio:

1137                                 val += counter->counts->cpu[cpu].val;
1138                                 ena += counter->counts->cpu[cpu].ena;
1139                                 run += counter->counts->cpu[cpu].run;

Print_noise will output in case of -r N option to rerun task N times to get statistics (man: --repeat=<n> repeat command and print average + stddev (max: 100))

1176                                 print_noise(counter, 1.0);

And there is [100.00%] printer:

1178                                 if (run != ena)
1179                                         fprintf(output, "  (%.2f%%)",
1180                                                 100.0 * run / ena);

It will not print 100% if both run and ena times are equal, and your r82d0 event have equal. Your r81d0 event have slightly different run and ena, so 100% is printed in one line.

I know that perf stat -d can be inaccurate, because it asks for too much events; and there will be not 100% mulitplexing, but something like 53%. It means "this event was counted only in 53% of the program runtime in some random parts of it"; and if you program have several separated computing stages, events with low run/ena ratio will be less accurate.

like image 170
osgx Avatar answered Sep 29 '22 14:09

osgx