
Why does perf stat show "stalled-cycles-backend" as <not supported>?

Running perf stat ls shows this:

 Performance counter stats for 'ls':

          1.388670 task-clock                #    0.067 CPUs utilized
                 2 context-switches          #    0.001 M/sec
                 0 cpu-migrations            #    0.000 K/sec
               266 page-faults               #    0.192 M/sec
           3515391 cycles                    #    2.531 GHz
           2096636 stalled-cycles-frontend   #   59.64% frontend cycles idle
   <not supported> stalled-cycles-backend
           2927468 instructions              #    0.83  insns per cycle
                                             #    0.72  stalled cycles per insn
            615636 branches                  #  443.328 M/sec
             22172 branch-misses             #    3.60% of all branches

       0.020657192 seconds time elapsed

Why is stalled-cycles-backend shown as "not supported"? What kind of CPU, hardware, kernel or user-space software do I need to see this value?

So far I have tried this on RHEL with Linux 3.12 for x86_64, with a matching perf version, on several Intel Core i5 and i7 systems (Ivy Bridge). None of them supports stalled-cycles-backend.

Some more info:

$ perf list | grep stalled
  stalled-cycles-frontend OR idle-cycles-frontend    [Hardware event]
  stalled-cycles-frontend OR cpu/stalled-cycles-frontend/ [Kernel PMU event]

$ ls /sys/devices/cpu/events/
branch-instructions  bus-cycles    cache-references  instructions  mem-stores
branch-misses        cache-misses  cpu-cycles        mem-loads     stalled-cycles-frontend

$ cat /sys/bus/event_source/devices/cpu/events/stalled-cycles-frontend
event=0x0e,umask=0x01,inv,cmask=0x01

Edit: I just tried this on an AMD Phenom II X6 1045T CPU under Ubuntu 12.04 with Linux 3.2 (32-bit), and there it does show values for both stalled-cycles-frontend and stalled-cycles-backend.

oliver asked Mar 28 '14 12:03


1 Answer

It looks like perf has not been updated to understand all the performance monitoring events that Ivy Bridge supports. Fortunately there is a generic, albeit cumbersome, interface that gives you access to the full list of performance monitoring events. I didn't see stalled-cycles-backend in the list when I gave it a quick look, but maybe I missed it, or maybe it has been broken down into the different events that can stall the backend.

We start with

perf list --help 

...shows the following NOTE

    1. Intel(R) 64 and IA-32 Architectures Software Developer's Manual
       Volume 3B: System Programming Guide
       http://www.intel.com/Assets/PDF/manual/253669.pdf

...armed with that URL you end up in

http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-3b-part-2-manual.pdf 

...you want section 19.3

19.3 PERFORMANCE MONITORING EVENTS FOR 3RD GENERATION INTEL® CORE™ PROCESSORS

3rd generation Intel® Core™ processors and Intel Xeon processor E3-1200 v2 product family are based on Intel microarchitecture code name Ivy Bridge. They support architectural performance-monitoring events listed in Table 19-1. Non-architectural performance-monitoring events in the processor core are listed in Table 19-5. The events in Table 19-5 apply to processors with CPUID signature of DisplayFamily_DisplayModel encoding with the following values: 06_3AH.

...so for architectural events you need Table 19-1

19.1 ARCHITECTURAL PERFORMANCE-MONITORING EVENTS

Architectural performance events are introduced in Intel Core Solo and Intel Core Duo processors. They are also supported on processors based on Intel Core microarchitecture. Table 19-1 lists pre-defined architectural performance events that can be configured using general-purpose performance counters and associated event-select registers.

Table 19-1. Architectural Performance Events

[Image: Table 19-1]

...now comes the tricky part: take the UMask Value as the upper two hex digits and the Event Num as the lower two hex digits of a four-digit hex raw event code to be given to perf stat.
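The packing described above can be sketched as a small shell helper. This is only an illustration: `raw_event` is a hypothetical function name, not part of perf, and the example values are the LLC Misses encoding (UMask 41H, Event Num 2EH) that corresponds to the r412e code used in the verification step of this answer.

```shell
#!/bin/sh
# Hypothetical helper (not part of perf): pack a UMask value and an
# Event Num into the rUUEE form that `perf stat -e` accepts.
raw_event() {
    umask_val=$1   # UMask Value from the table, e.g. 0x41
    event_num=$2   # Event Num from the table, e.g. 0x2e
    # UMask becomes the upper two hex digits, Event Num the lower two.
    printf 'r%02x%02x\n' "$umask_val" "$event_num"
}

raw_event 0x41 0x2e   # prints r412e
```

The result could then be passed along as `perf stat -e "$(raw_event 0x41 0x2e)" date`, assuming perf is installed and the CPU implements that event.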

perf stat --help

   -e, --event=
       Select the PMU event. Selection can be a symbolic event name (use
       perf list to list all events) or a raw PMU event (eventsel+umask) in
       the form of rNNN where NNN is a hexadecimal event descriptor.

...it says NNN, but you can give it NNNN. To verify that this works, let's ask perf stat for cache-misses both by its symbolic event name and as a hex number from Table 19-1. We'll invoke the date command for simplicity.

$ perf stat -e r412e -e cache-misses date
Fri Mar 28 09:28:52 CDT 2014

 Performance counter stats for 'date':

              2292 r412e
              2292 cache-misses

       0.003322663 seconds time elapsed

$
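The same idea extends to sysfs event descriptors such as the `event=0x0e,umask=0x01,inv,cmask=0x01` string shown in the question. Below is a minimal sketch, assuming the Intel event-select layout (event in bits 0-7, umask in bits 8-15, inv in bit 23, cmask in bits 24-31). `sysfs_to_raw` is a hypothetical helper, not a perf feature, and it ignores any other fields.

```shell
#!/bin/sh
# Hypothetical helper: convert a sysfs event descriptor into a raw rNNN code.
# Assumed bit layout (Intel event-select register):
#   event 0-7, umask 8-15, inv 23, cmask 24-31. Other fields are ignored.
sysfs_to_raw() {
    config=0
    old_ifs=$IFS
    IFS=,                      # split the descriptor on commas
    for field in $1; do
        case $field in
            event=*) config=$(( config | (${field#event=} & 0xff) )) ;;
            umask=*) config=$(( config | ((${field#umask=} & 0xff) << 8) )) ;;
            inv)     config=$(( config | (1 << 23) )) ;;
            cmask=*) config=$(( config | ((${field#cmask=} & 0xff) << 24) )) ;;
        esac
    done
    IFS=$old_ifs
    printf 'r%x\n' "$config"
}

sysfs_to_raw 'event=0x0e,umask=0x01,inv,cmask=0x01'   # prints r180010e
sysfs_to_raw 'event=0x2e,umask=0x41'                  # prints r412e
```

With only event and umask set, this reduces to the four-digit form used above, which is why the second call reproduces r412e.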

As you can see, both reported the same number, so far so good. Now we go to Table 19-5 for the non-architectural events, of which there are too many to list here, but I'll list a few:

[Image: excerpt from Table 19-5]

amdn answered Sep 27 '22 15:09