Running perf stat ls shows this:
     Performance counter stats for 'ls':

              1.388670 task-clock                #    0.067 CPUs utilized
                     2 context-switches          #    0.001 M/sec
                     0 cpu-migrations            #    0.000 K/sec
                   266 page-faults               #    0.192 M/sec
               3515391 cycles                    #    2.531 GHz
               2096636 stalled-cycles-frontend   #   59.64% frontend cycles idle
       <not supported> stalled-cycles-backend
               2927468 instructions              #    0.83  insns per cycle
                                                 #    0.72  stalled cycles per insn
                615636 branches                  #  443.328 M/sec
                 22172 branch-misses             #    3.60% of all branches

           0.020657192 seconds time elapsed

Why is stalled-cycles-backend shown as "not supported"? What kind of CPU, hardware, kernel or user-space software do I need to see this value?
So far I have tried this on RHEL with Linux 3.12 for x86_64, with a matching perf version, on several Intel Core i5 and i7 systems (Ivy Bridge). None of them supports stalled-cycles-backend.
Some more info:
    $ perf list | grep stalled
      stalled-cycles-frontend OR idle-cycles-frontend    [Hardware event]
      stalled-cycles-frontend OR cpu/stalled-cycles-frontend/ [Kernel PMU event]

    $ ls /sys/devices/cpu/events/
    branch-instructions  bus-cycles    cache-references  instructions  mem-stores
    branch-misses        cache-misses  cpu-cycles        mem-loads     stalled-cycles-frontend

    $ cat /sys/bus/event_source/devices/cpu/events/stalled-cycles-frontend
    event=0x0e,umask=0x01,inv,cmask=0x01

Edit: just tried this on an AMD Phenom II X6 1045T CPU, under Ubuntu 12.04 with Linux 3.2 (32bit) - and here it does show values for both stalled-cycles-frontend and stalled-cycles-backend.
Looks like perf has not been updated to understand all the performance monitoring events that Ivy Bridge supports. Fortunately there is a generic, albeit cumbersome, interface that allows you to access the full list of performance monitoring events. I didn't see stalled-cycles-backend in the list when I gave it a quick look, but maybe I missed it, or maybe they have broken it down into all the different events that could stall the backend.
We start with

    perf list --help

...which shows the following NOTE

    1. Intel(R) 64 and IA-32 Architectures Software Developer's Manual
       Volume 3B: System Programming Guide
       http://www.intel.com/Assets/PDF/manual/253669.pdf

...armed with that URL you end up in

    http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-3b-part-2-manual.pdf

...you want section 19.3
19.3 PERFORMANCE MONITORING EVENTS FOR 3RD GENERATION INTEL® CORE™ PROCESSORS 3rd generation Intel® Core™ processors and Intel Xeon processor E3-1200 v2 product family are based on Intel microarchitecture code name Ivy Bridge. They support architectural performance-monitoring events listed in Table 19-1. Non-architectural performance-monitoring events in the processor core are listed in Table 19-5. The events in Table 19-5 apply to processors with CPUID signature of DisplayFamily_DisplayModel encoding with the following values: 06_3AH.
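To check whether a given machine matches that signature, one option is to take the family and model that Linux reports in /proc/cpuinfo and print them in the manual's notation. This is only a minimal sketch: it assumes an x86 system whose /proc/cpuinfo has `cpu family` and `model` lines with decimal values, and `cpu_signature` is a made-up helper name, not part of any tool.

```shell
# cpu_signature FAMILY MODEL -> prints DisplayFamily_DisplayModel, e.g. 06_3AH.
# Hypothetical helper; arguments are the decimal values from /proc/cpuinfo.
cpu_signature() {
    printf '%02X_%02XH\n' "$1" "$2"
}

# Family 6, model 58 (0x3A) is the signature quoted above for Ivy Bridge.
cpu_signature 6 58    # prints 06_3AH

# On a live system (assumes the usual x86 /proc/cpuinfo layout):
if [ -r /proc/cpuinfo ]; then
    family=$(awk -F': *' '$1 ~ /^cpu family/ {print $2; exit}' /proc/cpuinfo)
    model=$(awk -F': *' '$1 ~ /^model[ \t]*$/ {print $2; exit}' /proc/cpuinfo)
    if [ -n "$family" ] && [ -n "$model" ]; then
        cpu_signature "$family" "$model"
    fi
fi
```

If the second line of output is 06_3AH, Table 19-5 is the one that applies; any other value means a different table in the same chapter.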
...so for architectural events you need Table 19-1
19.1 ARCHITECTURAL PERFORMANCE-MONITORING EVENTS Architectural performance events are introduced in Intel Core Solo and Intel Core Duo processors. They are also supported on processors based on Intel Core microarchitecture. Table 19-1 lists pre-defined architectural performance events that can be configured using general-purpose performance counters and associated event-select registers.
Table 19-1. Architectural Performance Events


... now comes the tricky part: take the UMask Value as the upper 2 hex digits and the Event Num as the lower 2 hex digits of a 4 hex digit hardware register number (the raw event code) to be given to perf stat.
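As a small sketch of that packing rule, the shell function below builds the raw event string from a umask and an event number. The name `raw_event` is made up for illustration and is not part of perf; the values 0x41 and 0x2e are the pair behind the r412e code used for cache-misses in this answer.

```shell
# raw_event UMASK EVENT -> prints rUUEE, the raw form that perf stat -e accepts.
# Hypothetical helper: umask goes in the upper two hex digits, event number in the lower two.
raw_event() {
    printf 'r%02x%02x\n' "$1" "$2"
}

raw_event 0x41 0x2e    # prints r412e
```

The result can be handed straight to perf, for example `perf stat -e "$(raw_event 0x41 0x2e)" date`.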
    perf stat --help

    -e, --event=
        Select the PMU event. Selection can be a symbolic event name (use perf list to list all events) or a raw PMU event (eventsel+umask) in the form of rNNN where NNN is a hexadecimal event descriptor.
... it says NNN but you can give it NNNN. Let's verify that this works by asking perf stat for cache-misses both as a symbolic event name and as a hex number from Table 19-1. We'll invoke the date command for simplicity.
    $ perf stat -e r412e -e cache-misses date
    Fri Mar 28 09:28:52 CDT 2014

     Performance counter stats for 'date':

                  2292 r412e
                  2292 cache-misses

           0.003322663 seconds time elapsed

    $

As you can see, both reported the same number, so far so good. Now we go to Table 19-5 for the non-architectural hardware registers, of which there are too many to list here, but I'll list a few:
