
Hardware cache events and perf

When I run perf list I see a bunch of Hardware Cache Events, as follows:

$ perf list | grep 'cache event'
  L1-dcache-load-misses                              [Hardware cache event]
  L1-dcache-loads                                    [Hardware cache event]
  L1-dcache-stores                                   [Hardware cache event]
  L1-icache-load-misses                              [Hardware cache event]
  LLC-load-misses                                    [Hardware cache event]
  LLC-loads                                          [Hardware cache event]
  LLC-store-misses                                   [Hardware cache event]
  LLC-stores                                         [Hardware cache event]
  branch-load-misses                                 [Hardware cache event]
  branch-loads                                       [Hardware cache event]
  dTLB-load-misses                                   [Hardware cache event]
  dTLB-loads                                         [Hardware cache event]
  dTLB-store-misses                                  [Hardware cache event]
  dTLB-stores                                        [Hardware cache event]
  iTLB-load-misses                                   [Hardware cache event]
  iTLB-loads                                         [Hardware cache event]
  node-load-misses                                   [Hardware cache event]
  node-loads                                         [Hardware cache event]
  node-store-misses                                  [Hardware cache event]
  node-stores                                        [Hardware cache event]

These events mostly seem to return reasonable values in my tests, but how can I determine which hardware performance counter events they map to on my system?

That is, these events are certainly implemented using one or more underlying x86 PMU counters on my Skylake CPU - but how do I know which ones?

You can look in /sys/devices/cpu/events for other hardware events, but not for "Hardware cache events".

BeeOnRope asked Sep 04 '18 16:09


1 Answer

User @Margaret points towards a reasonable answer in the comments - read the kernel source to see the mapping for the PMU events.

We can check arch/x86/events/intel/core.c for the event definitions. I don't actually know if "core" here refers to the Core architecture, or just that this is the core file with most of the definitions, but in any case it's the file you want to look at.

The key part is this section, which defines skl_hw_cache_event_ids:

static __initconst const u64 skl_hw_cache_event_ids
                [PERF_COUNT_HW_CACHE_MAX]
                [PERF_COUNT_HW_CACHE_OP_MAX]
                [PERF_COUNT_HW_CACHE_RESULT_MAX] =
{
 [ C(L1D ) ] = {
    [ C(OP_READ) ] = {
        [ C(RESULT_ACCESS) ] = 0x81d0,  /* MEM_INST_RETIRED.ALL_LOADS */
        [ C(RESULT_MISS)   ] = 0x151,   /* L1D.REPLACEMENT */
    },
    [ C(OP_WRITE) ] = {
        [ C(RESULT_ACCESS) ] = 0x82d0,  /* MEM_INST_RETIRED.ALL_STORES */
        [ C(RESULT_MISS)   ] = 0x0,
    },
    [ C(OP_PREFETCH) ] = {
        [ C(RESULT_ACCESS) ] = 0x0,
        [ C(RESULT_MISS)   ] = 0x0,
    },
},
...

Decoding the nested initializers, you can see that L1-dcache-loads corresponds to MEM_INST_RETIRED.ALL_LOADS and L1-dcache-load-misses to L1D.REPLACEMENT.

We can double-check this with perf, here via the ocperf wrapper from pmu-tools, which accepts Intel's symbolic event names:

$ ocperf stat -e mem_inst_retired.all_loads,L1-dcache-loads,l1d.replacement,L1-dcache-load-misses,L1-dcache-loads,mem_load_retired.l1_hit head -c100M /dev/zero > /dev/null

 Performance counter stats for 'head -c100M /dev/zero':

        11,587,793      mem_inst_retired_all_loads                                   
        11,587,793      L1-dcache-loads                                             
            20,233      l1d_replacement                                             
            20,233      L1-dcache-load-misses     #    0.17% of all L1-dcache hits  
        11,587,793      L1-dcache-loads                                             
        11,495,053      mem_load_retired_l1_hit                                     

       0.024322360 seconds time elapsed

The "Hardware cache" events report exactly the same counts as the underlying PMU events we identified from the source: L1-dcache-loads matches mem_inst_retired.all_loads, and L1-dcache-load-misses matches l1d.replacement.

BeeOnRope answered Sep 21 '22 04:09