I am trying to measure certain hardware events on a (Intel Xeon) machine with multiple (physical) processors. Specifically, I wish to know how many requests are issued for reading 'offcore' data.
I found the OFFCORE_REQUESTS hardware event in Intels documentation and it gives the event descriptor 0xB0 and for data demands, the additional mask 0x01.
Would it then be correct to tell perf to record the event 0xB1 (i.e. 0xB0 | 0x01
) and to call it as:
perf record -e r0B1 ./mytestapp someargs
Or is this incorrect?
Because perf report
shows no output for events entered like this.
The perf documentation is rather sparse in this area, apart from a tutorial entry which does not say which event it was (though this one works for me), or how it was encoded...
Any help is greatly appreciated.
perf record is used to sample events. Display a report that was previously created with perf record . Display a report file and an annotated version of the executed code. If debug symbols are installed, you will also see the source code displayed.
perf_events provides a command line tool, perf, and subcommands for various profiling activities. This is a single interface for the different instrumentation frameworks that provide the various events. The perf command alone will list the subcommands; here is perf version 4.10 (for the Linux 4.10 kernel):
perf (sometimes called perf_events or perf tools, originally Performance Counters for Linux, PCL) is a performance analyzing tool in Linux, available from Linux kernel version 2.6.
Ok, so I guess I figured it out.
For the the Intel machine I use, the format is as follows:
<umask><eventselector>
where both are hexadecimal values. The leading zeros of the umask can be dropped, but not for the event selector.
So for the event 0xB0
with the mask 0x01
I can call:
perf record -e r1B0 ./mytestapp someargs
I could not manage to find the exact parsing of it in the perf kernel code (any kernel hacker here?), but I found these sources:
man perf-list
Update: As pointed out in the comments (thank you!), the libpfm translator can be used to obtain the proper event descriptor. The website linked in the comments (Bojan Nikolic: How to monitor the full range of CPU performance events), discovered by user 'osgx' explains it in further detail.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With