Is there an easy way to quickly count the number of instructions executed (x86 instructions: which ones, and how many of each) while executing a C program?
I use gcc version 4.7.1 (GCC) on an x86_64 GNU/Linux machine.
Linux perf_event_open system call with config = PERF_COUNT_HW_INSTRUCTIONS
This Linux system call appears to be a cross-architecture wrapper for performance events, covering both hardware performance counters from the CPU and software events from the kernel.
Here's an example adapted from the man perf_event_open page:
perf_event_open.c
    #define _GNU_SOURCE
    #include <asm/unistd.h>
    #include <linux/perf_event.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <unistd.h>
    #include <inttypes.h>
    #include <sys/types.h>

    /* glibc provides no wrapper for this system call, so call it via syscall(). */
    static long
    perf_event_open(struct perf_event_attr *hw_event, pid_t pid,
                    int cpu, int group_fd, unsigned long flags)
    {
        int ret;
        ret = syscall(__NR_perf_event_open, hw_event, pid, cpu,
                      group_fd, flags);
        return ret;
    }

    int
    main(int argc, char **argv)
    {
        struct perf_event_attr pe;
        long long count;
        int fd;
        uint64_t n;

        if (argc > 1) {
            n = strtoll(argv[1], NULL, 0);
        } else {
            n = 10000;
        }

        memset(&pe, 0, sizeof(struct perf_event_attr));
        pe.type = PERF_TYPE_HARDWARE;
        pe.size = sizeof(struct perf_event_attr);
        pe.config = PERF_COUNT_HW_INSTRUCTIONS;
        pe.disabled = 1;
        // Don't count kernel events.
        pe.exclude_kernel = 1;
        // Don't count hypervisor events.
        pe.exclude_hv = 1;

        // pid = 0, cpu = -1: measure the calling thread on any CPU.
        fd = perf_event_open(&pe, 0, -1, -1, 0);
        if (fd == -1) {
            fprintf(stderr, "Error opening leader %llx\n", pe.config);
            exit(EXIT_FAILURE);
        }

        ioctl(fd, PERF_EVENT_IOC_RESET, 0);
        ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);

        /* Loop n times, should be good enough for -O0. */
        __asm__ (
            "1:;\n"
            "sub $1, %[n];\n"
            "jne 1b;\n"
            : [n] "+r" (n)
            :
            :
        );

        ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
        read(fd, &count, sizeof(long long));
        printf("Used %lld instructions\n", count);
        close(fd);
    }
Compile and run:
g++ -ggdb3 -O0 -std=c++11 -Wall -Wextra -pedantic -o perf_event_open.out perf_event_open.c
./perf_event_open.out
Output:
Used 20016 instructions
So we see that the result is pretty close to the expected value of 20000: 10k iterations * two instructions per iteration in the __asm__ block (sub, jne).
If I vary the argument, even to low values such as 100:
./perf_event_open.out 100
it gives:
Used 216 instructions
The constant offset of 16 instructions is maintained, so accuracy seems pretty high. Those 16 are presumably just the ioctl setup instructions around our little loop.
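The reasoning above amounts to fitting a line, count = per_iter * n + overhead, through the two measurements. Here is a minimal sketch of that arithmetic; the struct and function names are hypothetical and not part of any perf API:

```c
#include <assert.h>

/* Hypothetical linear cost model: count = per_iter * n + overhead. */
struct linear_model {
    long long per_iter;  /* instructions executed per loop iteration */
    long long overhead;  /* fixed instructions counted outside the loop */
};

/* Fit the model through two measurements (n1, c1) and (n2, c2).
 * Uses integer division, so it assumes the counts really are linear in n. */
static struct linear_model fit_two_points(long long n1, long long c1,
                                          long long n2, long long c2)
{
    struct linear_model m;
    assert(n1 != n2);  /* two distinct loop counts are required */
    m.per_iter = (c2 - c1) / (n2 - n1);
    m.overhead = c1 - m.per_iter * n1;
    return m;
}
```

With the two runs shown above, fit_two_points(100, 216, 10000, 20016) gives 2 instructions per iteration and a fixed overhead of 16.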
Now you might also be interested in:
Other events of interest that can be measured by this system call:
Tested on Ubuntu 20.04 amd64, GCC 9.3.0, Linux kernel 5.4.0, Intel Core i7-7820HQ CPU.
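Since the same system call covers both hardware and software events, the setup from the example can be factored out so that only type and config change. The following is a rough sketch under that assumption; make_counter_attr and open_counter are hypothetical helper names, and PERF_COUNT_HW_CPU_CYCLES and PERF_COUNT_SW_PAGE_FAULTS are just two events picked for illustration. Whether a given event can be opened depends on the CPU, the kernel, and the perf_event_paranoid setting.

```c
#define _GNU_SOURCE
#include <asm/unistd.h>
#include <assert.h>
#include <linux/perf_event.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: fill in an attr for a single counter that starts
 * disabled and ignores kernel and hypervisor activity, as in the example. */
static struct perf_event_attr make_counter_attr(unsigned int type,
                                                unsigned long long config)
{
    struct perf_event_attr pe;
    memset(&pe, 0, sizeof(pe));
    pe.type = type;      /* e.g. PERF_TYPE_HARDWARE or PERF_TYPE_SOFTWARE */
    pe.size = sizeof(pe);
    pe.config = config;  /* e.g. PERF_COUNT_HW_CPU_CYCLES */
    pe.disabled = 1;
    pe.exclude_kernel = 1;
    pe.exclude_hv = 1;
    return pe;
}

/* Hypothetical helper: open the counter for the calling thread on any CPU.
 * Returns a file descriptor, or -1 with errno set if the event is unavailable. */
static int open_counter(unsigned int type, unsigned long long config)
{
    struct perf_event_attr pe = make_counter_attr(type, config);
    return (int)syscall(__NR_perf_event_open, &pe, (pid_t)0, -1, -1, 0UL);
}
```

For instance, open_counter(PERF_TYPE_HARDWARE, PERF_COUNT_HW_CPU_CYCLES) or open_counter(PERF_TYPE_SOFTWARE, PERF_COUNT_SW_PAGE_FAULTS) would then be used with the same RESET / ENABLE / DISABLE / read sequence as in the program above. Always check for -1 before using the descriptor.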
You can easily count the number of executed instructions using hardware performance counters (HPC). To access the HPC, you need an interface to them. I recommend using PAPI, the Performance API.
instcount
You can use Intel's binary instrumentation tool 'Pin'. I would avoid using a simulator (they are often extremely slow). Pin does most of what you can do with a simulator, without recompiling the binary and at close to normal execution speed (depending on the Pin tool you are using).
To count the number of instructions with Pin:
cd pin-root/source/tools/ManualExample/
make all
../../../pin -t obj-intel64/inscount0.so -- your-binary-here
The result is written to the file inscount.out; view it with cat inscount.out.
The output would be something like:
➜ ../../../pin -t obj-intel64/inscount0.so -- /bin/ls
buffer_linux.cpp itrace.cpp
buffer_windows.cpp little_malloc.c
countreps.cpp makefile
detach.cpp makefile.rules
divide_by_zero_unix.c malloc_mt.cpp
isampling.cpp w_malloctrace.cpp
➜ cat inscount.out
Count 716372
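If you want to use that number from a C program (for example, to compare several runs), the single line can be parsed with sscanf. This is only a sketch: parse_inscount_line is a hypothetical helper, and it assumes the file contains exactly the "Count <number>" format shown above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: extract the number from a line such as "Count 716372".
 * Returns the count, or -1 if the line does not match that format. */
static long long parse_inscount_line(const char *line)
{
    long long count;
    if (sscanf(line, "Count %lld", &count) != 1)
        return -1;
    return count;
}
```

Read the first line of inscount.out with fgets and pass it to this function; a return value of -1 means the line did not have the expected shape.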