Hardware performance counter APIs for Windows

I'd like to use hardware performance counters, specifically on x86 CPUs, to measure cache misses or branch mispredictions. Performance counters are heavily used in advanced profilers like Intel VTune. Please don't confuse them with the Windows operating system's own performance counters.

In order to use these counters in a C/C++ program, one may use PAPI: http://icl.cs.utk.edu/papi/

This lets you use performance counters easily, but only on Linux. PAPI once supported Windows, but no longer does.

Has anyone recently tried PAPI or another API for using hardware performance counters on Windows?

asked Jan 06 '12 21:01 by Nullptr



2 Answers

You can use the RDPMC instruction or the __readpmc MSVC compiler intrinsic, which is the same thing.

However, Windows prohibits user-mode applications from executing this instruction by setting CR4.PCE to 0. Presumably this is because the meaning of each counter is determined by MSRs (model-specific registers), which are accessible only in kernel mode. In other words, unless you're a kernel-mode module (e.g. a device driver), you will get a "privileged instruction" exception if you attempt to execute it.
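
To see this for yourself, here is a minimal sketch that wraps __readpmc in MSVC structured exception handling and reports the exception code instead of crashing. The helper name try_read_pmc and its out-parameters are my own invention, and counter index 0 is an arbitrary choice. It assumes MSVC targeting x86/x64; on any other compiler the function does not attempt the read and simply returns false.

```cpp
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <windows.h>   // GetExceptionCode, EXCEPTION_EXECUTE_HANDLER
#include <intrin.h>    // __readpmc
#endif

// Hypothetical helper: tries to execute RDPMC for the given counter index.
// Returns true and stores the counter value on success. Returns false if the
// instruction raised an exception (the code is stored in *exception_code) or
// if the read was not attempted because this is not MSVC on x86/x64.
bool try_read_pmc(unsigned long counter,
                  unsigned long long* value,
                  unsigned long* exception_code) {
    *value = 0;
    *exception_code = 0;
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
    __try {
        *value = __readpmc(counter);   // compiles to RDPMC
        return true;
    } __except (EXCEPTION_EXECUTE_HANDLER) {
        // From an ordinary user-mode process this is the expected path.
        *exception_code = GetExceptionCode();
        return false;
    }
#else
    (void)counter;   // read not attempted on this compiler/target
    return false;
#endif
}
```

Call it as try_read_pmc(0, &value, &code) and print code in hex when it returns false; comparing that against EXCEPTION_PRIV_INSTRUCTION from windows.h tells you whether you hit the case described above.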

If you're writing a user-mode application, your only option is (as @Christopher mentioned in the comments) to write a kernel module that executes this instruction for you (you'll incur a user-to-kernel call penalty) and to enable test signing on your machine so that your presumably self-signed "driver" can be loaded. This means you can't easily distribute the app, but it will work for in-house tuning.

answered Oct 18 '22 19:10 by Rom


What about this HCP Reference? Does it not provide what you want?

answered Oct 18 '22 18:10 by wilx