Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What would make PerformanceCounterCategory.Exists hang indefinitely?

I've got an application that uses performance counters, that has worked for months. Now, on my dev machine and another developers machine, it has started hanging when I call PerformanceCounterCategory.Exists. As far as I can tell, it hangs indefinitely. It does not matter which category I use as input, and other applications using the API exhibits the same behaviour.

Debugging (using MS Symbol Servers) has shown that it is a call to Microsoft.Win32.RegistryKey that hangs. Further investigation shows that it is this line that hangs:

while (Win32Native.ERROR_MORE_DATA == (r = Win32Native.RegQueryValueEx(hkey, name, null, ref type, blob, ref sizeInput))) { 

This is basically a loop that tries to allocate enough memory for the performance counter data. It starts at size = 65000 and does a few iterations. In the 4th call, when size = 520000, Win32Native.RegQueryValueEx hangs.

Furthermore, rather worringly, I found this comment in the reference source for PerformanceCounterLib.GetData:

    // Win32 RegQueryValueEx for perf data could deadlock (for a Mutex) up to 2mins in some 
    // scenarios before they detect it and exit gracefully. In the mean time, ERROR_BUSY,
    // ERROR_NOT_READY etc can be seen by other concurrent calls (which is the reason for the 
    // wait loop and switch case below). We want to wait most certainly more than a 2min window. 
    // The curent wait time of up to 10mins takes care of the known stress deadlock issues. In most
    // cases we wouldn't wait for more than 2mins anyways but in worst cases how much ever time 
    // we wait may not be sufficient if the Win32 code keeps running into this deadlock again
    // and again. A condition very rare but possible in theory. We would get back to the user
    // in this case with InvalidOperationException after the wait time expires.

Has anyone seen this behaviour before ? What can I do to resolve this ?

like image 862
driis Avatar asked Nov 17 '10 21:11

driis


1 Answers

This issue is now fixed, and since there has been no answers here, I will add an answer here in case the question is found in future searches.

I ultimately fixed this error by stopping the print spooler service (as a temporary measure).

It looks like the reading of Performance counters actually needs to enumerate the printers on the system (confirmed by a WinDbg dump of a hanging process, where I can see in the stack trace that winspool is enumerating printers, and is stuck in a network call). This was what was actually failing on the system (and sure enough, opening the "Devices and printers" window also hung). It baffles me that a printer/network issue can actually make the performance counters go down. One would think that there was some sort of fail-safe built in for such a case.

What I am guessing, is that this is cause by a bad printer/driver on the network. I haven't re-enabled printing on the affected systems yet, since we are hunting for the bad printer.

like image 172
driis Avatar answered Oct 08 '22 21:10

driis