Random Complete System Unresponsiveness Running Mathematical Functions

I have a program that loads a file (anywhere from 10MB to 5GB) a chunk at a time (ReadFile), and for each chunk performs a set of mathematical operations (basically calculates the hash).

After calculating the hash, it stores info about the chunk in an STL map (basically <chunkID, hash>) and then writes the chunk itself to another file (WriteFile).

That's all it does. This program will cause certain PCs to choke and die: the mouse begins to stutter, Task Manager takes more than 2 minutes to appear, Ctrl+Alt+Del is unresponsive, running programs are slow... the works.

I've done literally everything I can think of to optimize the program, and have triple-checked all objects.

What I've done:

  • Tried different (less intensive) hashing algorithms.
  • Switched all allocations to nedmalloc instead of the default new operator.
  • Switched from std::map to unordered_set, found the performance to still be abysmal, so I switched again to Google's dense_hash_map.
  • Converted all objects to store pointers to objects instead of the objects themselves.
  • Caching all Read and Write operations. Instead of reading a 16k chunk of the file and performing the math on it, I read 4MB into a buffer and read 16k chunks from there instead. Same for all write operations - they are coalesced into 4MB blocks before being written to disk.
  • Run extensive profiling with Visual Studio 2010, AMD Code Analyst, and perfmon.
  • Set the thread priority to THREAD_MODE_BACKGROUND_BEGIN
  • Set the thread priority to THREAD_PRIORITY_IDLE
  • Added a Sleep(100) call after every loop.
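
For reference, the read/write coalescing described in the list above looks roughly like the sketch below. This is a simplified, portable illustration and not the actual program: it uses standard streams instead of ReadFile/WriteFile, std::unordered_map instead of dense_hash_map, and a placeholder FNV-1a hash. The names fnv1a, process_stream and process_buffer are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Placeholder hash (64-bit FNV-1a); stands in for whatever hash the real program uses.
std::uint64_t fnv1a(const char* data, std::size_t len) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
    for (std::size_t i = 0; i < len; ++i) {
        h ^= static_cast<unsigned char>(data[i]);
        h *= 1099511628211ULL;                  // FNV prime
    }
    return h;
}

// Reads `in` one large block at a time, hashes fixed-size chunks out of that
// block, records <chunkID, hash>, then writes the whole block with one call.
std::unordered_map<std::uint64_t, std::uint64_t>
process_stream(std::istream& in, std::ostream& out,
               std::size_t chunk_size = 16 * 1024,
               std::size_t block_size = 4 * 1024 * 1024) {
    assert(chunk_size > 0 && block_size % chunk_size == 0);
    std::unordered_map<std::uint64_t, std::uint64_t> index;
    std::vector<char> block(block_size);
    std::uint64_t chunk_id = 0;
    while (in) {
        in.read(block.data(), static_cast<std::streamsize>(block.size()));
        const std::size_t got = static_cast<std::size_t>(in.gcount());
        if (got == 0) break;  // nothing left to process
        for (std::size_t off = 0; off < got; off += chunk_size) {
            const std::size_t len = std::min(chunk_size, got - off);
            index[chunk_id++] = fnv1a(block.data() + off, len);
        }
        out.write(block.data(), static_cast<std::streamsize>(got));
    }
    return index;
}

// In-memory convenience wrapper so the loop can be exercised without touching disk.
std::unordered_map<std::uint64_t, std::uint64_t>
process_buffer(const std::string& input, std::string& output,
               std::size_t chunk_size, std::size_t block_size) {
    std::istringstream in(input);
    std::ostringstream out;
    auto index = process_stream(in, out, chunk_size, block_size);
    output = out.str();
    return index;
}
```

Passing file streams opened in binary mode to process_stream gives the on-disk variant; process_buffer is only there to check the chunking logic in isolation.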

Even after all this, the application still results in a system-wide hang on certain machines under certain circumstances.

Perfmon and Process Explorer show minimal CPU usage (with the sleep), no constant reads/writes from disk, few hard pagefaults (and only ~30k pagefaults in the lifetime of the application on a 5GB input file), little virtual memory (never more than 150MB), no leaked handles, no memory leaks.

The machines I've tested it on run Windows XP - Windows 7, x86 and x64 versions included. None have less than 2GB RAM, though the problem is always exacerbated under lower memory conditions.

I'm at a loss as to what to do next. I don't know what's causing it - I'm torn between CPU or memory as the culprit. CPU because without the sleep and under different thread priorities the system performance changes noticeably. Memory because there's a huge difference in how often the issue occurs when using unordered_set vs Google's dense_hash_map.

What's really weird? The NT kernel design is supposed to prevent this sort of behavior from ever occurring (a user-mode application driving the system to such extremely poor performance!?), yet when I compile the same code and run it on OS X or Linux (it's fairly standard C++ throughout), it performs excellently even on poor machines with little RAM and weaker CPUs.

What am I supposed to do next? How do I know what the hell it is that Windows is doing behind the scenes that's killing system performance, when all the indicators are that the application itself isn't doing anything extreme?

Any advice would be most welcome.

Mahmoud Al-Qudsi asked Feb 23 '10 16:02


2 Answers

I know you said you had monitored memory usage and that it seems minimal here, but the symptoms sound very much like the OS thrashing like crazy, which would definitely cause general loss of OS responsiveness like you're seeing.

When you run the application on a file say 1/4 to 1/2 the size of available physical memory, does it seem to work better?

What I suspect may be happening is that Windows is "helpfully" caching your disk reads into memory and not giving up that cache memory to your application for use, forcing it to go to swap. Thus, even though swap use is minimal (150MB), it's going in and out constantly as you calculate the hash. This then brings the system to its knees.

Mark B answered Oct 19 '22 23:10


Some things to check:

  • Antivirus software. These often scan files as they're opened to check for viruses. Is your delay occurring before any data is read by the application?
  • General system performance. Does copying the file using Explorer also show this problem?
  • Your code. Break it down into the various stages. Write a program that just reads the file, then one that reads and writes the files, then one that just hashes random blocks of RAM (i.e. remove the disk IO part) and see if any particular step is problematic. If you can get a profiler then use this as well to see if there are any slow spots in your code.
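
As a rough illustration of the "hash random blocks of RAM" stage from the last bullet, something like the sketch below would time the hashing alone with no disk IO involved. It is only a sketch: the hash is a placeholder FNV-1a rather than your real one, and time_stage_ms, make_random_block, hash_block and hash_only_stage are made-up names.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Runs `stage` once and returns the elapsed wall-clock time in milliseconds.
template <typename Fn>
double time_stage_ms(Fn&& stage) {
    const auto start = std::chrono::steady_clock::now();
    stage();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Fills a buffer with pseudo-random bytes; a fixed seed keeps runs repeatable.
std::vector<unsigned char> make_random_block(std::size_t size, std::uint32_t seed = 42) {
    std::mt19937 rng(seed);
    std::vector<unsigned char> block(size);
    for (auto& b : block) b = static_cast<unsigned char>(rng() & 0xFF);
    return block;
}

// Placeholder hash (64-bit FNV-1a); substitute the real hash under test here.
std::uint64_t hash_block(const std::vector<unsigned char>& block) {
    std::uint64_t h = 14695981039346656037ULL;
    for (unsigned char b : block) {
        h ^= b;
        h *= 1099511628211ULL;
    }
    return h;
}

// Hashes `rounds` in-memory blocks and XORs the results together so the
// compiler cannot discard the work. No file is opened at any point.
std::uint64_t hash_only_stage(std::size_t block_size, std::size_t rounds) {
    std::uint64_t acc = 0;
    for (std::size_t r = 0; r < rounds; ++r) {
        acc ^= hash_block(make_random_block(block_size, static_cast<std::uint32_t>(r)));
    }
    return acc;
}
```

Calling time_stage_ms([] { hash_only_stage(16 * 1024, 100000); }) and comparing the result with a read-only and a read-plus-write stage timed the same way should show which stage coincides with the system becoming unresponsive.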

EDIT

More ideas. Perhaps your program is holding on to the GDI lock too much. This would explain everything else being slow without high CPU usage. Only one app at a time can have the GDI lock. Is this a GUI app, or just a simple console app?

You also mentioned RtlEnterCriticalSection. This is a costly operation, and misusing it can hang the system quite easily, e.g. through mismatched Enters and Leaves. Are you multi-threading at all? Is the slowdown due to race conditions between threads?
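
If mismatched Enters and Leaves turn out to be a possibility, one common way to rule them out is to tie the lock to a scope so it is released on every exit path. The sketch below shows the idea with the standard std::mutex and std::lock_guard standing in for a Windows critical section (that substitution is an assumption on my part, as are the names ChunkIndex and record_if_new).

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <unordered_map>

// Hypothetical <chunkID, hash> table guarded by a scoped lock.
class ChunkIndex {
public:
    // Records the hash unless the chunk is already present.
    // The guard unlocks on both return paths, so no "Leave" can be missed.
    bool record_if_new(std::uint64_t chunk_id, std::uint64_t hash) {
        std::lock_guard<std::mutex> guard(mutex_);
        if (hashes_.count(chunk_id) != 0) {
            return false;  // early return: the lock is still released here
        }
        hashes_[chunk_id] = hash;
        return true;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return hashes_.size();
    }

private:
    mutable std::mutex mutex_;  // mutable so size() can lock in a const method
    std::unordered_map<std::uint64_t, std::uint64_t> hashes_;
};
```

With a manual Enter/Leave pair, the early return above is exactly the kind of path where the Leave gets forgotten; with a scoped guard the second call simply returns false and later calls still acquire the lock normally.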

Skizz answered Oct 20 '22 00:10