Highly-parallel F# program shows poor CPU utilization

One of the promises of pure functional programming is that it parallelizes well. I'm testing this claim with an F# application, and the results are mediocre. My program runs a large number of MiniMax searches in parallel via Array.Parallel. The MiniMax algorithm is pure functional code: no shared state and no locks, but it is highly recursive, with lots of values being created and destroyed as it searches the tree. There is no I/O at all; everything is in memory. Each MiniMax search takes 5-60 seconds, and I'm running about 100 of them in parallel on a fast box with 8 CPU cores. Sadly, CPU utilization peaks at about 65% and is usually in the 45-60% range.
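
A minimal sketch of the kind of workload described above, for illustration only. The `Tree` type, the `minimax` function and the sample `positions` are hypothetical stand-ins, not the actual program; the only piece taken from the description is the use of `Array.Parallel` to run independent searches.

```fsharp
// Hypothetical game tree: a leaf carries a static score,
// a node carries the positions reachable in one move.
type Tree =
    | Leaf of int
    | Node of Tree list

// Pure, recursive minimax with no shared state and no locks.
// Assumes every Node has at least one child.
let rec minimax (maximizing: bool) (tree: Tree) : int =
    match tree with
    | Leaf score -> score
    | Node children ->
        // A fresh list of scores is allocated at every node.
        let scores = children |> List.map (minimax (not maximizing))
        if maximizing then List.max scores else List.min scores

// Hypothetical root positions, one independent search each.
let positions =
    [| Node [ Node [ Leaf 3; Leaf 5 ]; Node [ Leaf 2; Leaf 9 ] ]
       Node [ Leaf 1; Node [ Leaf 4; Leaf 7 ] ] |]

// Run the searches in parallel across the available cores.
let results = positions |> Array.Parallel.map (minimax true)
```

Even in this toy version, each recursive call builds a short-lived list, which is the sort of allocation pattern that puts pressure on the garbage collector when many searches run at once.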

I profiled my app using the Visual Studio Concurrency Visualizer and found that it is blocked about 40% of the time. All of the blocking calls seem to be in the .NET garbage collector or other .NET memory management routines. Is there some way to optimize this behavior without rewriting the entire program in a lower-level language such as C++? It seems clear that the problem is that I'm creating and destroying too many objects, but this is hard to avoid in idiomatic F# code. Perhaps I'm missing some other cause of the synchronization issues?

Thanks.

Update: I made two changes: I disabled hyperthreading and enabled gcServer in my config file. This dropped the execution time of my test case from 32 to 13 seconds! CPU utilization is also much higher. Thanks to everyone who made suggestions.
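
For reference, the gcServer setting mentioned in the update is normally switched on with a fragment like the one below. This is a sketch that assumes a .NET Framework application with an app.config file; the update does not show the actual config file, so its other contents are unknown.

```xml
<configuration>
  <runtime>
    <!-- Use the server garbage collector instead of the workstation one. -->
    <gcServer enabled="true"/>
  </runtime>
</configuration>
```

Whether the setting took effect can be checked at runtime by reading `System.Runtime.GCSettings.IsServerGC`.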