Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I most effectively take advantage of multiple cores for short computations in .NET?

Here is the context: I am writing an interpreter in C# for a small programming language called Heron, and it has some primitive list operations which can be executed in parallel.

One of the biggest challenges I am facing is to distribute the work done by the evaluator across the different cores effectively whenever a parallelizable operation is encountered. This can be a short or long operation, it is hard to determine in advance.

One thing that I don't have to worry about is synchronizing data: the parallel operations are explicitly not allowed to modify data.

So the primary questions I have is:

  • What is the most effective way to distribute the work across threads, so that I can guarantee that the computer will distribute the work across two cores?

I am also interested in a related question:

  • Roughly how long should an operation take before we can start overcoming the overhead of separating the work onto another thread?
like image 285
cdiggins Avatar asked Feb 07 '10 04:02

cdiggins


3 Answers

If you want to do a lot with parallel operations, you're going to want to start with .Net 4.0. Here's the Parallel Programming for .Net documentation. You'll want to start here though. .Net 4.0 adds a LOT in terms of multi-core utilization. Here's a quick example:

Current 3.5 Serial method:

for(int i = 0; i < 30000; i++)
{
  doSomething(i);
}

New .Net 4.0 Parallel method:

Parallel.For(0, 30000, (i) => doSomething(i));

The Parallel.For method automatically scales across the number of cores available, you can see how fast you could start taking advantage of this. There are dozens of new libraries in the framework, supporting full thread/task management like your example as well (Including all the piping for syncing, cancellation, etc).

There are libraries for Parallel LINQ (PLINQ), Task Factories, Task Schedulers and a few others. In short, for the specific task you laid out .Net 4.0 has huge benefits for you, and I'd go ahead and grab the free beta 2 (RC coming soon) and get started. (No, I don't work for Microsoft...but rarely do I see an upcoming release fulfill a need so perfectly, so I highly recommend .Net 4.0 for you)

like image 91
Nick Craver Avatar answered Oct 18 '22 05:10

Nick Craver


Because I didn't want to develop using VS 2010, and I found that ThreadPool didn't have optimal performance for distributing work across cores (I think because it started/stopped too many threads) I ended up rolling my own. Hope that others find this useful:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading;

namespace HeronEngine
{
    /// <summary>
    /// Represents a work item.
    /// </summary>
    public delegate void Task();

    /// <summary>
    /// This class is intended to efficiently distribute work 
    /// across the number of cores. 
    /// </summary>
    public static class Parallelizer 
    {
        /// <summary>
        /// List of tasks that haven't been yet acquired by a thread 
        /// </summary>
        static List<Task> allTasks = new List<Task>();

        /// <summary>
        /// List of threads. Should be one per core. 
        /// </summary>
        static List<Thread> threads = new List<Thread>();

        /// <summary>
        /// When set signals that there is more work to be done
        /// </summary>
        static ManualResetEvent signal = new ManualResetEvent(false);

        /// <summary>
        /// Used to tell threads to stop working.
        /// </summary>
        static bool shuttingDown = false;

        /// <summary>
        /// Creates a number of high-priority threads for performing 
        /// work. The hope is that the OS will assign each thread to 
        /// a separate core.
        /// </summary>
        /// <param name="cores"></param>
        public static void Initialize(int cores)
        {
            for (int i = 0; i < cores; ++i)
            {
                Thread t = new Thread(ThreadMain);
                // This system is not designed to play well with others
                t.Priority = ThreadPriority.Highest;
                threads.Add(t);
                t.Start();
            }
        }

        /// <summary>
        /// Indicates to all threads that there is work
        /// to be done.
        /// </summary>
        public static void ReleaseThreads()
        {
            signal.Set();
        }

        /// <summary>
        /// Used to indicate that there is no more work 
        /// to be done, by unsetting the signal. Note: 
        /// will not work if shutting down.
        /// </summary>
        public static void BlockThreads()
        {
            if (!shuttingDown)
                signal.Reset();
        }

        /// <summary>
        /// Returns any tasks queued up to perform, 
        /// or NULL if there is no work. It will reset
        /// the global signal effectively blocking all threads
        /// if there is no more work to be done.
        /// </summary>
        /// <returns></returns>
        public static Task GetTask()
        {
            lock (allTasks)
            {
                if (allTasks.Count == 0)
                {
                    BlockThreads();
                    return null;
                }
                Task t = allTasks.Peek();
                allTasks.Pop();
                return t;
            }
        }

        /// <summary>
        /// Primary function for each thread
        /// </summary>
        public static void ThreadMain()
        {
            while (!shuttingDown)
            {
                // Wait until work is available
                signal.WaitOne();

                // Get an available task
                Task task = GetTask();

                // Note a task might still be null becaue
                // another thread might have gotten to it first
                while (task != null)
                {
                    // Do the work
                    task();

                    // Get the next task
                    task = GetTask();
                }
            }
        }

        /// <summary>
        /// Distributes work across a number of threads equivalent to the number 
        /// of cores. All tasks will be run on the available cores. 
        /// </summary>
        /// <param name="localTasks"></param>
        public static void DistributeWork(List<Task> localTasks)
        {
            // Create a list of handles indicating what the main thread should wait for
            WaitHandle[] handles = new WaitHandle[localTasks.Count];

            lock (allTasks)
            {
                // Iterate over the list of localTasks, creating a new task that 
                // will signal when it is done.
                for (int i = 0; i < localTasks.Count; ++i)
                {
                    Task t = localTasks[i];

                    // Create an event used to signal that the task is complete
                    ManualResetEvent e = new ManualResetEvent(false);

                    // Create a new signaling task and add it to the list
                    Task signalingTask = () => { t(); e.Set(); };
                    allTasks.Add(signalingTask);

                    // Set the corresponding wait handler 
                    handles[i] = e;
                }
            }

            // Signal to waiting threads that there is work
            ReleaseThreads();

            // Wait until all of the designated work items are completed.
            Semaphore.WaitAll(handles);
        }

        /// <summary>
        /// Indicate to the system that the threads should terminate
        /// and unblock them.
        /// </summary>
        public static void CleanUp()
        {
            shuttingDown = true;
            ReleaseThreads();
        }
    }    
}
like image 45
cdiggins Avatar answered Oct 18 '22 06:10

cdiggins


I would go with thread pool even though it has its problems, MS is investing in improving it and it seems like .NET 4 will have an improved one. At this point, I think that the best thing would be to use the thread pool wrapped in your own object and wait with deciding about your own implementation

like image 2
Mitzi Avatar answered Oct 18 '22 05:10

Mitzi