C# Parallel vs Sequential

I have a large list to loop through (1,500,000 items), and for each item I have to do a very small check. In total this takes 30 seconds.

CPU utilization with the sequential loop is around 10%, so a lot of resources go unused.

My first thought was to use Parallel, but because each item takes so little time, the Parallel version takes longer than a sequential foreach. The reason is given in "Why was the parallel version slower than the sequential version in this example?", which explains that creating each task costs time.

So my next thought was to divide the list into 4 (or more) equal pieces and create a thread for each piece to loop through its items, which should be faster.

Before I create my own class: is this a good approach? Or is there a better way to speed this up?

Code

The code I created for another parallel approach: (used in my own static class)

public static void ForEach<T>(IEnumerable<T> list, Action<T> body, int listDevide)
{
    // Number of items
    int items = list.Count();
    // Divided (in int, so floored)
    int listPart = items / listDevide;
    // Get numbers extra for last run
    int rest = items % listDevide;

    // List to save the actions
    var actions = new List<Action>();
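    for(var x = 0; x < listDevide; x++)
    {
        // Copy the loop variable: the delegate runs later, and capturing x
        // directly would make every action see its final value.
        int part = x;
        // Create the actions
        actions.Add(delegate {
            foreach(var item in list.Skip(part * listPart).Take(listPart))
            {
                body.Invoke(item);
            }
        });
    }

    // Run the actions in parallel
    Parallel.Invoke(actions.ToArray());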
}

Remark: the "rest" variable, intended for the leftover items, is not used in this example, so the final items % listDevide items are not processed.
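One way the leftover items could be covered is to spread them over the first few chunks. The following is only a minimal sketch of that arithmetic; the ChunkMath class and its Ranges method are hypothetical names, not part of the code above.

```csharp
using System;
using System.Collections.Generic;

static class ChunkMath
{
    // Hypothetical helper: splits [0, items) into `parts` contiguous ranges.
    // The first (items % parts) ranges get one extra element, so nothing is left over.
    public static List<(int Start, int End)> Ranges(int items, int parts)
    {
        if (parts <= 0) throw new ArgumentOutOfRangeException(nameof(parts));

        int size = items / parts;   // floored chunk size
        int rest = items % parts;   // leftover items, spread over the first chunks

        var ranges = new List<(int Start, int End)>(parts);
        int start = 0;
        for (int i = 0; i < parts; i++)
        {
            int length = size + (i < rest ? 1 : 0);
            ranges.Add((start, start + length)); // End is exclusive
            start += length;
        }
        return ranges;
    }
}
```

For example, 10 items in 4 parts gives the ranges [0,3), [3,6), [6,8) and [8,10). Each action would then loop over its own Start..End range instead of using Skip and Take.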

Solution below, more info: http://msdn.microsoft.com/en-us/library/dd997411.aspx

user2331234 asked Jul 04 '13 11:07


1 Answer

Yes, partitioning the input array is a good approach.

In fact, Microsoft provides a Partitioner class to help with exactly this approach.

Here's an example showing how to do it:

using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

namespace Demo
{
    class Program
    {
        private void run()
        {
            double sum = 0;
            Func<double, double> func = x => Math.Sqrt(Math.Sin(x));
            object locker = new object();

            double[] data = testData();

            // For each double in data[] we are going to calculate Math.Sqrt(Math.Sin(x)) and
            // add all the results together.
            //
            // To do this, we use class Partitioner to split the input array into just a few index ranges,
            // (the Partitioner will use knowledge about the number of processor cores to optimize this)
            // and then add up the values of each range on a worker thread.
            //
            // We use threadLocalState to accumulate the total for the ranges a worker processes, and then
            // we have to add all these together to get the final sum. We must lock the addition because
            // it isn't threadsafe, and several threads could be doing it at the same time.

            Parallel.ForEach
            (
                Partitioner.Create(0, data.Length),

                () => 0.0,

                (subRange, loopState, threadLocalState) =>
                {
                    for (int i = subRange.Item1; i < subRange.Item2; i++)
                    {
                        threadLocalState += func(data[i]);
                    }

                    return threadLocalState;
                },

                finalThreadLocalState =>
                {
                    lock (locker)
                    {
                        sum += finalThreadLocalState;
                    }
                }
            );

            Console.WriteLine("Sum = " + sum);
        }

        private static double[] testData()
        {
            double[] array = new double[1000003]; // Test with an odd number of values.

            Random rng = new Random(12345);

            for (int i = 0; i < array.Length; ++i)
                array[i] = rng.Next() & 3; // Don't want large values for this simple test.

            return array;
        }

        static void Main()
        {
            new Program().run();
        }
    }
}
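The same range partitioning might be applied to the question's ForEach<T> helper roughly as follows. This is a minimal sketch, assuming the items are available as an indexable IList<T>; ParallelHelper and ForEachRange are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

static class ParallelHelper
{
    // Hypothetical helper: runs `body` on every element, handing each worker
    // a contiguous index range rather than scheduling items one at a time.
    public static void ForEachRange<T>(IList<T> list, Action<T> body)
    {
        // Partitioner.Create(from, to) rejects an empty range, so bail out early.
        if (list.Count == 0) return;

        Parallel.ForEach(Partitioner.Create(0, list.Count), (range, loopState) =>
        {
            // range.Item1 is inclusive, range.Item2 is exclusive.
            for (int i = range.Item1; i < range.Item2; i++)
            {
                body(list[i]);
            }
        });
    }
}
```

Because body is called from several threads at once, it has to be thread-safe; any shared result it updates needs a lock or an Interlocked operation, as in the sum example above.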
Matthew Watson answered Oct 03 '22 16:10
