
C# Asynchronous Options for Processing a List

I am trying to better understand the Async and the Parallel options I have in C#. In the snippets below, I have included the 5 approaches I come across most. But I am not sure which to choose - or better yet, what criteria to consider when choosing:

Method 1: Task

(see http://msdn.microsoft.com/en-us/library/dd321439.aspx)

Calling StartNew is functionally equivalent to creating a Task using one of its constructors and then calling Start to schedule it for execution. However, unless creation and scheduling must be separated, StartNew is the recommended approach for both simplicity and performance.

TaskFactory's StartNew method should be the preferred mechanism for creating and scheduling computational tasks, but for scenarios where creation and scheduling must be separated, the constructors may be used, and the task's Start method may then be used to schedule the task for execution at a later time.

    // using System.Threading.Tasks.Task.Factory
    void Do_1()
    {
        var _List = GetList();
        // StartNew(Action) takes a parameterless lambda; the loop variable i is captured by the closure
        _List.ForEach(i => Task.Factory.StartNew(() => DoSomething(i)));
    }

Method 2: QueueUserWorkItem

(see http://msdn.microsoft.com/en-us/library/system.threading.threadpool.getmaxthreads.aspx)

You can queue as many thread pool requests as system memory allows. If there are more requests than thread pool threads, the additional requests remain queued until thread pool threads become available.

You can place data required by the queued method in the instance fields of the class in which the method is defined, or you can use the QueueUserWorkItem(WaitCallback, Object) overload that accepts an object containing the necessary data.

    // using System.Threading.ThreadPool
    void Do_2()
    {
        var _List = GetList();
        var _Action = new WaitCallback((o) => { DoSomething(o); });
        // pass x as the state object; otherwise the callback receives null
        _List.ForEach(x => ThreadPool.QueueUserWorkItem(_Action, x));
    }

Method 3: Parallel.ForEach

(see: http://msdn.microsoft.com/en-us/library/system.threading.tasks.parallel.foreach.aspx)

The Parallel class provides library-based data parallel replacements for common operations such as for loops, for each loops, and execution of a set of statements.

The body delegate is invoked once for each element in the source enumerable. It is provided with the current element as a parameter.

    // using System.Threading.Tasks.Parallel
    void Do_3()
    {
        var _List = GetList();
        var _Action = new Action<object>((o) => { DoSomething(o); });
        Parallel.ForEach(_List, _Action);
    }

Method 4: IAsync.BeginInvoke

(see: http://msdn.microsoft.com/en-us/library/cc190824.aspx)

BeginInvoke is asynchronous; therefore, control returns immediately to the calling object after it is called.

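    // using the delegate's BeginInvoke (IAsyncResult pattern)
    // note: delegate BeginInvoke throws PlatformNotSupportedException on .NET Core and later
    void Do_4()
    {
        var _List = GetList();
        var _Action = new Action<object>((o) => { DoSomething(o); });
        _List.ForEach(x => _Action.BeginInvoke(x, null, null));
    }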

Method 5: BackgroundWorker

(see: http://msdn.microsoft.com/en-us/library/system.componentmodel.backgroundworker.aspx)

To set up for a background operation, add an event handler for the DoWork event. Call your time-consuming operation in this event handler. To start the operation, call RunWorkerAsync. To receive notifications of progress updates, handle the ProgressChanged event. To receive a notification when the operation is completed, handle the RunWorkerCompleted event.

    // using System.ComponentModel.BackgroundWorker
    void Do_5()
    {
        var _List = GetList();
        using (BackgroundWorker _Worker = new BackgroundWorker())
        {
            _Worker.DoWork += (s, arg) =>
            {
                arg.Result = arg.Argument;
                DoSomething(arg.Argument);
            };
            _Worker.RunWorkerCompleted += (s, arg) =>
            {
                _List.Remove(arg.Result);
                if (_List.Any())
                    _Worker.RunWorkerAsync(_List[0]);
            };
            if (_List.Any())
                _Worker.RunWorkerAsync(_List[0]);
        }
    }

I suppose the obvious criteria would be:

  1. Is any better than the other for performance?
  2. Is any better than the other for error handling?
  3. Is any better than the other for monitoring/feedback?

But, how do you choose? Thanks in advance for your insights.

Jerry Nixon asked Sep 06 '11 16:09


1 Answer

Going to take these in an arbitrary order:

BackgroundWorker (#5)
I like to use BackgroundWorker when I'm working with a UI. Its advantage is that the progress and completion events fire on the UI thread, so you don't get nasty exceptions when you try to change UI elements. It also has a nice built-in way of reporting progress. One disadvantage is that if your work contains blocking calls (like web requests), a thread sits around doing nothing while the call is pending. This is probably not a problem if you expect to have only a handful of them.
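As a rough sketch of the progress and completion events described above: the class name BackgroundWorkerSketch, the method Start, and the callback parameters are hypothetical names made up for illustration. It runs the whole list on one background thread instead of restarting the worker per item. In a WinForms or WPF application the two event handlers would run on the UI thread; in a console application they run on thread pool threads.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;

public static class BackgroundWorkerSketch
{
    // Hypothetical helper: processes every item on a single background thread,
    // reports a percentage after each item, and hands any failure to onCompleted.
    public static BackgroundWorker Start(
        IList<string> items,
        Action<string> doSomething,
        Action<int> onProgress,
        Action<Exception> onCompleted)
    {
        var worker = new BackgroundWorker { WorkerReportsProgress = true };

        worker.DoWork += (s, e) =>
        {
            for (int i = 0; i < items.Count; i++)
            {
                doSomething(items[i]);
                worker.ReportProgress((i + 1) * 100 / items.Count);
            }
        };

        // Raised through the captured SynchronizationContext (the UI thread in a UI app).
        worker.ProgressChanged += (s, e) => onProgress(e.ProgressPercentage);

        // e.Error holds the exception thrown inside DoWork, or null on success.
        worker.RunWorkerCompleted += (s, e) => onCompleted(e.Error);

        worker.RunWorkerAsync();
        return worker;
    }
}
```

The caller gets control back right away and learns about the outcome only through the callbacks, which is the point of using it from a UI.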

IAsyncResult/Begin/End (APM, #4)
This is a widespread and powerful but difficult model to use. Error handling is troublesome since you need to re-catch exceptions on the End call, and uncaught exceptions won't necessarily make it back to any code that can handle them. This has the danger of permanently hanging requests in ASP.NET or just having errors mysteriously disappear in other applications. You also have to be vigilant about the CompletedSynchronously property. If you don't track and report this properly, the program can hang and leak resources. The flip side of this is that if you're running inside the context of another APM, you have to make sure that any async methods you call also report this value. That means doing another APM call or using a Task and casting it to an IAsyncResult to get at its CompletedSynchronously property.

There's also a lot of overhead in the signatures: You have to support an arbitrary object to pass through, make your own IAsyncResult implementation if you're writing an async method that supports polling and wait handles (even if you're only using the callback). By the way, you should only be using callback here. When you use the wait handle or poll IsCompleted, you're wasting a thread while the operation is pending.
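To make the "callback only, re-catch on End" advice concrete, here is a minimal sketch using Stream.BeginRead/EndRead. ApmSketch, ReadWithCallback, onRead and onError are hypothetical names used only for this illustration. It does not deal with CompletedSynchronously, and BeginRead itself can still throw synchronously (for example on a disposed stream).

```csharp
using System;
using System.IO;

public static class ApmSketch
{
    // Hypothetical helper: starts an APM read and handles the result in the callback,
    // so no thread waits on a handle or polls IsCompleted.
    public static void ReadWithCallback(
        Stream stream,
        byte[] buffer,
        Action<int> onRead,
        Action<Exception> onError)
    {
        stream.BeginRead(buffer, 0, buffer.Length, ar =>
        {
            int count;
            try
            {
                // Any exception from the operation is re-thrown here, on the End call.
                count = stream.EndRead(ar);
            }
            catch (Exception ex)
            {
                // Without this catch the failure would never reach the caller.
                onError(ex);
                return;
            }
            onRead(count);
        }, null);
    }
}
```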

Event-based Asynchronous Pattern (EAP)
One that was not on your list but I'll mention for the sake of completeness. It's a little bit friendlier than the APM. There are events instead of callbacks and there's less junk hanging onto the method signatures. Error handling is a little easier since it's saved and available in the callback rather than re-thrown. CompletedSynchronously is also not part of the API.

Tasks (#1)
Tasks are another friendly async API. Error handling is straightforward: the exception is always there for inspection on the callback and nobody cares about CompletedSynchronously. You can do dependencies and it's a great way to handle execution of multiple async tasks. You can even wrap APM or EAP (one type you missed) async methods in them, which makes it easy to mix the two. Another good thing about using tasks is that your code doesn't care how the operation is implemented: it may block on a thread or be totally asynchronous, and the consuming code looks the same either way.
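A small sketch of both points, assuming the work per item is a synchronous delegate. TaskSketch, ProcessAll and ReadAsTask are hypothetical names. The first method starts one task per item and then inspects each task for its exception; the second wraps an APM pair (Stream.BeginRead/EndRead) in a Task with FromAsync.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

public static class TaskSketch
{
    // Hypothetical helper: one task per item, returns the exceptions of the failed ones.
    public static List<Exception> ProcessAll(IEnumerable<string> items, Action<string> doSomething)
    {
        Task[] tasks = items
            .Select(item => Task.Factory.StartNew(() => doSomething(item)))
            .ToArray();

        try
        {
            Task.WaitAll(tasks); // throws AggregateException if any task faulted
        }
        catch (AggregateException)
        {
            // Ignored here; each task is inspected individually below.
        }

        return tasks
            .Where(t => t.IsFaulted)
            .SelectMany(t => t.Exception.InnerExceptions)
            .ToList();
    }

    // Hypothetical helper: wraps an APM Begin/End pair so callers only see a Task<int>.
    public static Task<int> ReadAsTask(Stream stream, byte[] buffer)
    {
        return Task<int>.Factory.FromAsync(
            stream.BeginRead, stream.EndRead, buffer, 0, buffer.Length, null);
    }
}
```

Task.WaitAll blocks the calling thread, which is fine for a console test but not something to do on a UI thread; there you would attach a continuation instead.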

Parallel.For methods (#3)
These are additional helpers on top of Tasks. They can do some of the work to create tasks for you and make your code more readable, if your async tasks are suited to run in a loop.
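For example, something along these lines (ParallelSketch, ProcessAll and maxDegree are hypothetical names for illustration). Two things worth knowing: Parallel.ForEach blocks the calling thread until the loop finishes, and exceptions thrown by the body come back wrapped in a single AggregateException. Once an iteration throws, the loop stops starting new iterations, so not every item is guaranteed to be processed.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ParallelSketch
{
    // Hypothetical helper: runs doSomething over the items in parallel and
    // returns whatever exceptions the loop body threw (empty if none).
    public static IList<Exception> ProcessAll(
        IEnumerable<string> items,
        Action<string> doSomething,
        int maxDegree)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegree };
        try
        {
            // Returns only after all started iterations have finished.
            Parallel.ForEach(items, options, doSomething);
            return new Exception[0];
        }
        catch (AggregateException ex)
        {
            return ex.Flatten().InnerExceptions;
        }
    }
}
```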

ThreadPool.QueueUserWorkItem (#2)
This is a low-level utility that's actually used by ASP.NET for all requests. It doesn't have any built-in error handling like tasks so you have to catch everything and pipe it back up to your app if you want to know about it. It's suitable for CPU-intensive work but you don't want to put any blocking calls on it, such as a synchronous web request. That's because as long as it runs, it's using up a thread.
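A sketch of what "catch everything and pipe it back up" can look like, assuming you also want to know when all items are done. ThreadPoolSketch and ProcessAll are hypothetical names; CountdownEvent and ConcurrentQueue are just one way to do the bookkeeping that Tasks would otherwise give you for free.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

public static class ThreadPoolSketch
{
    // Hypothetical helper: queues one work item per list entry, collects the
    // exceptions by hand, and waits until every work item has signalled.
    public static List<Exception> ProcessAll(IList<string> items, Action<string> doSomething)
    {
        var errors = new ConcurrentQueue<Exception>();
        using (var pending = new CountdownEvent(items.Count))
        {
            foreach (string item in items)
            {
                ThreadPool.QueueUserWorkItem(state =>
                {
                    try
                    {
                        doSomething((string)state);
                    }
                    catch (Exception ex)
                    {
                        // An unhandled exception on a pool thread would take down the process.
                        errors.Enqueue(ex);
                    }
                    finally
                    {
                        pending.Signal();
                    }
                }, item);
            }
            pending.Wait();
        }
        return new List<Exception>(errors);
    }
}
```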

async / await Keywords
New in .NET 4.5, these keywords let you write async code without explicit callbacks. You can await a Task, and the code after the await resumes only once that async operation has completed, without blocking a thread in the meantime.
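Applied to the list-processing case from the question, a minimal sketch might look like this. AsyncAwaitSketch and ProcessAllAsync are hypothetical names, and it assumes the per-item work is available as a Task-returning method. All operations are started first and then awaited one by one, so a failure in one item does not stop the others.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class AsyncAwaitSketch
{
    // Hypothetical helper: starts the async operation for every item,
    // then awaits each one and counts how many failed.
    public static async Task<int> ProcessAllAsync(
        IEnumerable<string> items,
        Func<string, Task> doSomethingAsync)
    {
        // ToList forces all operations to start before the first await.
        List<Task> tasks = items.Select(doSomethingAsync).ToList();

        int failed = 0;
        foreach (Task task in tasks)
        {
            try
            {
                await task; // re-throws the task's exception, if any
            }
            catch (Exception)
            {
                failed++;
            }
        }
        return failed;
    }
}
```

Error handling reads like ordinary synchronous code here: a plain try/catch around the await replaces the End call or the completion event of the older patterns.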

RandomEngy answered Sep 18 '22 08:09