
Running a CPU/memory-intensive task - what coding approach is the most performant?

Tags:

c#

As we all know, in software development we can be asked to do very ambitious things with technology.

Recently I was asked about the quickest possible way to convert 4000 documents from Word to PDF. The code/software to do the conversion is in place, and it runs on a dedicated server, so the hardware is also there (this is a recurring task). But from a C# performance perspective, what is the best way to do this?

I keep thinking along the lines of breaking this up into chunks (e.g. 40 documents each) and converting them in parallel tasks that run at the same time (i.e. 40 unique documents x 1000 parallel tasks). Is this the right idea, performance-wise? The simplest (and slowest) approach is a serial loop that goes through each document.

What would you recommend? There are no language constraints, so C# 4.0, LINQ, etc. are all available.

GurdeepS asked Oct 09 '22 04:10



1 Answer

1000 parallel tasks? You want to run 1,000 threads concurrently? You'll spend more time thread switching than doing actual work. If you have a quad-core machine, you should run four threads, each of which is converting a single document at a time.
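If you want to enforce the "one worker per core" idea explicitly rather than rely on the defaults, a minimal sketch could look like the following. It assumes .NET 4.0 or later; `ConvertDocument` here is a hypothetical stand-in that just sleeps, and the file names are made up for illustration, so swap in your real converter and list.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class CappedConversion
{
    // Hypothetical stand-in for the real Word-to-PDF conversion call.
    static void ConvertDocument(string path)
    {
        Thread.Sleep(10); // simulate some work
    }

    // Converts every document with at most one worker per logical core
    // and returns how many documents were processed.
    public static int ConvertAll(IEnumerable<string> documents)
    {
        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = Environment.ProcessorCount
        };

        int converted = 0;
        Parallel.ForEach(documents, options, doc =>
        {
            ConvertDocument(doc);
            Interlocked.Increment(ref converted); // thread-safe counter
        });
        return converted;
    }

    public static void Main()
    {
        // Made-up file names, purely for illustration.
        var documentsToConvert = new List<string>();
        for (int i = 0; i < 40; i++)
        {
            documentsToConvert.Add("doc" + i + ".docx");
        }

        int count = ConvertAll(documentsToConvert);
        Console.WriteLine("Converted {0} documents with at most {1} workers.",
            count, Environment.ProcessorCount);
    }
}
```

Whether the core count is really the best cap depends on how much of the conversion is CPU-bound versus waiting on disk, so treat it as a starting point and measure.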

Probably the best way to start is to use a simple Parallel.ForEach, and let the runtime library worry about scheduling the tasks. Something like:

// Requires: using System.Collections.Generic; using System.Threading.Tasks;
List<string> DocumentsToConvert = new List<string>();
// here, load the file names of all the documents you want to convert.
// Then, process them with:
Parallel.ForEach(DocumentsToConvert, (doc) => { ConvertDocument(doc); });

You could do the same type of thing with explicit TPL tasks:

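var tasks = new List<Task>();
foreach (var doc in DocumentsToConvert)
{
    var current = doc; // local copy: C# 4.0 shares the foreach variable across closures
    // Create and start a task to convert that document
    tasks.Add(Task.Factory.StartNew(() => ConvertDocument(current)));
}
Task.WaitAll(tasks.ToArray()); // block until every conversion has finished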

In either case, you let the runtime library figure out how many tasks to execute in parallel.
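If you're curious how many conversions the runtime actually runs at once, here is a rough sketch that records the peak. Again, `ConvertDocument` is a hypothetical stand-in that only sleeps, and the file names are invented. The number you see will vary by machine and by how the real work behaves, so no particular output is implied.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ObserveConcurrency
{
    static readonly object Gate = new object();
    static int running; // conversions in flight right now
    static int peak;    // highest value of 'running' seen so far

    // Hypothetical stand-in for the real conversion; tracks concurrency.
    static void ConvertDocument(string path)
    {
        lock (Gate)
        {
            running++;
            if (running > peak) peak = running;
        }

        Thread.Sleep(20); // simulate some work

        lock (Gate)
        {
            running--;
        }
    }

    // Runs the default Parallel.ForEach and returns the peak concurrency observed.
    public static int MeasurePeak(IEnumerable<string> documents)
    {
        lock (Gate)
        {
            running = 0;
            peak = 0;
        }
        Parallel.ForEach(documents, doc => ConvertDocument(doc));
        lock (Gate)
        {
            return peak;
        }
    }

    public static void Main()
    {
        // Made-up file names, purely for illustration.
        var documentsToConvert = new List<string>();
        for (int i = 0; i < 100; i++)
        {
            documentsToConvert.Add("doc" + i + ".docx");
        }

        Console.WriteLine("Peak concurrent conversions: {0}",
            MeasurePeak(documentsToConvert));
        Console.WriteLine("Logical cores: {0}", Environment.ProcessorCount);
    }
}
```

Because the stand-in blocks in `Thread.Sleep`, the thread pool may add extra workers and the peak can end up above the core count; a genuinely CPU-bound converter would likely behave differently.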

Jim Mischel answered Oct 13 '22 11:10
