Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

C# TPL Tasks - How many at one time

I'm learning how to use the TPL for parellizing an application I have. The application processes ZIP files, exctracting all of the files held within them and importing the contents into a database. There may be several thousand zip files waiting to be processed at a given time.

Am I right in kicking off a separate task for each of these ZIP files or is this an inefficient way to use the TPL?

Thanks.

like image 641
GrandMasterFlush Avatar asked Dec 15 '10 16:12

GrandMasterFlush


People also ask

What is the full name of C?

In the real sense it has no meaning or full form. It was developed by Dennis Ritchie and Ken Thompson at AT&T bell Lab. First, they used to call it as B language then later they made some improvement into it and renamed it as C and its superscript as C++ which was invented by Dr.

What do you mean by C?

C is a structured, procedural programming language that has been widely used both for operating systems and applications and that has had a wide following in the academic community. Many versions of UNIX-based operating systems are written in C.

Is C language easy?

C is a general-purpose language that most programmers learn before moving on to more complex languages. From Unix and Windows to Tic Tac Toe and Photoshop, several of the most commonly used applications today have been built on C. It is easy to learn because: A simple syntax with only 32 keywords.

What is C language used for?

C programming language is a machine-independent programming language that is mainly used to create many types of applications and operating systems such as Windows, and other complicated programs such as the Oracle database, Git, Python interpreter, and games and is considered a programming foundation in the process of ...


3 Answers

This seems like a problem better suited for worker threads (separate thread for each file) managed with the ThreadPool rather than the TPL. TPL is great when you can divide and conquer on a single item of data but your zip files are treated individually.

Disc I/O is going to be your bottle neck so I think that you will need to throttle the number of jobs running simultaneously. It's simple to manage this with worker threads but I'm not sure how much control you have (if nay) over the parallel for, foreach as far as how parallelism goes on at once, which could choke your process and actually slow it down.

like image 198
Paul Sasik Avatar answered Sep 28 '22 09:09

Paul Sasik


Anytime that you have a long running process, you can typically gain additional performance on multi-processor systems by making different threads for each input task. So I would say that you are most likely going down the right path.

like image 38
Mitchel Sellers Avatar answered Sep 28 '22 09:09

Mitchel Sellers


I would have thought that this would depend on if the process is limited by CPU or disk. If the process is limited by disk I'd thought that it might be a bad idea to kick off too many threads since the various extractions might just compete with each other.

This feels like something you might need to measure to get the correct answer for what's best.

like image 20
Hans Olsson Avatar answered Sep 28 '22 10:09

Hans Olsson