Thread/threadpool or backgroundworker

I would like to know what to use for tasks that need a lot of performance: BackgroundWorker, Thread or ThreadPool?

I've been working with Threads so far, but I need to improve the speed of my applications.

user2468035 asked Jun 09 '13 10:06



2 Answers

BackgroundWorker is essentially a thread pool thread with one addition: it can raise events on the UI thread. That is very useful for showing progress and for updating the UI with the result, so its typical use is to keep the UI from freezing while work needs to be done. Performance is not the first goal; running code asynchronously is. Later .NET versions extend this pattern with the Task<> class and the async/await keywords.
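To make the event model concrete, here is a minimal sketch of a BackgroundWorker run from a console program. The class name, the five-step job and the `onProgress` callback are made up for illustration. In a WinForms or WPF app the ProgressChanged and RunWorkerCompleted handlers would run on the UI thread; a console app has no UI synchronization context, so there they run on pool threads and progress callbacks may arrive out of order.

```csharp
using System;
using System.ComponentModel;
using System.Threading.Tasks;

static class BackgroundWorkerSketch
{
    // Runs a hypothetical 5-step job on a BackgroundWorker and returns its result.
    public static int RunJob(Action<int> onProgress)
    {
        var finished = new TaskCompletionSource<int>();
        var worker = new BackgroundWorker { WorkerReportsProgress = true };

        // DoWork runs on a thread pool thread.
        worker.DoWork += (sender, e) =>
        {
            int sum = 0;
            for (int step = 1; step <= 5; step++)
            {
                sum += step;                       // stand-in for real work
                worker.ReportProgress(step * 20);  // percentage complete
            }
            e.Result = sum;
        };

        // In a UI app these two handlers are raised on the UI thread.
        worker.ProgressChanged += (sender, e) => onProgress(e.ProgressPercentage);
        worker.RunWorkerCompleted += (sender, e) =>
            finished.SetResult(Convert.ToInt32(e.Result));

        worker.RunWorkerAsync();

        // Console-only: block until completion. A UI app would just return
        // to its message loop and react in RunWorkerCompleted instead.
        return finished.Task.Result;
    }

    static void Main()
    {
        int total = RunJob(p => Console.WriteLine($"progress: {p}%"));
        Console.WriteLine($"result: {total}");
    }
}
```

Note that nothing here makes the job faster; the worker only moves it off the calling thread.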

Thread pool threads are useful to avoid consuming resources. A thread is an expensive operating system object and you can create only a limited number of them. A thread takes 5 operating system handles and a megabyte of virtual memory address space, and there is no Dispose() method to release these handles early. The thread pool exists primarily to reuse threads and to ensure that not too many of them are active. It is important that you use a thread pool thread only when the work it does is limited, ideally taking no more than half a second and not blocking often. It is therefore best suited for short bursts of work, not for anything where performance matters. Handling I/O completion is an ideal task for a TP thread.
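As a rough illustration of "short bursts of work", the sketch below queues many brief work items to the shared pool with ThreadPool.QueueUserWorkItem and waits for them with a CountdownEvent. The class name and the trivial work item are assumptions for the example, not something from the question.

```csharp
using System;
using System.Threading;

static class ThreadPoolSketch
{
    // Queues `count` short work items to the shared pool and waits for all of them.
    public static int RunShortItems(int count)
    {
        int completed = 0;
        using (var pending = new CountdownEvent(count))
        {
            for (int i = 0; i < count; i++)
            {
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    // Stand-in for a brief, non-blocking piece of work.
                    Interlocked.Increment(ref completed);
                    pending.Signal();
                });
            }
            pending.Wait();   // the pool reuses a handful of threads for all items
        }
        return completed;
    }

    static void Main()
    {
        Console.WriteLine($"completed: {RunShortItems(100)}");
    }
}
```

The pool decides how many threads actually run these items, which is the point: the caller never creates or owns a thread.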

Yes, it is also possible to use threads to improve the performance of a program. You'd do so by using Thread or a Task<> created with TaskCreationOptions.LongRunning. There are some hard requirements for actually getting a performance improvement, and they are pretty stiff:

  • You need more than one thread. In an ideal case, two threads can halve the time needed to get a job done, and more threads reduce it further. Approaching that ideal is hard, however; it doesn't scale infinitely. Google "Amdahl's law" for info.
  • You need a machine with a processor that has multiple cores. Easy to get these days. The number of threads you create should not exceed the number of available cores. Using more will usually lower performance.
  • You need the kind of job that's compute-bound, where the execution engine of the processor is the constrained resource. That's fairly common but certainly no slam dunk. Many jobs are actually limited by I/O throughput, like reading from a file or running a database query, or by the rate at which the processor can read data from RAM. Such jobs don't benefit from threads: you'll have multiple execution engines available but you still have only one disk and one memory bus.
  • You need an algorithm that can distribute the work across multiple threads with hardly any need for synchronization. That's usually the tricky problem to solve; many algorithms are very sequential in nature and are not easily parallelizable.
  • You'll need time and patience to get the code stable and performing well. Writing threaded code is hard, and a threading race that crashes your program once a month, or occasionally produces an invalid result, can be a major time sink.
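
The requirements above can be sketched for a deliberately easy case: a compute-bound sum split into one contiguous range per worker, with each worker writing only to its own slot so no locking is needed. This is only an illustration under those assumptions. The names are made up, and it uses Task.Factory.StartNew with TaskCreationOptions.LongRunning, which hints to the scheduler that a dedicated thread is wanted rather than a pool thread.

```csharp
using System;
using System.Threading.Tasks;

static class PartitionedSumSketch
{
    // Sums the squares of 0..n-1, splitting the range across `workers` tasks.
    public static long SumOfSquares(int n, int workers)
    {
        var partial = new long[workers];          // one slot per worker: no locks
        var tasks = new Task[workers];
        int chunk = (n + workers - 1) / workers;  // ceiling division

        for (int w = 0; w < workers; w++)
        {
            int id = w;                           // per-iteration copy for the lambda
            int start = Math.Min(n, id * chunk);
            int end = Math.Min(n, start + chunk);

            tasks[id] = Task.Factory.StartNew(() =>
            {
                long local = 0;                   // accumulate locally, publish once
                for (int i = start; i < end; i++) local += (long)i * i;
                partial[id] = local;
            }, TaskCreationOptions.LongRunning);
        }

        Task.WaitAll(tasks);

        long total = 0;                           // serial merge step
        foreach (long p in partial) total += p;
        return total;
    }

    static void Main()
    {
        // No more workers than cores, as the list above recommends.
        int workers = Environment.ProcessorCount;
        Console.WriteLine(SumOfSquares(1_000_000, workers));
    }
}
```

Amdahl's law puts the ceiling on the speedup at 1 / ((1 - p) + p / n) for a job whose parallelizable fraction is p on n cores, so even here the task start-up and the final merge limit the gain. Measure before and after; for a loop this cheap per element, memory bandwidth may turn out to be the real limit.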

Hans Passant answered Oct 03 '22 17:10


Which framework you use to start CPU-intensive tasks on threads is irrelevant to your problem, unless your subtasks are overly fine-grained.

You need to split your work into subtasks that can be executed in parallel when you have more than one CPU to do so.
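
One way to keep subtasks from getting too fine-grained is to hand each worker a range of indices instead of a single element. The sketch below does that with Partitioner.Create and Parallel.ForEach; the even-counting job and the class name are placeholders chosen for the example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

static class CoarseGrainedSketch
{
    // Counts the even numbers in `data`, one subtask per index range.
    public static long CountEvens(int[] data)
    {
        // Partitioner.Create(from, to) rejects an empty range, so handle it here.
        if (data.Length == 0) return 0;

        long total = 0;

        // Each work item is a (fromInclusive, toExclusive) range, not one element.
        Parallel.ForEach(Partitioner.Create(0, data.Length), range =>
        {
            long local = 0;
            for (int i = range.Item1; i < range.Item2; i++)
            {
                if (data[i] % 2 == 0) local++;
            }
            Interlocked.Add(ref total, local);   // one synchronized update per range
        });

        return total;
    }

    static void Main()
    {
        var data = new int[1_000_000];
        for (int i = 0; i < data.Length; i++) data[i] = i;
        Console.WriteLine(CountEvens(data));
    }
}
```

With one delegate call and one interlocked update per range, scheduling overhead stays small relative to the work in each subtask.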

Will answered Oct 03 '22 16:10