I am learning threading in C#, but I can't understand which aspects of threads actually improve performance.
Consider a scenario where only a single-core processor exists. Splitting your task into multiple threads uses the same process context (shared resources), and the threads run concurrently by sharing time on the CPU. Since the threads are just sharing time, how can their run time (turnaround time) be less than that of a single-threaded process?
On a single-core CPU, a single-threaded process is usually faster than any multithreaded version of the same work. Threads do not magically make your CPU go any faster; they just add scheduling and context-switching overhead.
With hyper-threading enabled, a core's execution units can process instructions from two threads simultaneously, so fewer execution units sit idle during each clock cycle. As a result, enabling hyper-threading can noticeably improve throughput for some workloads.
A thread is also called a lightweight process. Threads provide a way to improve application performance through parallelism. They represent a software approach to improving operating system performance: a thread is scheduled much like a classical process, but it is far cheaper to create and switch between, because threads within a process share that process's address space and resources.
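For instance, in C# a second thread is created within the same process like this (a minimal sketch using the standard System.Threading API):

```csharp
using System;
using System.Threading;

class HelloThread
{
    static void Main()
    {
        // Start a second thread inside the same process; both threads
        // share the process's address space and resources.
        var worker = new Thread(() => Console.WriteLine("Hello from the worker thread"));
        worker.Start();

        Console.WriteLine("Hello from the main thread");
        worker.Join();   // wait for the worker to finish before exiting
    }
}
```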
On a single-core CPU, the advantage that you get is through asynchrony. Using threads is one way of achieving that (although not the only way).
Imagine the process for cooking a meal. Which do you think is faster:

1. Put a pot of water on the stove and stand there until it boils.
2. Cook the pasta and stand there until it's done.
3. Chop the vegetables.
4. Cook the vegetables.
5. Serve.

Or instead:

1. Put a pot of water on the stove.
2. Chop the vegetables while the water heats up.
3. Cook the pasta.
4. Cook the vegetables while the pasta cooks.
5. Serve.

From my experience, the second is quicker.
The general idea here is that in many situations when programming you will have an operation that takes some time, but it doesn't require work from the CPU to be completed. A common example is IO. When you send a request off to the database to go get some information it's common for there to be other things for you to do while you wait for that request to come back. Perhaps you can send several requests and then wait for them to finish, rather than starting one, waiting on it, then starting the next, waiting, and so on (although sometimes you have to do the latter).
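In C# this pattern is usually expressed with async/await rather than raw threads. Here is a minimal sketch; FetchAsync is a hypothetical stand-in for any I/O-bound call, such as a database query or HTTP request:

```csharp
using System;
using System.Threading.Tasks;

class IoDemo
{
    // Hypothetical I/O-bound operation: the Task.Delay stands in for
    // time spent waiting on the network or disk, not on the CPU.
    static async Task<string> FetchAsync(int id)
    {
        await Task.Delay(1000);
        return $"result {id}";
    }

    static async Task Main()
    {
        // Sequential: each request waits for the previous one (~3 s total).
        var a = await FetchAsync(1);
        var b = await FetchAsync(2);
        var c = await FetchAsync(3);
        Console.WriteLine($"{a}, {b}, {c}");

        // Concurrent: start all three, then wait for them together (~1 s total).
        string[] results = await Task.WhenAll(FetchAsync(1), FetchAsync(2), FetchAsync(3));
        Console.WriteLine(string.Join(", ", results));
    }
}
```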
Now, if the work that you need to do is CPU bound work then you'll really only get benefits out of threading if you have multiple cores on your CPU such that work can actually be done in parallel, and not just asynchronously. For example, a lot of graphics related work (multiplying matrices, to give a simple example) often involves doing a lot of basic math. If you have several cores these operations often scale very well. If you don't have multiple cores (or a GPU, which is effectively a CPU with a lot of very small and simple cores) there isn't much point in using threads.
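As an illustrative sketch of that kind of CPU-bound work, here is a matrix multiplication spread across cores with Parallel.For; with several cores, the independent rows can genuinely run in parallel:

```csharp
using System;
using System.Threading.Tasks;

class MatrixDemo
{
    // Multiply two n x n matrices. Each output row is independent of the
    // others, so the rows can safely be computed on different cores.
    static double[,] Multiply(double[,] a, double[,] b, int n)
    {
        var result = new double[n, n];
        Parallel.For(0, n, i =>
        {
            for (int j = 0; j < n; j++)
            {
                double sum = 0;
                for (int k = 0; k < n; k++)
                    sum += a[i, k] * b[k, j];
                result[i, j] = sum;
            }
        });
        return result;
    }
}
```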
Consider a scenario where only a single-core processor exists. Splitting your task into multiple threads uses the same process context (shared resources), and the threads run concurrently by sharing time on the CPU. Since the threads are just sharing time, how can their run time (turnaround time) be less than that of a single-threaded process?
You are entirely correct to be skeptical of any claimed speedup here.
First off, as Servy and others point out in their answers, if the jobs are not processor bound then clearly there can be some speedups here because while the processor is idle waiting for the disk or the network to come back, it could be doing the work of another thread.
But let's suppose you have two processor-bound tasks, a single processor, and either two threads or one thread. In the one-thread scenario it goes like this:

1. Do one second of work on job 1. Job 1 is done.
2. Do one second of work on job 2. Job 2 is done.
Total time: two seconds. Total jobs done: two. But here's the important bit: the client that was waiting for job 1 got their job done in only one second. The client that was waiting for job 2 had to wait two seconds.
Now if we have two threads and one CPU it goes like this:

1. Do 100 ms of work on job 1.
2. Do 100 ms of work on job 2.
3. Do 100 ms of work on job 1.
4. Do 100 ms of work on job 2.
5. ... and so on, alternating, until job 1 finishes at 1.9 seconds and job 2 finishes at 2.0 seconds.
Again, total time two seconds, but this time the client that was waiting for job 1 got their job done in 1.9 seconds, 90% slower than in the one-thread scenario!
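You can observe this effect with a rough sketch like the following (my own illustration, not part of the original answer). It pins the process to a single core via ProcessorAffinity, which is supported on Windows and Linux but not macOS; the iteration count is an arbitrary figure you would tune for your machine:

```csharp
using System;
using System.Diagnostics;
using System.Threading;

class TimeSliceDemo
{
    // A fixed amount of CPU-bound work: count iterations rather than
    // wall-clock time, so time-slicing genuinely stretches completion.
    static void Burn(long iterations)
    {
        double x = 0;
        for (long i = 0; i < iterations; i++)
            x += Math.Sqrt(i);
        GC.KeepAlive(x);    // keep the loop from being optimized away
    }

    static void Main()
    {
        // Pin the whole process to core 0 so the threads must share it.
        // (ProcessorAffinity works on Windows and Linux, not on macOS.)
        Process.GetCurrentProcess().ProcessorAffinity = (IntPtr)1;

        const long N = 200_000_000;   // tune so one job takes ~1 s on your machine
        var clock = Stopwatch.StartNew();

        // One thread: job 1 finishes at ~1 s, job 2 at ~2 s.
        Burn(N);
        Console.WriteLine($"sequential: job 1 done at {clock.ElapsedMilliseconds} ms");
        Burn(N);
        Console.WriteLine($"sequential: job 2 done at {clock.ElapsedMilliseconds} ms");

        clock.Restart();

        // Two threads sharing one core: both jobs finish near ~2 s,
        // so the client of job 1 waits almost twice as long.
        var t1 = new Thread(() => { Burn(N); Console.WriteLine($"threaded: job 1 done at {clock.ElapsedMilliseconds} ms"); });
        var t2 = new Thread(() => { Burn(N); Console.WriteLine($"threaded: job 2 done at {clock.ElapsedMilliseconds} ms"); });
        t1.Start(); t2.Start();
        t1.Join(); t2.Join();
    }
}
```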
So that's the moral of the story here, that you are entirely correct to point out. If the following conditions are met:

- the jobs are CPU bound,
- there are more threads than CPUs to serve them, and
- no one benefits from partially-computed results,
Then adding more threads only slows you down.
Libraries such as the Task Parallel Library are designed for this scenario; they try to figure out when adding more threads will make things worse, and try to only schedule as many threads as there are CPUs to serve them.
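As a sketch of what that looks like in practice, Parallel.For lets the runtime pick the number of worker threads, scaling with the available cores; you can also cap it explicitly via ParallelOptions:

```csharp
using System;
using System.Threading.Tasks;

class TplDemo
{
    static void Main()
    {
        Console.WriteLine($"Cores available: {Environment.ProcessorCount}");

        // By default the runtime decides how many worker threads to use,
        // scaling with the core count rather than one thread per iteration.
        Parallel.For(0, 100, i => CpuBoundWork(i));

        // Or cap the parallelism explicitly at the number of CPUs:
        var options = new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount };
        Parallel.For(0, 100, options, i => CpuBoundWork(i));
    }

    static void CpuBoundWork(int i)
    {
        double x = 0;
        for (int k = 0; k < 1_000_000; k++)
            x += Math.Sqrt(k + i);
        GC.KeepAlive(x);   // keep the loop from being optimized away
    }
}
```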
Now, if any of those conditions are not met then adding more threads is a good idea.
If the jobs are not CPU bound then adding more threads allows the CPU to do work when it would otherwise be idle, waiting for network or disk.
If there are idle CPUs then adding more threads allows those CPUs to be scheduled.
If partially-computed results are useful then adding more threads improves the situation because there are more opportunities for clients to consume partially-computed results. In our second scenario, for instance, the clients of both jobs are getting partial results every 200 milliseconds, which is fair.
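Here is a minimal sketch of that last point, assuming a hypothetical job that can report progress as it runs: with two threads, both clients see partial results throughout, rather than one client seeing nothing until the end.

```csharp
using System;
using System.Threading;

class PartialResultsDemo
{
    // Hypothetical job that surfaces partial results as it runs, so a
    // client can start consuming output before the whole job finishes.
    static void RunJob(Action<int> reportPartial)
    {
        for (int pct = 10; pct <= 100; pct += 10)
        {
            Thread.Sleep(100);      // stand-in for a slice of real work
            reportPartial(pct);     // a partial result is available now
        }
    }

    static void Main()
    {
        var t1 = new Thread(() => RunJob(pct => Console.WriteLine($"job 1: {pct}%")));
        var t2 = new Thread(() => RunJob(pct => Console.WriteLine($"job 2: {pct}%")));
        t1.Start(); t2.Start();
        t1.Join(); t2.Join();       // both clients saw progress all along
    }
}
```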