When is it appropriate to multi-thread?

I think I "get" the basics of multi-threading with Java. If I'm not mistaken, you take some big job and figure out how you are going to chunk it up into multiple (concurrent) tasks. Then you implement those tasks as either Runnables or Callables and submit them all to an ExecutorService. (So, to begin with, if I am mistaken on this much, please start by correcting me!!!)
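To make that description concrete, here is a minimal sketch of the chunk-and-submit pattern, assuming Java 8+ for the lambda syntax. The class name `ChunkedSum` and the helpers `sumRange` and `parallelSum` are hypothetical names chosen for illustration; only `ExecutorService`, `Executors`, `Callable` and `Future` come from `java.util.concurrent`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ChunkedSum {

    // One task: sums data[from, to). Hypothetical helper, not a library API.
    static Callable<Long> sumRange(int[] data, int from, int to) {
        return () -> {
            long total = 0;
            for (int i = from; i < to; i++) {
                total += data[i];
            }
            return total;
        };
    }

    // Splits the array into at most `chunks` ranges and sums them on a thread pool.
    static long parallelSum(int[] data, int chunks)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(chunks);
        try {
            int chunkSize = (data.length + chunks - 1) / chunks; // ceiling division
            List<Future<Long>> futures = new ArrayList<>();
            for (int from = 0; from < data.length; from += chunkSize) {
                int to = Math.min(from + chunkSize, data.length);
                futures.add(pool.submit(sumRange(data, from, to)));
            }
            long total = 0;
            for (Future<Long> future : futures) {
                total += future.get(); // blocks until that chunk has finished
            }
            return total;
        } finally {
            pool.shutdown(); // stop accepting tasks so the pool threads can exit
        }
    }

    public static void main(String[] args) throws Exception {
        int[] data = new int[100];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1; // 1..100
        }
        System.out.println(parallelSum(data, 4)); // prints 5050
    }
}
```

Each chunk only reads its own slice of the array, so the tasks need no locking. For a job this small the thread overhead would outweigh any gain; the sketch only shows the shape of the pattern.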

Second, I have to imagine that the code you implement inside run() or call() has to be as "parallelized" as possible, using non-blocking algorithms, etc. And that this is where the hard part is (writing parallel code). Correct? Not correct?

But the real problem I'm still having with Java concurrency (and I guess concurrency in general), and which is the true subject of this question, is:

When is it even appropriate to multi-thread in the first place?

I saw an example from another question on Stack Overflow where the poster proposed creating multiple threads for reading and processing a huge text file (the book Moby Dick), and one answerer commented that multi-threading for the purpose of reading from disk was a terrible idea. Their reasoning was that multiple threads would add context-switching overhead on top of an already slow process (disk access).

So that got me thinking: what classes of problems are appropriate for multi-threading, and what classes of problems should always be serialized? Thanks in advance!

IAmYourFaja asked Jun 29 '12 17:06


1 Answer

Multi-threading has two main advantages, IMO:

  • being able to distribute intensive work across several CPUs/cores: instead of letting 3 of 4 CPUs sit idle while a single CPU does everything, you split the problem into 4 parts and let each CPU work on its own part. This reduces the time it takes to execute a CPU-intensive task, and justifies the money you spent on multi-CPU hardware.
  • reducing the latency of many tasks. Suppose 4 users make a request to a web server, and the requests are all handled by a single thread. Suppose the first request makes a very long database query. The thread is idle, waiting for the query to complete, and the 3 other users wait until this request is finished to get their tiny web page. If you have 4 threads, even with a single CPU, the second, third and fourth requests can be handled while the long database query is executed by the database server, and all the users are happy. So multi-threading is especially important when you have blocking IO calls, since those calls leave the CPU idle when it could be executing other waiting tasks.
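The web-server scenario can be simulated in a few lines. This is only a sketch under stated assumptions: `Thread.sleep` stands in for the blocking database call, the 300 ms and 10 ms durations are made up, and `BlockingLatencyDemo`, `handleRequest` and `worstQuickLatencyMillis` are hypothetical names, not part of any server API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BlockingLatencyDemo {

    // Stand-in for a request that blocks on IO: the thread waits without using the CPU.
    static void handleRequest(long blockingMillis) {
        try {
            Thread.sleep(blockingMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Submits one slow request followed by three quick ones, and returns how long
    // (in ms, measured from the start) the slowest quick request took to complete.
    static long worstQuickLatencyMillis(int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            long start = System.nanoTime();
            Future<?> slow = pool.submit(() -> handleRequest(300)); // "long database query"
            List<Future<Long>> quick = new ArrayList<>();
            for (int i = 0; i < 3; i++) {
                quick.add(pool.submit(() -> {
                    handleRequest(10); // "tiny web page"
                    return (System.nanoTime() - start) / 1_000_000;
                }));
            }
            long worst = 0;
            for (Future<Long> latency : quick) {
                worst = Math.max(worst, latency.get());
            }
            slow.get(); // wait for the slow request too before shutting down
            return worst;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // With one thread the quick requests queue behind the slow one;
        // with four threads they finish while the slow one is still blocked.
        System.out.println("1 thread:  " + worstQuickLatencyMillis(1) + " ms");
        System.out.println("4 threads: " + worstQuickLatencyMillis(4) + " ms");
    }
}
```

Exact timings depend on the machine, but with one thread the quick requests cannot finish before the 300 ms request does, while with four threads they should complete in a small fraction of that time, even on a single core.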

Note: the problem with reading from the same disk from multiple threads is that, instead of reading the whole long file sequentially, the disk is forced to jump between different physical locations at each context switch. Since all the threads are waiting for the disk reads to finish (they're IO-bound), this makes the reading slower than if a single thread read everything. But once the data is in memory, it makes sense to split the processing work between threads.
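A minimal sketch of that "read sequentially, then split the work" approach, assuming the file fits in memory and using word counting as a stand-in for the real processing. The names `ReadThenSplit`, `countWords` and `countWordsInFile` are hypothetical; `Files.readAllLines` and the executor classes are standard JDK APIs.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ReadThenSplit {

    // CPU-side work on data that is already in memory.
    static long countWords(List<String> lines) {
        long count = 0;
        for (String line : lines) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                count += trimmed.split("\\s+").length;
            }
        }
        return count;
    }

    static long countWordsInFile(Path file, int workers) throws Exception {
        // Step 1: a single thread reads the whole file sequentially.
        List<String> lines = Files.readAllLines(file, StandardCharsets.UTF_8);

        // Step 2: the in-memory lines are split into chunks and processed in parallel.
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            int chunkSize = Math.max(1, (lines.size() + workers - 1) / workers);
            List<Future<Long>> parts = new ArrayList<>();
            for (int from = 0; from < lines.size(); from += chunkSize) {
                int to = Math.min(from + chunkSize, lines.size());
                List<String> chunk = lines.subList(from, to); // read-only view
                Callable<Long> task = () -> countWords(chunk);
                parts.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("sample", ".txt");
        try {
            Files.write(file, Arrays.asList(
                    "Call me Ishmael.",
                    "",
                    "Some years ago",
                    "never mind how long precisely"), StandardCharsets.UTF_8);
            System.out.println(countWordsInFile(file, 2)); // prints 11
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Only one thread ever touches the disk, so the read stays sequential; the threads only share the work that is CPU-bound.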

JB Nizet answered Oct 21 '22 15:10