How similar are distributed computing and threading? I've found two papers that come to quite opposite conclusions:
"Multi-Threading is Easier Than Networking. How threading is easy and similar to network code"
http://software.intel.com/file/14723
(this gives me an impression that they're so similar that after encapsulation these two approaches could be done with the same code - but maybe I'm wrong)
"A note on distributed computing"
http://research.sun.com/techrep/1994/abstract-29.html
(and this one draws a strong distinction between the two)
I'm sure the truth is somewhere in between. What's the golden mean? Are there any technologies that unify those two paradigms? Or have such attempts failed because of fundamental differences between networking and concurrency?
A thread is a single sequential stream of execution within a process. Because threads have some of the properties of processes, they are sometimes called lightweight processes. A process can contain several threads, which lets it run multiple streams of execution.
A process is a program under execution, i.e. an active program. A thread is a lightweight unit of execution that a scheduler can manage independently. Context switches between processes take more time because processes carry more state; switches between threads take less time because threads are lighter.
Threads communicate and synchronize with each other using fast shared-memory mechanisms. Threads are also useful in distributed applications: many systems use a client/server model of communication, in which a server process listens for client requests, executes them, and returns the results.
Multithreading is the ability of a program or an operating system to serve more than one user at a time without running multiple copies of the program on the computer. Multithreading can also handle multiple requests from the same user.
I've never found them to be very similar. For the purposes of this post, let me define a "node" as one hardware thread running on one machine. So a quad-core machine is four nodes, as is a cluster of four single-processor boxes.
Each node will typically be running some processing, and there will need to be some type of cross-node communication. Usually the first instance of this communication is telling the node what to do. For this communication, I can use shared memory, semaphores, shared files, named pipes, sockets, remote procedure calls, distributed COM, etc. But the easiest ones to use, shared memory and semaphores, are not typically available across a network. Shared files may be available, but performance is typically poor. Sockets tend to be the most common and most flexible choice over a network, rather than the more sophisticated mechanisms. At that point you have to deal with the details of network architecture, including latency, bandwidth, packet loss, network topology, and more.
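To make the socket case concrete, here is a minimal sketch of handing one job to a worker over TCP. It runs on the loopback interface so it stays self-contained; the names `serve_one_job`, `send_job`, and `demo`, and the newline-delimited JSON framing, are my own assumptions for illustration and not something from either paper. Even this toy has to choose a wire format, a message boundary, and a timeout, none of which come up with shared memory.

```python
import json
import socket
import threading


def serve_one_job(server_sock):
    """Worker side: accept one connection, read one JSON job, send one reply."""
    conn, _ = server_sock.accept()
    with conn, conn.makefile("rwb") as stream:
        job = json.loads(stream.readline())  # one job per newline-terminated line
        reply = {"id": job["id"], "result": job["n"] * job["n"]}  # stand-in for real work
        stream.write(json.dumps(reply).encode() + b"\n")
        stream.flush()


def send_job(address, job, timeout=2.0):
    """Coordinator side: connect, send one job, wait (with a timeout) for the reply."""
    with socket.create_connection(address, timeout=timeout) as sock:
        with sock.makefile("rwb") as stream:
            stream.write(json.dumps(job).encode() + b"\n")
            stream.flush()
            return json.loads(stream.readline())


def demo():
    """Run worker and coordinator in one process over 127.0.0.1."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    address = server.getsockname()
    worker = threading.Thread(target=serve_one_job, args=(server,), daemon=True)
    worker.start()
    try:
        return send_job(address, {"id": 1, "n": 7})
    finally:
        worker.join(timeout=2.0)
        server.close()


if __name__ == "__main__":
    print(demo())
```

Over a real network the sketch would also need retries, handling for partial or lost replies, and some way to find the worker's address, which is where the latency and topology concerns above start to bite.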
If you start with a queue of work, nodes on the same machine can use simple shared memory to get things to do. You can even make the queue lock-free and it will work seamlessly. With nodes over a network, where do you put the queue? If you centralize it, the machine that hosts it may suffer very high bandwidth costs. Try to distribute it and things get very complicated very quickly.
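The same-machine case can be sketched in a few lines. This is an illustration under assumptions, not a lock-free implementation: Python's `queue.Queue` uses a lock internally, and `run_workers` and the squaring "job" are hypothetical placeholders. The point is only that every worker can reach the queue directly because they share an address space.

```python
import queue
import threading


def run_workers(jobs, num_workers=4):
    """Process jobs with threads that pull from one shared in-memory queue."""
    work = queue.Queue()
    for job in jobs:
        work.put(job)

    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            try:
                job = work.get_nowait()
            except queue.Empty:
                return  # queue drained, this worker is done
            value = job * job  # stand-in for real work
            with results_lock:  # results list is shared, so guard the append
                results.append(value)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(results)


if __name__ == "__main__":
    print(run_workers(range(10)))
```

There is no placement decision to make here: the queue lives in the process, and no bytes cross a wire when a worker takes a job.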
What I've found, in general, is that the people tackling this type of parallel architecture tend to choose embarrassingly parallel problems to solve. Raytracing comes to mind. There's not much cross-node communication required, apart from job distribution. There are many problems like this, to be sure, but I find it a bit disingenuous to suggest that distributed computing is essentially the same as threading.
Now if you're going to write threaded code that behaves identically to a distributed system, using pure message passing and not assuming any thread to be the "main" one and such, then yes, they're going to be very similar. But what you've done is pretend you have a distributed architecture and implement it in threads. The thing is that threading is a much simpler case of parallelism than true distributed computing is. You can abstract the two into a single problem, but only by choosing the harder version and sticking strictly to it. And the results won't be as good as they could be when all of the nodes are local to a machine. You're not taking advantage of the special case.
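Here is a rough sketch of what "pretending you have a distributed architecture" looks like with threads. The `Node` class, the message tuples, and `demo` are hypothetical names I made up for illustration. Each node owns its state and can only be reached through its inbox, so the inbox could in principle be swapped for a socket; the calling code acts as just another client and not as a privileged coordinator.

```python
import queue
import threading


class Node(threading.Thread):
    """A thread that shares nothing and is reachable only through its inbox."""

    def __init__(self, name):
        super().__init__(name=name)
        self.inbox = queue.Queue()

    def send(self, message):
        """Deliver a (kind, payload, reply_to) message to this node."""
        self.inbox.put(message)

    def run(self):
        while True:
            kind, payload, reply_to = self.inbox.get()
            if kind == "stop":
                return
            if kind == "square":
                # Reply by message too; the sender never reads our state directly.
                reply_to.put((self.name, payload * payload))


def demo():
    nodes = [Node(f"node-{i}") for i in range(3)]
    for node in nodes:
        node.start()

    replies = queue.Queue()
    for i, node in enumerate(nodes):
        node.send(("square", i + 2, replies))

    # Replies may arrive in any order, so collect them by sender name.
    results = dict(replies.get(timeout=2) for _ in nodes)

    for node in nodes:
        node.send(("stop", None, None))
    for node in nodes:
        node.join()
    return results


if __name__ == "__main__":
    print(demo())
```

Compared with pulling jobs straight off a shared queue, every interaction here pays for an extra hop through a mailbox even though all the nodes sit in one process. That overhead is the price of sticking to the harder model.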