Distributed computing vs threads

How similar are distributed computing and threading? I've found two papers that reach quite opposite conclusions:

"Multi-Threading is Easier Than Networking. How threading is easy and similar to network code"

http://software.intel.com/file/14723

(this gives me the impression that they're so similar that, after encapsulation, the two approaches could be handled by the same code, but maybe I'm wrong)

"A note on distributed computing"

http://research.sun.com/techrep/1994/abstract-29.html

(and this one draws a strong distinction between them)

I'm sure the truth is somewhere in between. What's the golden mean? Are there any technologies that unify those two paradigms? Or have such attempts failed because of fundamental differences between networking and concurrency?

388

asked May 02 '09 23:05

sdcvvc

1 Answer

I've never found them to be very similar. Let me define for the purposes of this post a "node" to be one hardware thread running on one machine. So a quad core machine is four nodes, as is a cluster of four single processor boxes.

Each node will typically be running some processing, and there will need to be some type of cross-node communication. Usually the first instance of this communication is telling the node what to do. For this communication, I can use shared memory, semaphores, shared files, named pipes, sockets, remote procedure calls, distributed COM, etc. But the easiest ones to use, shared memory and semaphores, are not typically available across a network. Shared files may be available, but performance is typically poor. Sockets tend to be the most common and most flexible choice over a network, rather than the more sophisticated mechanisms. At that point you have to deal with the details of network architecture, including latency, bandwidth, packet loss, network topology, and more.
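To make the socket option concrete, here is a minimal Python sketch of "telling the node what to do" over a stream socket. Everything named here is an assumption for illustration: the JSON encoding, the 4-byte length prefix, the helper names, and the `render`/`tile` job fields are not from any particular system. It uses `socket.socketpair()` so it runs on one machine, and it does not attempt retries or reconnection, which a real network would also demand.

```python
import json
import socket
import struct


def send_job(sock, job):
    """Serialize a job and send it with a 4-byte big-endian length prefix."""
    payload = json.dumps(job).encode("utf-8")
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, size):
    """Read exactly `size` bytes; a stream socket may deliver them in pieces."""
    chunks = []
    while size:
        chunk = sock.recv(size)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        size -= len(chunk)
    return b"".join(chunks)


def recv_job(sock):
    """Read one length-prefixed job and decode it."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return json.loads(recv_exact(sock, length))


def demo():
    # socketpair() stands in for a real connection between two machines.
    sender, receiver = socket.socketpair()
    with sender, receiver:
        receiver.settimeout(1.0)  # a remote peer can stall; a local thread cannot in this way
        send_job(sender, {"task": "render", "tile": 7})  # hypothetical job format
        return recv_job(receiver)
```

Serialization, framing, partial reads, and timeouts all show up before any real work is done. With shared memory, handing over the same job is a single assignment.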

If you start with a queue of work, nodes on the same machine can use simple shared memory to get things to do. You can even implement it lock-free and it will work seamlessly. With nodes over a network, where do you put the queue? If you centralize it, that machine may suffer very high bandwidth costs. Try to distribute it and things get very complicated very quickly.
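The same-machine case can be sketched in a few lines of Python. This is an illustration only: `run_jobs` and the squaring "work" are made-up stand-ins, and the standard library's `queue.Queue` uses a lock internally, so this shows the shared-memory queue rather than a lock-free one.

```python
import queue
import threading


def run_jobs(jobs, worker_count=4):
    """Process jobs on worker threads that all pull from one in-memory queue."""
    work = queue.Queue()  # lives in the process's memory; every thread can see it
    results = []
    results_lock = threading.Lock()

    # Fill the queue before any worker starts, so "empty" means "done".
    for job in jobs:
        work.put(job)

    def worker():
        while True:
            try:
                job = work.get_nowait()  # no serialization, no network hop
            except queue.Empty:
                return  # queue drained, this worker exits
            value = job * job  # stand-in for real processing
            with results_lock:
                results.append(value)

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sorted(results)
```

There is no decision to make about where the queue lives or how much bandwidth it costs; those questions only appear once the workers sit on different machines.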

What I've found, in general, is that the people tackling this type of parallel architecture tend to choose embarrassingly parallel problems to solve. Raytracing comes to mind. There's not much cross-node communication required, apart from job distribution. There are many problems like this, to be sure, but I find it a bit disingenuous to suggest that distributed computing is essentially the same as threading.

Now if you're going to write threading code that behaves identically to a distributed system, using pure message passing and not assuming any thread to be the "main" one and such, then yes, they're going to be very similar. But what you've done is pretend you have a distributed architecture and implement it in threads. The thing is that threading is a much simpler case of parallelism than true distributed computing is. You can abstract the two into a single problem, but only by choosing the harder version and sticking strictly to it. And the results won't be as good as they could be when all of the nodes are local to a machine. You're not taking advantage of the special case.
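As a rough Python sketch of what "threads written like a distributed system" could look like: each node below owns an inbox and touches nothing but messages, and no node is special. The ring layout, the `token`/`stop` message kinds, and `run_ring` are all invented for this example. The caller that starts the threads and collects the result is a convenience of the demo, not part of the pattern.

```python
import queue
import threading


def run_ring(node_count=3, hops=10):
    """Pass a counter around a ring of threads using only message passing."""
    inboxes = [queue.Queue() for _ in range(node_count)]
    final = queue.Queue()  # used only to hand the result back to the caller

    def node(index):
        next_inbox = inboxes[(index + 1) % node_count]
        while True:
            kind, value = inboxes[index].get()  # the only way data reaches a node
            if kind == "stop":
                if value > 1:
                    next_inbox.put(("stop", value - 1))  # relay shutdown onward
                return
            if value >= hops:
                final.put((index, value))
                next_inbox.put(("stop", node_count - 1))  # tell the other nodes to exit
                return
            next_inbox.put(("token", value + 1))

    threads = [threading.Thread(target=node, args=(i,)) for i in range(node_count)]
    for thread in threads:
        thread.start()
    inboxes[0].put(("token", 0))  # inject the first message
    for thread in threads:
        thread.join()
    return final.get()
```

Swapping each `queue.Queue` for a socket would leave the node logic mostly unchanged, which is the sense in which the two models converge. The cost is that even a shared counter now takes a full trip around the ring instead of one memory write.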

124

answered Sep 21 '22 05:09

Promit

Tags: language-agnostic, networking, multithreading, theory
