Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

performance penalty of message passing as opposed to shared data

There is a lot of buzz these days about not using locks and using Message passing approaches like Erlang. Or about using immutable datastructures like in Functional programming vs. C++/Java.

But what I am concerned with is the following:

  1. AFAIK, Erlang does not guarantee Message delivery. Messages might be lost. Won't the algorithm and code bloat and be complicated again if you have to worry about loss of messages? Whatever distributed algorithm you use must not depend on guaranteed delivery of messages.
  2. What if the Message is a complicated object? Isn't there a huge performance penalty in copying and sending the messages vs. say keeping it in a shared location (like a DB that both processes can access)?
  3. Can you really totally do away with shared states? I don't think so. For e.g. in a DB, you have to access and modify the same record. You cannot use message passing there. You need to have locking or assume Optimistic concurrency control mechanisms and then do rollbacks on errors. How does Mnesia work?
  4. Also, it is not the case that you always need to worry about concurrency. Any project will also have a large piece of code that doesn't have to do anything with concurrency or transactions at all (but they do have performance and speed as a concern). A lot of these algorithms depend on shared states (that's why pass-by-reference or pointers are so useful).

Given this fact, writing programs in Erlang etc is a pain because you are prevented from doing any of these things. May be, it makes programs robust, but for things like Solving a Linear Programming problem or Computing the convex hulll etc. performance is more important and forcing immutability etc. on the algorithm when it has nothing to do with Concurrency/Transactions is a poor decision. Isn't it?

like image 823
user855 Avatar asked Nov 27 '09 19:11

user855


People also ask

What is difference between message passing and shared memory?

Message passing is a time consuming process because it is implemented through kernel (system calls). In shared memory make sure that the processes are not writing to the same location simultaneously. Message passing is useful for sharing small amounts of data so that conflicts need not occur.

What are the pros and cons of shared memory compared to message passing?

An advantage of shared memory model is that memory communication is faster as compared to the message passing model on the same machine. However, shared memory model may create problems such as synchronization and memory protection that need to be addressed.

What are the three principal issues in using message passing?

1 through 12.5. 3 we consider three principal issues in message-based computing: naming, sending, and receiving.

What are the shortcomings of the message passing paradigm?

Disadvantages of Message Passing Model :Programmer has to do everything. Connection setup takes time that's why it is slower. Data transfer usually requires cooperative operations which can be difficult to achieve.


2 Answers

  1. That's real life : you need to account for this possibility regardless of the language / platform. In a distributed world (the real world), things fail: live with it.

  2. Of course there is a cost: nothing is free in our universe. But shouldn't you use another medium (e.g. file, db) instead of shuttling "big objects" in communication pipes? You can always use "message" to refer to "big objects" stored somewhere.

  3. Of course not: the idea behind functional programming / Erlang OTP is to "isolate" as much as possible the areas were "shared state" is manipulated. Futhermore, having clearly marked places where shared state is mutated helps testability & traceability.

  4. I believe you are missing the point: there is no such thing as a silver bullet. If your application cannot be successfully built using Erlang then don't do it. You can always some other part of the overall system in another fashion i.e. use a different language / platform. Erlang is no different from another language in this respect: use the right tool for the right job.

Remember: Erlang was designed to help solve concurrent, asynchronous and distributed problems. It isn't optimized for working efficiently on a shared block of memory for example... unless you count interfacing with nif functions working on shared blocks part of the game :-)

like image 101
jldupont Avatar answered Oct 07 '22 12:10

jldupont


Real-world systems are always hybrids anyway: I don't believe the modern paradigms try, in practice, to get rid of mutable data and shared state.

The objective, however, is not to need concurrent access to this shared state. Programs can be divided into the concurrent and the sequential, and use message-passing and the new paradigms for the concurrent parts.

Not every code will get the same investment: There is concern that threads are fundamentally "considered harmful". Something like Apache may need traditional concurrent threads and a key piece of technology like that may be carefully refined over a period of years so it can blast away with fully concurrent shared state. Operating system kernels are another example where "solve the problem no matter how expensive it is" may make sense.

There is no benefit to fast-but-broken: But for new code, or code that doesn't get so much attention, it may be the case that it simply isn't thread-safe, and it will not handle true concurrency, and so the relative "efficiency" is irrelevant. One way works, and one way doesn't.

Don't forget testability: Also, what value can you place on testing? Thread-based shared-memory concurrency is simply not testable. Message-passing concurrency is. So now you have the situation where you can test one paradigm but not the other. So, what is the value in knowing that the code has been tested? The danger in not even knowing if the other code will work in every situation?

like image 28
DigitalRoss Avatar answered Oct 07 '22 12:10

DigitalRoss