Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How/why do functional languages (specifically Erlang) scale well?

I have been watching the growing visibility of functional programming languages and features for a while. I looked into them and didn't see the reason for the appeal.

Then, recently I attended Kevin Smith's "Basics of Erlang" presentation at Codemash.

I enjoyed the presentation and learned that a lot of the attributes of functional programming make it much easier to avoid threading/concurrency issues. I understand the lack of state and mutability makes it impossible for multiple threads to alter the same data, but Kevin said (if I understood correctly) all communication takes place through messages and the mesages are processed synchronously (again avoiding concurrency issues).

But I have read that Erlang is used in highly scalable applications (the whole reason Ericsson created it in the first place). How can it be efficient handling thousands of requests per second if everything is handled as a synchronously processed message? Isn't that why we started moving towards asynchronous processing - so we can take advantage of running multiple threads of operation at the same time and achieve scalability? It seems like this architecture, while safer, is a step backwards in terms of scalability. What am I missing?

I understand the creators of Erlang intentionally avoided supporting threading to avoid concurrency problems, but I thought multi-threading was necessary to achieve scalability.

How can functional programming languages be inherently thread-safe, yet still scale?

like image 518
Jim Anderson Avatar asked Jan 23 '09 20:01

Jim Anderson


People also ask

Why is functional programming better for big data?

Big Data means that the problem sizes are so big that distributing the processing is not an optimization but a fundamental requirement. And functional programming makes it much easier to write correct and efficient distributed code due to immutable data structures and statelessness.

Is Erlang a functional programming language?

What is Erlang, and where is it used? Erlang is a functional, general-purpose language oriented towards building scalable, concurrent systems with high availability guarantees. It was built at the end of the 1980s at Ericsson for handling telephone switches.

Why is functional programming better for parallelism?

It can make it faster, it can make it slower, it can make it use more memory, but it cannot change its result. This makes it way easier to experiment with parallelism to find just the right amount of parallelism and the right size of chunks.

Why functional programming is the best?

Advantages Of Functional ProgrammingIt improves modularity. It allows us to implement lambda calculus in our program to solve complex problems. Some programming languages support nested functions which improve maintainability of the code. It reduces complex problems into simple pieces.


2 Answers

A functional language doesn't (in general) rely on mutating a variable. Because of this, we don't have to protect the "shared state" of a variable, because the value is fixed. This in turn avoids the majority of the hoop jumping that traditional languages have to go through to implement an algorithm across processors or machines.

Erlang takes it further than traditional functional languages by baking in a message passing system that allows everything to operate on an event based system where a piece of code only worries about receiving messages and sending messages, not worrying about a bigger picture.

What this means is that the programmer is (nominally) unconcerned that the message will be handled on another processor or machine: simply sending the message is good enough for it to continue. If it cares about a response, it will wait for it as another message.

The end result of this is that each snippet is independent of every other snippet. No shared code, no shared state and all interactions coming from a a message system that can be distributed among many pieces of hardware (or not).

Contrast this with a traditional system: we have to place mutexes and semaphores around "protected" variables and code execution. We have tight binding in a function call via the stack (waiting for the return to occur). All of this creates bottlenecks that are less of a problem in a shared nothing system like Erlang.

EDIT: I should also point out that Erlang is asynchronous. You send your message and maybe/someday another message arrives back. Or not.

Spencer's point about out of order execution is also important and well answered.

like image 181
Godeke Avatar answered Sep 20 '22 01:09

Godeke


The message queue system is cool because it effectively produces a "fire-and-wait-for-result" effect which is the synchronous part you're reading about. What makes this incredibly awesome is that it means lines do not need to be executed sequentially. Consider the following code:

r = methodWithALotOfDiskProcessing(); x = r + 1; y = methodWithALotOfNetworkProcessing(); w = x * y 

Consider for a moment that methodWithALotOfDiskProcessing() takes about 2 seconds to complete and that methodWithALotOfNetworkProcessing() takes about 1 second to complete. In a procedural language this code would take about 3 seconds to run because the lines would be executed sequentially. We're wasting time waiting for one method to complete that could run concurrently with the other without competing for a single resource. In a functional language lines of code don't dictate when the processor will attempt them. A functional language would try something like the following:

Execute line 1 ... wait. Execute line 2 ... wait for r value. Execute line 3 ... wait. Execute line 4 ... wait for x and y value. Line 3 returned ... y value set, message line 4. Line 1 returned ... r value set, message line 2. Line 2 returned ... x value set, message line 4. Line 4 returned ... done. 

How cool is that? By going ahead with the code and only waiting where necessary we've reduced the waiting time to two seconds automagically! :D So yes, while the code is synchronous it tends to have a different meaning than in procedural languages.

EDIT:

Once you grasp this concept in conjunction with Godeke's post it's easy to imagine how simple it becomes to take advantage of multiple processors, server farms, redundant data stores and who knows what else.

like image 21
Spencer Ruport Avatar answered Sep 20 '22 01:09

Spencer Ruport