How scalable is distributed Erlang?

Tags: scalability, erlang, distributed

Part A:

Erlang has a lot of success stories about running concurrent agents, e.g. the millions of simultaneous Facebook chats. That's millions of agents, but of course it's not millions of CPUs across a network. I'm having trouble finding metrics on how well Erlang scales when the scaling is "horizontal" across a LAN/WAN.

Let's assume that I have many (tens of thousands) physical nodes (running Erlang on Linux) that need to communicate and synchronize small infrequent amounts of data across the LAN/WAN. At what point will I have communications bottlenecks, not between agents, but between physical nodes? (Or will this just work, assuming a stable network?)

Part B:

I understand (as an Erlang newbie, meaning I could be totally wrong) that Erlang nodes attempt to all connect to and be aware of each other, resulting in an N^2 point-to-point connection network. Assuming that part A won't just work with N in the tens of thousands, can Erlang be configured easily (using out-of-the-box config or trivial boilerplate, not writing a full implementation of grouping/routing algorithms myself) to cluster nodes into manageable groups and route system-wide messages through the cluster/group hierarchy?

632

asked Feb 18 '11 17:02

G__

1 Answer

To be clear, we are talking about horizontal scalability across physical machines; that is the only real problem. All the CPUs on one machine are handled by a single VM, no matter how many of them there are.

So below, node = machine.

To begin with, 30-60 nodes is what you get out of the box (vanilla OTP installation) with any custom application written on top of it in Erlang. Proof: ejabberd.

~100-150 nodes is possible with an optimized custom application. I mean it has to be good code, written with knowledge of GC, the characteristics of the data types, message passing, etc.

Over 150 is all right, but when we talk about numbers like 300 or 500, it will require optimization and customization of the TCP layer. Also, the app has to be aware of the cost of, e.g., sync calls across the cluster.

The other thing is the DB layer. Mnesia (built-in), due to its features, will not be effective over 20 nodes (my experience - I may be wrong). Solution: just use something else: Dynamo-style DBs, a separate MySQL cluster, HBase, etc.

The most common technique for balancing the cost of building a high-quality application against scalability is a federation of clusters of ~20-50 nodes each. Internally, each cluster is an efficient mesh of ~50 Erlang nodes, and it is connected via any suitable protocol to N other 50-node clusters. To sum up, such a system is a federation of N Erlang clusters.
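To put rough numbers on why a federation helps, here is a back-of-the-envelope count. It assumes a full mesh needs one connection per pair of nodes, takes N = 10,000 from the question, and uses hypothetical clusters of 50 nodes. The links between clusters are not counted, since they depend on whichever protocol you pick.

```latex
\begin{align*}
C(n) &= \frac{n(n-1)}{2} && \text{connections in a full mesh of } n \text{ nodes} \\
C(10\,000) &= \frac{10\,000 \cdot 9\,999}{2} = 49\,995\,000 && \text{one flat mesh} \\
C(50) &= \frac{50 \cdot 49}{2} = 1\,225 && \text{one 50-node cluster} \\
200 \cdot C(50) &= 245\,000 && \text{200 clusters, intra-cluster links only}
\end{align*}
```

Under these assumptions, each node holds 49 connections instead of 9,999, and the total drops by a factor of roughly 200.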

Distributed Erlang is designed to run in one data center. If you need more than that, such as geographically distant nodes, then use federations.

There are lots of config options, e.g. ones that stop all nodes from connecting to each other. That may be helpful; however, in a ~50-node cluster the Erlang overhead is not significant. You can also build a graph of Erlang nodes using 'hidden' connections, which do not join the full mesh, but then such a node does not benefit from being connected to all nodes.
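As a rough illustration of the kind of options meant here: a node can be started as a hidden node with `erl -hidden`, and the `erl` documentation also describes a `-connect_all false` flag (superseded by a kernel parameter in recent OTP releases, so check the docs for your version). For grouping, the kernel application accepts a `global_groups` parameter. The sketch below is only a config fragment under assumptions: the group and node names are hypothetical, and it has not been tuned for any real deployment.

```erlang
%% sys.config -- a minimal sketch; group_a/group_b and all node names
%% are made up for illustration.
[
 {kernel,
  [
   %% Partition the nodes into global groups. Each group gets its own
   %% global name space, so name registration traffic stays inside it.
   {global_groups,
    [
     {group_a, ['a1@host1', 'a2@host2', 'a3@host3']},
     {group_b, ['b1@host4', 'b2@host5', 'b3@host6']}
     %% A group can also be written as {Name, hidden, Nodes}, which
     %% makes its members hidden nodes towards the other groups.
    ]}
  ]}
].
```

This only partitions name registration and visibility. It does not route messages between groups for you, so cross-group traffic still has to be designed at the application level, which is where the federation approach comes in.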

The biggest problem I see in this kind of system is designing it to be masterless. If you do not need that, everything should be OK.

196

answered Sep 22 '22 06:09

user425720
