
How many CPUs are needed before Erlang is faster than single-threaded Java [closed]

I am currently using Java, I've read a lot about Erlang on the net, and I have 2 big questions:

  1. How much slower (if at all) will Erlang be than plain Java?
    I'm assuming here that Java is going to be faster from the shootout benchmarks on the net (Erlang doesn't do that well). So, how many more CPUs am I going to need to make Erlang shine over single-threaded Java (in my particular situation, given below)?

  2. After reading around about Erlang for a while I've hit on a number of comments/posts that say that most large Erlang systems contain a good amount of C/C++.
    Is this for speed reasons (my assumption) or something else? i.e. Why is this required?

I have read about the number of processors in most machines going up and threading models being hard (I agree) but I am looking to find out when the "line" is going to be crossed so that I can change language/paradigm at the right time.

A bit of background/context:
I am working server-side on Java services which are very CPU-bound and easily made parallel. This is due to, typically, a single incoming update (via TCP) triggering a change to multiple (100s of) outputs.

The calculations are typically pretty simple (few loops, just lots of arithmetic) and the inputs are coming in pretty fast (100/s).

Currently we are running on 4-CPU machines with multiple services on each (so multi-threading is pretty pointless, and Java seems to run faster without the sync blocks etc. required to make it multi-threaded). There is now a strong push for speed and we have access to 24-processor machines (per process if required), so I am wondering how best to proceed: massively multi-threaded Java, or something easier to code, like Erlang.

DaveC asked Jan 03 '10 23:01




1 Answer

Since this is an arithmetic-heavy workload and you have already done the job of splitting the code into separate service processes, you wouldn't gain much from Erlang. Your job seems to fit Java comfortably. Erlang is good at tiny transactions, such as message switching or serving static or simple dynamic web pages. It is not innately good at enterprise number-crunching or database workloads.

However, you could build on external numerical libraries and databases and use Erlang as a message switch :D That's what CouchDB does :P
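To make the "fits Java comfortably" point concrete, here is a minimal sketch of the workload from the question: one incoming update fanning out to hundreds of independent calculations on a fixed thread pool. The class name `FanOut`, the method names, and the placeholder arithmetic in `recompute` are all assumptions for illustration, not the asker's actual code. Each task writes only to its own slice of the result array, so no sync blocks are needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: one update triggers many independent output calculations.
final class FanOut {
    // Placeholder arithmetic standing in for the real per-output calculation.
    static double recompute(double update, int outputIndex) {
        double acc = 0.0;
        for (int i = 1; i <= 1_000; i++) {
            acc += (update * outputIndex) / i;
        }
        return acc;
    }

    // Splits the outputs into one contiguous slice per worker (workers >= 1 assumed).
    // Every task writes only to its own slice, so there is no shared mutable state.
    static double[] computeAll(double update, int outputs, int workers) {
        double[] result = new double[outputs];
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            int chunk = (outputs + workers - 1) / workers; // ceiling division
            for (int start = 0; start < outputs; start += chunk) {
                final int from = start;
                final int to = Math.min(start + chunk, outputs);
                tasks.add(() -> {
                    for (int i = from; i < to; i++) {
                        result[i] = recompute(update, i);
                    }
                    return null;
                });
            }
            // invokeAll blocks until every slice is done; get() surfaces task failures.
            for (Future<Void> done : pool.invokeAll(tasks)) {
                done.get();
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("fan-out failed", e);
        } finally {
            pool.shutdown();
        }
        return result;
    }
}
```

Something like `FanOut.computeAll(update, 300, 6)` would spread 300 outputs over 6 threads. A real service would keep one long-lived pool rather than creating one per update; the pool is created inside the method here only to keep the sketch self-contained.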

-- edit --

  1. If you move your arithmetic operations into an Erlang async-IO driver, Erlang will be just as good as the language shoot-out stuff -- but with 24 CPUs perhaps it won't matter that much. The Erlang database is procedural and therefore quite fast -- this can be exploited in your application, which updates 100 entities on each transaction.

  2. The Erlang runtime system needs to be a mix of C and C++ because (a) the Erlang emulator is written in C/C++ (you have to start somewhere), (b) you have to talk to the kernel to do async file IO and network IO, and (c) certain parts of the system need to be blistering fast -- e.g., the backend of the database system (Mnesia).

-- discussion --

With 24 CPUs in a 6-core * 4-CPU topology using a shared memory bus, you have 4 NUMA entities (the CPUs) and one central memory. You need to be wise about the paradigm; the shared-nothing multi-process approach might kill your memory bus.

To get around this you need to create 4 processes with 6 processing threads each, and bind each processing thread to the corresponding core in the corresponding CPU. These 6 threads need to do collaborative multi-threading -- Erlang and Lua have this innately -- and Erlang does it in a hard-core way, as it has a full-blown scheduler as part of its runtime which it can use to create as many processes as you want.

Now if you were to partition your tasks across the 4 processes (1 per physical CPU) you would be a happy man; however, you are running 4 Java VMs doing (presumably) serious work (yuck, for many reasons). The problem needs to be solved with a better ability to slice and dice the problem.
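As a rough illustration of what "partitioning your tasks" could look like inside a single JVM, here is a minimal sketch in which every partition owns its own state and its own single-threaded executor, loosely imitating an Erlang process with a mailbox. The class `PartitionedWorkers` and its methods are hypothetical names made up for this sketch. Binding threads to specific cores is not shown, since standard Java has no API for CPU affinity; that would need OS tooling or a native library.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: share-nothing partitions, one thread and one state map each.
final class PartitionedWorkers {
    private final List<ExecutorService> mailboxes = new ArrayList<>();
    private final List<Map<Integer, Double>> state = new ArrayList<>();

    PartitionedWorkers(int partitions) {
        for (int p = 0; p < partitions; p++) {
            // A single-threaded executor processes its queue in order, like a mailbox.
            mailboxes.add(Executors.newSingleThreadExecutor());
            state.add(new HashMap<>());
        }
    }

    // Every entity id maps to exactly one owning partition (floorMod handles negatives).
    int partitionOf(int entityId) {
        return Math.floorMod(entityId, mailboxes.size());
    }

    // Sends an "add delta" message to the owning partition and returns the new value.
    // Only that partition's thread ever touches its map, so no locks are needed.
    Future<Double> update(int entityId, double delta) {
        int p = partitionOf(entityId);
        Map<Integer, Double> owned = state.get(p);
        Callable<Double> message = () -> owned.merge(entityId, delta, Double::sum);
        return mailboxes.get(p).submit(message);
    }

    // Lets queued messages finish, then allows the worker threads to exit.
    void shutdown() {
        for (ExecutorService mailbox : mailboxes) {
            mailbox.shutdown();
        }
    }
}
```

With 4 partitions, `update(7, 1.5)` followed by `update(7, 2.0)` both land on partition 3 and are applied in submission order. The design choice is the same one Erlang makes for you: state is never shared, only messages are.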

In comes the Erlang OTP system. It was designed for redundant, robust networked systems, but now it is moving towards same-machine NUMA-esque CPUs. It already has a kick-ass SMP emulator, and it will become NUMA-aware soon as well. With this paradigm of programming you have a much better chance of saturating your powerful servers without killing your bus.

Perhaps this discussion has been theoretical; however, when you get an 8x8 or 16x8 topology you will be ready for it as well. So my answer is: when you have more than 2 -- modern -- physical CPUs on your mainboard, you should probably consider a better programming paradigm.

As an example of a major product following the discussion here: Microsoft's SQL Server is CPU-Level NUMA-aware in the SQL-OS layer on which the database engine is built.

Hassan Syed answered Oct 21 '22 10:10
