Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Thread-ring benchmark

Today I was doing the thread ring exercise from the Programming Erlang book and googled for other solutions to compare. I found that the language shootout has exactly the same problem as a benchmark. I had the impression that this is an area where Erlang should be fast, but turns out that C and C++ are again on top. My suspicion is that the C/C++ programs are not following the rules which say "pass the token from thread to thread". It seems, after reading them, that they both manipulate some shared memory and global variables which is different from the Erlang code but I could be wrong.

My question is: are they doing the same thing or the C/C++ code is conceptually different (and faster) from the Erlang one?

And another question: why is Haskell faster than Erlang when the solutions are very similar?

like image 965
Daniel Avatar asked Aug 20 '10 14:08

Daniel


4 Answers

The C version is using LWP, which I think is a user-space threading library. To what extent this is "unfair" is up for debate: I'd look at things like whether it supports true pre-emptive concurrency in the sense that you can make blocking system calls in a thread without blocking all the other threads (you can do this in Haskell, can you in Erlang?).

Haskell's threads are slightly more lightweight than Erlang's, because as I understand it an Erlang thread comes with a local heap (in the standard implementation) whereas Haskell threads all share the same heap.

like image 189
Simon Marlow Avatar answered Oct 22 '22 02:10

Simon Marlow


Ultimately, message-passing on modern machines is implemented using some form of shared memory to pass the messages (along with either locks or atomic instructions). So all the C and C++ implementations are really doing is inlining the implementation of message-passing straight into their code. A similar benchmark that uses a fast message-passing library in C, also benchmarked against Haskell and Erlang, can be found in this paper: http://www.cs.kent.ac.uk/pubs/2009/2928/index.html (section 5.1)

The speed of the various approaches is really determined by the concurrent run-time systems involved. Haskell has had a lot of good work done in this area, which leaves it ahead of Erlang. Of course, measuring speed on micro-benchmarks is often mis-leading, and leaves out important factors like the readability of the code. A question to bear in mind might be: which of the solutions in the shoot-out would you be happy to maintain?

like image 24
Neil Brown Avatar answered Oct 22 '22 02:10

Neil Brown


I don't think I'd call it cheating. The primary, fundamental difference between multiple threads and multiple processes is that multiple threads share a single address space. As such, specifying multiple threads (rather than multiple processes) seems to me like tacit permission to take advantage of the shared address space (at least in the absence of some very specific definition of "passing" that prohibited this).

What it comes down to is this: Erlang doesn't really have threads, as such -- it has processes with asynchronous communications. The processes are (intentionally) isolated from each other to a large degree. On one hand, this makes development considerably easier -- in particular, one process can only affect another via specific, well-defined channels of communication. Under the hood, it uses lots of tricks (almost certainly including shared memory) to optimize its processes and take advantage of what's possible in a specific implementation/situation (such as all the processes running in a single, shared address space). Nonetheless, having to keep all the tricks hidden keeps it from being quite as efficient as something like the C version where the "tricks" are all explicit and completely exposed.

I'd use a real-life analogy to explain the difference. Think of the threads/processes as people. Erlang enforces a professional working relationship where communications are all carefully recorded in memos. The C and C++ versions are more like a husband and wife who might communicate with a single word that doesn't mean anything to anybody else, or even just a single quick glance.

The latter is extremely efficient when it works -- but it's a lot more open to subtle misunderstandings, and if the two have a fight you probably don't want to be in the same room. For the manager, people in purely professional relationships are a lot easier to manage even if their peak efficiency isn't quite a high.

like image 5
Jerry Coffin Avatar answered Oct 22 '22 04:10

Jerry Coffin


why is Haskell faster than Erlang when the solution are very similar?

Haskell GHC is a compiled, native-code optimized language implementation with very fast threads. It is generally much faster than Erlang/HIPE. Erlang doesn't have a monopoly on lightweight threads :-)

like image 4
Don Stewart Avatar answered Oct 22 '22 04:10

Don Stewart