TCP Server w/ boost::asio, scalability of thread pool vs stackless coroutines

I'm building a TCP-based daemon for pre-/post-processing of HTTP requests. Clients will connect to Apache HTTPD (or IIS), and a custom Apache/IIS module will forward requests to my TCP daemon for further processing. My daemon will need to scale up (but not out) to handle significant traffic, and most requests will be small and short-lived. The daemon will be built in C++, and must be cross-platform.

I'm currently looking at the Boost.Asio library, which seems like a natural fit. However, I'm having trouble understanding the merits of stackless coroutines vs the thread-pool pattern. Specifically, I'm looking at HTTP server example #3 and HTTP server example #4 here: http://www.boost.org/doc/libs/1_49_0/doc/html/boost_asio/examples.html

Despite all of my googling, I'm unable to fully comprehend the merits of the stackless coroutine server, or how it would perform relative to the thread-pool server on a multi-core system.

Which of the two is most appropriate given my requirements, and why? Please feel free to 'dumb down' your answers regarding the stackless coroutine idea; I'm still on shaky ground here. Thanks!

Edit: Another random thought/concern for discussion: Boost HTTP server example #4 is described as "a single-threaded HTTP server implemented using stackless coroutines". So it's entirely single-threaded (right? even after the parent process 'forks' to a child? see server.cpp in example #4). Will that single thread become a bottleneck on a multi-core system? I'm assuming that any blocking operation will prevent all other requests from executing. If that's the case, then to maximize throughput I'm thinking of a coroutine-based asynchronous receive, a thread pool for my internal blocking tasks (to leverage multiple cores), and then an asynchronous send & close of the connection, roughly as sketched below. Again, scalability is critical. Any thoughts?
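To make that idea concrete, here is roughly the flow I have in mind. This is just a sketch: process_request() is a made-up stand-in for my real blocking work, there is no error handling, and I'm assuming the acceptor creates each connection as a boost::shared_ptr so that shared_from_this() works.

```cpp
// Sketch of the hybrid idea: asynchronous I/O on one io_service, blocking
// work posted to a second io_service that a pool of worker threads runs.
// process_request() is a placeholder for my real (blocking) processing.
#include <boost/array.hpp>
#include <boost/asio.hpp>
#include <boost/bind.hpp>
#include <boost/enable_shared_from_this.hpp>
#include <string>

class connection : public boost::enable_shared_from_this<connection>
{
public:
    connection(boost::asio::io_service& io, boost::asio::io_service& workers)
        : socket_(io), workers_(workers) {}

    boost::asio::ip::tcp::socket& socket() { return socket_; }

    void start()
    {
        // 1. Receive the request asynchronously on the I/O thread.
        socket_.async_read_some(boost::asio::buffer(buffer_),
            boost::bind(&connection::handle_read, shared_from_this(),
                boost::asio::placeholders::error,
                boost::asio::placeholders::bytes_transferred));
    }

private:
    void handle_read(const boost::system::error_code& ec, std::size_t n)
    {
        if (ec) return;
        // 2. Hand the blocking work off to the worker pool so the I/O
        //    thread stays free to service other connections.
        workers_.post(boost::bind(&connection::do_work, shared_from_this(),
            std::string(buffer_.data(), n)));
    }

    void do_work(const std::string& request)
    {
        reply_ = process_request(request); // runs on a worker thread
        // 3. Send the reply asynchronously; the connection closes when the
        //    last shared_ptr to it goes away.
        boost::asio::async_write(socket_, boost::asio::buffer(reply_),
            boost::bind(&connection::handle_write, shared_from_this(),
                boost::asio::placeholders::error));
    }

    void handle_write(const boost::system::error_code&) {}

    // Placeholder for the real pre-/post-processing.
    std::string process_request(const std::string& req) { return req; }

    boost::asio::ip::tcp::socket socket_;
    boost::asio::io_service& workers_;
    boost::array<char, 4096> buffer_;
    std::string reply_;
};
```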

asked May 11 '12 by Tom


1 Answer

I have recently looked at the scalability of Boost.Asio on multi-core machines. The main conclusion so far is that it does introduce overhead, lock contention, and additional context switches (at least on Linux); see some of my blog posts on these topics:

  • http://cmeerw.org/blog/748.html#748
  • http://cmeerw.org/blog/751.html#751

I also started a thread on the asio mailing list to check that I haven't missed anything obvious; see http://comments.gmane.org/gmane.comp.lib.boost.asio.user/5133

If your main concerns are performance and scalability then I am afraid there is no clear-cut answer - you might have to do some prototyping and measure the performance yourself.

If you have any blocking operations then you would definitely want to use multiple threads; on the other hand, context switching and lock contention can decrease performance with multiple threads (so at the very least you will have to be careful).
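The simplest way to use multiple threads with asio is the pattern from HTTP server example #3: one io_service whose run() is called from a pool of threads. A minimal sketch (not the example code itself; the thread count of 4 is arbitrary - you would normally size it to your hardware and workload):

```cpp
// Sketch only: a single io_service whose event loop is run by a pool of
// threads, so completion handlers are dispatched across all of them.
#include <boost/asio.hpp>
#include <boost/bind.hpp>
#include <boost/thread.hpp>

namespace {

void run_service(boost::asio::io_service* io)
{
    io->run(); // each thread processes completion handlers from the same queue
}

} // namespace

int main()
{
    boost::asio::io_service io_service;

    // Keeps run() from returning before any asynchronous work is queued.
    boost::asio::io_service::work work(io_service);

    // ... set up an acceptor and start the async accept/read chain here ...

    // Handlers may now be invoked concurrently on any of these threads, so
    // shared state needs a strand or explicit locking - which is exactly
    // where the lock contention and context switches show up.
    boost::thread_group pool;
    for (int i = 0; i < 4; ++i)
        pool.create_thread(boost::bind(&run_service, &io_service));

    pool.join_all();
    return 0;
}
```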

Edit: just to clarify the stackless coroutine stuff: it's essentially just syntactic sugar that makes the asynchronous API look a bit more like sequential/blocking calls.
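For illustration, here is a hand-written echo fragment (not code from example #4; socket_ and buffer_ are assumed to be set up by whatever accepts the connection). In recent Boost versions the coroutine class and the reenter/yield pseudo-keywords are available via boost/asio/coroutine.hpp and boost/asio/yield.hpp; in 1.49 they only ship as part of the example sources.

```cpp
// Illustrative only: reenter/yield turn a chain of completion handlers into
// something that reads like a sequential loop, but there is still just one
// ordinary function being re-entered - no extra threads, no extra stacks.
#include <boost/array.hpp>
#include <boost/asio.hpp>
#include <boost/asio/coroutine.hpp>
#include <boost/asio/yield.hpp>
#include <boost/shared_ptr.hpp>

struct echo_session : boost::asio::coroutine
{
    boost::shared_ptr<boost::asio::ip::tcp::socket> socket_;
    boost::shared_ptr<boost::array<char, 1024> > buffer_;

    // The same operator() is the completion handler for every asynchronous
    // operation; the coroutine state remembers where to resume.
    void operator()(boost::system::error_code ec = boost::system::error_code(),
                    std::size_t n = 0)
    {
        if (ec) return;
        reenter (this)
        {
            for (;;)
            {
                // Suspend until some data arrives...
                yield socket_->async_read_some(
                    boost::asio::buffer(*buffer_), *this);
                // ...then resume here and suspend again until it is echoed back.
                yield boost::asio::async_write(
                    *socket_, boost::asio::buffer(*buffer_, n), *this);
            }
        }
    }
};
```

The important point is that nothing inside the coroutine is allowed to block: the yielded operations are asynchronous, and if a handler does block, the single thread running the io_service is stuck and no other connection makes progress - which is exactly the bottleneck raised in the question's edit.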

answered Nov 09 '22 by cmeerw