
Getting to know the basics of Asynchronous programming on *nix

For some time now I have been googling a lot to learn about the various ways to achieve asynchronous programming/behavior on *nix machines, and (as I already suspected) I got confirmation that there is still no TRULY async pattern (concurrency using a single thread) for Linux like the one available on Windows (IOCP).

Below are the few alternatives available on Linux:

  1. select/poll/epoll :: Cannot be done using a single thread, as epoll is still a blocking call. Also, the monitored file descriptors must be opened in non-blocking mode.
  2. libaio :: From what I have learned, its implementation sucks and it is still notification-based instead of completion-based, as with Windows I/O completion ports.
  3. Boost ASIO :: It uses epoll under Linux and is thus not a true async pattern, as it spawns threads, completely abstracted from user code, to achieve the proactor design pattern.
  4. libevent :: Any reason to go for it if I prefer ASIO?

Now here come the questions :)

  1. What would be the best design pattern for writing a fast, scalable network server using epoll (of course, I will have to use threads here :( )?
  2. I had read somewhere that "only sockets can be opened in non-blocking mode", hence epoll supports only sockets and cannot be used for disk I/O. How true is that statement, and why can't async programming be done on disk I/O using epoll?
  3. Boost ASIO uses one big lock around the epoll call. I didn't really understand what its implications are and how to overcome it using ASIO itself. Similar question
  4. How can I modify the ASIO pattern to work with disk files? Is there any recommended design pattern?

I hope somebody will be able to answer all the questions with good explanations. Any links to sources where the implementation details of epoll and the AIO design patterns are explained would also be appreciated.

Arunmu asked Jan 08 '12 08:01



2 Answers

Boost ASIO :: It uses epoll under Linux and is thus not a true async pattern, as it spawns threads, completely abstracted from user code, to achieve the proactor design pattern.

This is not correct. The Asio library uses epoll() by default on most recent Linux kernel versions. However, threads invoking io_service::run() will invoke callback handlers as needed. There is only one place in the Asio library where a thread is used to emulate an asynchronous interface; it is well described in the documentation:

An additional thread per io_service is used to emulate asynchronous host resolution. This thread is created on the first call to either ip::tcp::resolver::async_resolve() or ip::udp::resolver::async_resolve().

This does not make the library "not a true async pattern" as you claim; in fact, its name would disagree with you by definition.

1) What would be the best design pattern for writing a fast, scalable network server using epoll (of course, I will have to use threads here :( )?

I suggest using Boost Asio; it uses the proactor design pattern.

3) Boost ASIO uses one big lock around the epoll call. I didn't really understand what its implications are and how to overcome it using ASIO itself.

The epoll reactor uses a mutex to dispatch handlers, though in practice this is not a big concern for most applications. There are application-specific ways to mitigate this behavior, such as an io_service per CPU to exploit data locality. See my answer to a similar question on this topic. It is also discussed frequently on the Asio mailing list.

4) How can I modify the ASIO pattern to work with disk files? Is there any recommended design pattern?

The Asio library does not natively support file I/O, as you noted. There have been several attempts to add it to the library; I'd suggest discussing it on the mailing list.
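As background for why files fit poorly into an epoll-based reactor: on Linux, epoll_ctl() refuses regular files with EPERM, while pipes and sockets register fine. The following is a minimal sketch that checks this directly with raw system calls (it does not use Asio). The helper names try_epoll_add, epoll_add_regular_file and epoll_add_pipe are made up for illustration, and the temporary file under /tmp is an assumption about the environment.

```cpp
#include <sys/epoll.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>
#include <cstdlib>

// Try to register fd for readability with a fresh epoll instance.
// Returns 0 on success, or the errno value reported by epoll_ctl().
int try_epoll_add(int fd) {
    int ep = epoll_create1(0);
    assert(ep >= 0 && "epoll_create1 failed");
    epoll_event ev{};
    ev.events = EPOLLIN;
    ev.data.fd = fd;
    int err = 0;
    if (epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev) < 0) err = errno;
    close(ep);
    return err;
}

// Hypothetical helper: result of registering a temporary regular file.
// Returns -1 if the temporary file could not be created.
int epoll_add_regular_file() {
    char path[] = "/tmp/epoll_demoXXXXXX";  // assumed writable location
    int fd = mkstemp(path);
    if (fd < 0) return -1;
    unlink(path);  // the file disappears once fd is closed
    int err = try_epoll_add(fd);
    close(fd);
    return err;
}

// Hypothetical helper: result of registering the read end of a pipe.
// Returns -1 if the pipe could not be created.
int epoll_add_pipe() {
    int p[2];
    if (pipe(p) < 0) return -1;
    int err = try_epoll_add(p[0]);
    close(p[0]);
    close(p[1]);
    return err;
}
```

Calling epoll_add_regular_file() should yield EPERM and epoll_add_pipe() should yield 0. So the restriction is not "sockets only": any descriptor that supports polling works, but regular files do not.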

Sam Miller answered Sep 27 '22 16:09


First of all:

I got confirmation that there is still no TRULY async pattern (concurrency using a single thread) for Linux like the one available on Windows (IOCP).

You probably have a small misconception: asynchronous I/O can be built on top of a "polling" API.

More than that, the "reactor" (epoll-like) API is more powerful than the "proactor" API (IOCP), as the second can be implemented in terms of the first (but not the other way around).
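To make the first direction concrete, here is a minimal single-threaded sketch of a completion-style ("proactor") read built on top of epoll readiness notifications. The names MiniProactor, async_read, run and demo_read_hello are invented for this illustration and are not Asio's API. The sketch assumes a non-blocking, pollable descriptor and at most one outstanding read per descriptor.

```cpp
#include <sys/epoll.h>
#include <fcntl.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical mini "proactor": the caller starts a read and later gets the
// data in a callback; internally we wait for readiness with epoll (reactor).
class MiniProactor {
public:
    using ReadHandler = std::function<void(int err, std::string data)>;

    MiniProactor() : ep_(epoll_create1(0)) { assert(ep_ >= 0); }
    ~MiniProactor() { if (ep_ >= 0) close(ep_); }
    MiniProactor(const MiniProactor&) = delete;
    MiniProactor& operator=(const MiniProactor&) = delete;

    // Start an asynchronous read of up to max_len bytes from fd.
    bool async_read(int fd, std::size_t max_len, ReadHandler handler) {
        if (pending_.count(fd)) return false;  // one read per fd in this sketch
        epoll_event ev{};
        ev.events = EPOLLIN;
        ev.data.fd = fd;
        if (epoll_ctl(ep_, EPOLL_CTL_ADD, fd, &ev) < 0) return false;
        pending_[fd] = Op{max_len, std::move(handler)};
        return true;
    }

    // Run until no operations are pending; returns the number of completions.
    int run() {
        int completed = 0;
        while (!pending_.empty()) {
            epoll_event events[16];
            int n = epoll_wait(ep_, events, 16, -1);
            if (n < 0) { if (errno == EINTR) continue; break; }
            for (int i = 0; i < n; ++i) {
                int fd = events[i].data.fd;
                auto it = pending_.find(fd);
                if (it == pending_.end()) continue;
                // Readiness reported: perform the read on the caller's behalf.
                std::string buf(it->second.max_len, '\0');
                ssize_t got = read(fd, &buf[0], buf.size());
                if (got < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) continue;
                int err = got < 0 ? errno : 0;
                buf.resize(got < 0 ? 0 : static_cast<std::size_t>(got));
                Op op = std::move(it->second);
                pending_.erase(it);
                epoll_ctl(ep_, EPOLL_CTL_DEL, fd, nullptr);
                op.handler(err, std::move(buf));  // completion notification
                ++completed;
            }
        }
        return completed;
    }

private:
    struct Op { std::size_t max_len; ReadHandler handler; };
    int ep_;
    std::map<int, Op> pending_;
};

// Hypothetical demo: start a read on a pipe, then make data arrive.
std::string demo_read_hello() {
    int p[2];
    if (pipe(p) < 0) return "";
    fcntl(p[0], F_SETFL, fcntl(p[0], F_GETFL) | O_NONBLOCK);
    MiniProactor io;
    std::string result;
    io.async_read(p[0], 64, [&](int err, std::string data) {
        if (err == 0) result = std::move(data);
    });
    ssize_t w = write(p[1], "hello", 5);  // data shows up after the request
    (void)w;
    io.run();
    close(p[0]);
    close(p[1]);
    return result;
}
```

The handler sees only a finished operation and its data, which is the proactor contract. The readiness wait and the actual read() are hidden inside run(), and no extra thread is involved.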

Also, some operations are "truly" asynchronous, for example disk I/O, and some other tools, such as signals combined with the Linux-specific signalfd, can provide full coverage of some other cases.
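As one illustration of the signalfd route, the sketch below blocks a signal, routes it to a signalfd descriptor, and waits for it through epoll like any other event source. The function name wait_for_signal_via_epoll and the one-second timeout are choices made for this example, and raise() stands in for an external event.

```cpp
#include <sys/epoll.h>
#include <sys/signalfd.h>
#include <signal.h>
#include <unistd.h>
#include <cassert>

// Block signo, route it to a signalfd, raise it, and wait for it via epoll.
// Returns the signal number read from the descriptor, or -1 on error.
// Note: signo stays blocked in the calling thread afterwards.
int wait_for_signal_via_epoll(int signo) {
    sigset_t mask;
    sigemptyset(&mask);
    sigaddset(&mask, signo);
    // The signal must be blocked so it is queued for the signalfd
    // instead of being delivered to a handler or killing the process.
    int rc = sigprocmask(SIG_BLOCK, &mask, nullptr);
    assert(rc == 0);
    if (rc != 0) return -1;

    int sfd = signalfd(-1, &mask, SFD_NONBLOCK | SFD_CLOEXEC);
    if (sfd < 0) return -1;
    int ep = epoll_create1(EPOLL_CLOEXEC);
    if (ep < 0) { close(sfd); return -1; }

    epoll_event ev{};
    ev.events = EPOLLIN;
    ev.data.fd = sfd;
    int result = -1;
    if (epoll_ctl(ep, EPOLL_CTL_ADD, sfd, &ev) == 0) {
        raise(signo);  // stand-in for an event arriving from outside
        epoll_event out{};
        if (epoll_wait(ep, &out, 1, 1000) == 1 && out.data.fd == sfd) {
            signalfd_siginfo info{};
            if (read(sfd, &info, sizeof info) ==
                static_cast<ssize_t>(sizeof info)) {
                result = static_cast<int>(info.ssi_signo);
            }
        }
    }
    close(ep);
    close(sfd);
    return result;
}
```

For example, wait_for_signal_via_epoll(SIGUSR1) should return SIGUSR1. In a real event loop the signalfd descriptor would simply be one more entry in the same epoll set as the sockets.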

Bottom line: epoll is truly asynchronous I/O.

Artyom answered Sep 27 '22 17:09