Could someone explain what the difference is between epoll, poll and threadpool?
epoll and poll are Linux-specific... Is there an equivalent alternative for Windows?
epoll is a Linux-specific enhancement of poll(). It hides the details of representation in an internal data structure that is created with epoll_create() and manipulated with epoll_ctl(). It's one of the newer variants developed for high-performance servers with many active connections...
Performance of select and poll vs. epoll: using epoll really is a lot faster once you have more than 10 or so file descriptors to monitor.
The main difference between epoll and select is that in select() the list of file descriptors to wait on only exists for the duration of a single select() call, and the calling task only stays on the sockets' wait queues for the duration of a single call.
epoll is a Linux kernel system call for a scalable I/O event notification mechanism, first introduced in version 2.5.44 of the Linux kernel. Its function is to monitor multiple file descriptors to see whether I/O is possible on any of them.
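To make the per-call descriptor list described above for select() concrete, here is a minimal sketch using poll(), which behaves the same way in this respect: the array of descriptors is passed in again on every call. The helper names wait_readable and poll_pipe_demo are made up for illustration, and a pipe stands in for a socket.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Waits up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error or hang-up without data. */
int wait_readable(int fd, int timeout_ms)
{
    assert(fd >= 0);                 /* caller bug, not a runtime condition */

    struct pollfd pfd;
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* The descriptor array is handed to the kernel again on every call. */
    int n = poll(&pfd, 1, timeout_ms);
    if (n <= 0)
        return n;                    /* 0 = timeout, -1 = error */
    return (pfd.revents & POLLIN) ? 1 : -1;
}

/* Hypothetical demo: a pipe is idle until one byte is written to it.
 * Returns 1 if poll() reported "not ready" first and "ready" afterwards. */
int poll_pipe_demo(void)
{
    int p[2];
    if (pipe(p) != 0)
        return -1;

    int before = wait_readable(p[0], 0);      /* nothing written yet */
    ssize_t w = write(p[1], "x", 1);
    int after = wait_readable(p[0], 1000);    /* one byte is pending */

    close(p[0]);
    close(p[1]);
    return (before == 0 && w == 1 && after == 1) ? 1 : 0;
}
```

With a single descriptor the repeated hand-over costs nothing noticeable; it only starts to matter once the array grows.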
Threadpool does not really fit into the same category as poll and epoll, so I will assume you are referring to threadpool as in "threadpool to handle many connections with one thread per connection".
Threads can use epoll, though the obvious way (all threads block on epoll_wait) is of no use, because epoll will wake up every thread waiting on it, so it will still have the same issues.
futex is your friend here, in combination with e.g. a fast forward queue per thread. Although badly documented and unwieldy, futex offers exactly what's needed. epoll may return several events at a time, and futex lets you efficiently and in a precisely controlled manner wake N blocked threads at a time (N being min(num_cpu, num_events) ideally), and in the best case it does not involve an extra syscall/context switch at all.
fork (a.k.a. old-fashioned threadpool)
fork is not "free", although the overhead is mostly coalesced by the copy-on-write mechanism. On large datasets which are also modified, a considerable number of page faults following fork may negatively impact performance.
poll / select
epoll
[...] (epoll_ctl)
[...] (epoll_wait)
[...] poll works
Works well with timerfd and eventfd (stunning timer resolution and accuracy, too).
Works well with signalfd, eliminating the awkward handling of signals, making them part of the normal control flow in a very elegant manner.
[...] eventfd, but requires a (to date) undocumented function.
[...] poll may perform equally or better.
epoll cannot do "magic", i.e. it is still necessarily O(N) with respect to the number of events that occur.
epoll plays well with the new recvmmsg syscall, since it returns several readiness notifications at a time (as many as are available, up to whatever you specify as maxevents). This makes it possible to receive e.g. 15 EPOLLIN notifications with one syscall on a busy server, and read the corresponding 15 messages with a second syscall (a 93% reduction in syscalls!). Unfortunately, all operations in one recvmmsg invocation refer to the same socket, so it is mostly useful for UDP-based services (for TCP, there would have to be a kind of recvmmsmsg syscall which also takes a socket descriptor per item!).
Check for EAGAIN even when using epoll, because there are exceptional situations where epoll reports readiness and a subsequent read (or write) will still block. This is also the case for poll/select on some kernels (though it has presumably been fixed).
If you keep reading until EAGAIN is returned upon receiving a notification, it is possible to indefinitely read new incoming data from a fast sender while completely starving a slow sender (as long as data keeps coming in fast enough, you might not see EAGAIN for quite a while!). This applies to poll/select in the same manner.
[...] epoll_wait (or since the descriptor was opened, if there was no previous call).
"[...] epoll_wait, signalling that IO activity has happened since anyone last called either epoll_wait or a read/write function on the descriptor, and thereafter only reports readiness again to the next thread calling or already blocked in epoll_wait, for any operations happening after anyone called a read (or write) function on the descriptor". It kind of makes sense, too... it just isn't exactly what the documentation suggests.
kqueue
Analogous to epoll: different usage, similar effect.
libevent -- The 2.0 version also supports completion ports under Windows.
ASIO -- If you use Boost in your project, look no further: You already have this available as boost-asio.
The frameworks listed above come with extensive documentation. The Linux docs and MSDN explain epoll and completion ports extensively.
Mini-tutorial for using epoll:
    #include <sys/epoll.h>

    int my_epoll = epoll_create(1);    // argument is ignored nowadays, but must be greater than zero

    struct epoll_event e = {0};
    e.events = EPOLLIN;                // which events to report
    e.data.fd = some_socket_fd;        // user data: this can in fact be anything you like
    epoll_ctl(my_epoll, EPOLL_CTL_ADD, some_socket_fd, &e);

    ...

    struct epoll_event evt[10];        // or whatever number
    int num;
    for(...)
        if((num = epoll_wait(my_epoll, evt, 10, -1)) > 0)
            do_something();
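For anyone who wants something that compiles as-is, here is a small runnable variant of the mini-tutorial above. It is only a sketch: it watches an eventfd instead of a socket so that it needs no network setup, it uses epoll_create1(0) rather than epoll_create, and the function names (watch_for_input, next_ready_fd, epoll_eventfd_demo) are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Registers fd for read readiness on epfd. Returns 0 on success, -1 on error. */
int watch_for_input(int epfd, int fd)
{
    assert(epfd >= 0 && fd >= 0);      /* caller bug, not a runtime condition */

    struct epoll_event ev = {0};
    ev.events = EPOLLIN;               /* level-triggered read readiness */
    ev.data.fd = fd;                   /* user data, handed back by epoll_wait */
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Waits up to timeout_ms. Returns the first ready descriptor,
 * or -1 on timeout or error. */
int next_ready_fd(int epfd, int timeout_ms)
{
    struct epoll_event events[10];
    int n = epoll_wait(epfd, events, 10, timeout_ms);
    return (n > 0) ? events[0].data.fd : -1;
}

/* Hypothetical demo: an eventfd stays quiet until its counter is bumped.
 * Returns 1 if epoll reported it only after the write, 0 otherwise. */
int epoll_eventfd_demo(void)
{
    int epfd = epoll_create1(0);
    int efd = eventfd(0, EFD_NONBLOCK);
    uint64_t one = 1;
    int ok = 0;

    if (epfd >= 0 && efd >= 0 && watch_for_input(epfd, efd) == 0) {
        int before = next_ready_fd(epfd, 0);        /* idle: nothing to report */
        ssize_t w = write(efd, &one, sizeof one);   /* bump the counter */
        int after = next_ready_fd(epfd, 1000);      /* now readable */
        ok = (before == -1 && w == (ssize_t)sizeof one && after == efd);
    }

    if (efd >= 0)
        close(efd);
    if (epfd >= 0)
        close(epfd);
    return ok;
}
```

A real server would loop over all n returned events instead of looking only at events[0]; the demo keeps to one descriptor to stay short.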
Mini-tutorial for IO completion ports (note calling CreateIoCompletionPort twice with different parameters):
    #include <winsock2.h>
    #include <windows.h>

    HANDLE iocp = CreateIoCompletionPort(INVALID_HANDLE_VALUE, 0, 0, 0);  // equals epoll_create
    CreateIoCompletionPort((HANDLE)mySocketHandle, iocp, 0, 0);           // equals epoll_ctl(EPOLL_CTL_ADD)

    DWORD number_bytes;
    ULONG_PTR key;
    OVERLAPPED *o;   // receives a pointer to the OVERLAPPED of the completed operation
    for(...)
        if(GetQueuedCompletionStatus(iocp, &number_bytes, &key, &o, INFINITE))  // equals epoll_wait()
            do_something();
(These mini-tuts omit all kinds of error checking, and hopefully I didn't make any typos, but they should for the most part be ok to give you some idea.)
EDIT:
Note that completion ports (Windows) conceptually work the other way around from epoll (or kqueue). They signal, as their name suggests, completion, not readiness. That is, you fire off an asynchronous request and forget about it until some time later you're told that it has completed (either successfully or not so successfully, and there is the exceptional case of "completed immediately" too).
With epoll, you block until you are notified that either "some data" (possibly as little as one byte) has arrived and is available, or there is sufficient buffer space so you can do a write operation without blocking. Only then do you start the actual operation, which will then hopefully not block (contrary to what you might expect, there is no strict guarantee of that -- it is therefore a good idea to set descriptors to nonblocking and check for EAGAIN [EAGAIN and EWOULDBLOCK for sockets, because oh joy, the standard allows for two different error values]).
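The "set descriptors to nonblocking and check for EAGAIN" advice could look roughly like the sketch below. It is an illustration under assumptions, not a definitive implementation: set_nonblocking, read_available and drain_demo are made-up names, and a pipe stands in for a socket. Reading stops when the caller's buffer is full, which also caps how much one fast sender can consume per notification; in level-triggered mode the descriptor is simply reported again if data is left over.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Switches fd to nonblocking mode. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Reads from a nonblocking fd until the buffer is full, the descriptor
 * runs dry (EAGAIN/EWOULDBLOCK) or end-of-file is reached.
 * Returns the number of bytes read (possibly 0), or -1 on a real error. */
ssize_t read_available(int fd, char *buf, size_t cap)
{
    assert(buf != NULL);               /* caller bug, not a runtime condition */

    size_t total = 0;
    while (total < cap) {
        ssize_t n = read(fd, buf + total, cap - total);
        if (n > 0) {
            total += (size_t)n;
            continue;
        }
        if (n == 0)
            break;                     /* end-of-file */
        if (errno == EINTR)
            continue;                  /* interrupted by a signal: retry */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            break;                     /* nothing more to read right now */
        return -1;                     /* genuine error */
    }
    return (ssize_t)total;
}

/* Hypothetical demo on a pipe. Returns 1 if an empty pipe yields 0 bytes
 * without blocking and a written "hello" is read back completely. */
int drain_demo(void)
{
    int p[2];
    char buf[16];

    if (pipe(p) != 0)
        return -1;
    if (set_nonblocking(p[0]) != 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    ssize_t empty = read_available(p[0], buf, sizeof buf);  /* EAGAIN at once */
    ssize_t w = write(p[1], "hello", 5);
    ssize_t got = read_available(p[0], buf, sizeof buf);    /* 5 bytes, then EAGAIN */

    close(p[0]);
    close(p[1]);
    return (empty == 0 && w == 5 && got == 5) ? 1 : 0;
}
```

On Linux EAGAIN and EWOULDBLOCK have the same value, so checking both is redundant there, but it keeps the code portable to systems where they differ.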