Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is epoll faster than select?

Tags:

select

epoll

I have seen a lot of comparisons which says select have to walk through the fd list, and this is slow. But why doesn't epoll have to do this?

like image 453
amazingjxq Avatar asked Jun 28 '13 00:06

amazingjxq


People also ask

Is poll faster than select?

poll( ) is more efficient for large-valued file descriptors. Imagine watching a single file descriptor with the value 900 via select()—the kernel would have to check each bit of each passed-in set, up to the 900th bit.

What is the difference between poll and epoll?

Both of them potentially trigger on a single event. However, with poll, the user then has no choice but to iterate thru the entire list submitted to find the events, whereas with epoll you get a list returned containing only the actual events.

Why is epoll used?

epoll stands for event poll and is a Linux specific construct. It allows for a process to monitor multiple file descriptors and get notifications when I/O is possible on them. It allows for both edge-triggered as well as level-triggered notifications.

What is select poll epoll?

Unlike select and poll both of which only provide one API, epoll is not a single API but a group of 3 APIs. epoll_create and epoll_add are called to set up the epoll instance while epoll_wait can be called in a loop to constantly wait on the fds added by epoll_add . The complexity of the inner loop is O(ready fds) .


Video Answer


2 Answers

There's a lot of misinformation about this, but the real reason is this:

A typical server might be dealing with, say, 200 connections. It will service every connection that needs to have data written or read and then it will need to wait until there's more work to do. While it's waiting, it needs to be interrupted if data is received on any of those 200 connections.

With select, the kernel has to add the process to 200 wait lists, one for each connection. To do this, it needs a "thunk" to attach the process to the wait list. When the process finally does wake up, it needs to be removed from all 200 wait lists and all those thunks need to be freed.

By contrast, with epoll, the epoll socket itself has a wait list. The process needs to be put on only that one wait list using only one thunk. When the process wakes up, it needs to be removed from only one wait list and only one thunk needs to be freed.

To be clear, with epoll, the epoll socket itself has to be attached to each of those 200 connections. But this is done once, for each connection, when it is accepted in the first place. And this is torn down once, for each connection, when it is removed. By contrast, each call to select that blocks must add the process to every wait queue for every socket being monitored.

Ironically, with select, the largest cost comes from checking if sockets that have had no activity have had any activity. With epoll, there is no need to check sockets that have had no activity because if they did have activity, they would have informed the epoll socket when that activity happened. In a sense, select polls each socket each time you call select to see if there's any activity while epoll rigs it so that the socket activity itself notifies the process.

like image 158
David Schwartz Avatar answered Sep 22 '22 02:09

David Schwartz


The main difference between epoll and select is that in select() the list of file descriptors to wait on only exists for the duration of a single select() call, and the calling task only stays on the sockets' wait queues for the duration of a single call. In epoll, on the other hand, you create a single file descriptor that aggregates events from multiple other file descriptors you want to wait on, and so the list of monitored fd's is long-lasting, and tasks stay on socket wait queues across multiple system calls. Furthermore, since an epoll fd can be shared across multiple tasks, it is no longer a single task on the wait queue, but a structure that itself contains another wait queue, containing all processes currently waiting on the epoll fd. (In terms of implementation, this is abstracted over by the sockets' wait queues holding a function pointer and a void* data pointer to pass to that function).

So, to explain the mechanics a little more:

  1. An epoll file descriptor has a private struct eventpoll that keeps track of which fd's are attached to this fd. struct eventpoll also has a wait queue that keeps track of all processes that are currently epoll_waiting on this fd. struct epoll also has a list of all file descriptors that are currently available for reading or writing.
  2. When you add a file descriptor to an epoll fd using epoll_ctl(), epoll adds the struct eventpoll to that fd's wait queue. It also checks if the fd is currently ready for processing and adds it to the ready list, if so.
  3. When you wait on an epoll fd using epoll_wait, the kernel first checks the ready list, and returns immediately if any file descriptors are already ready. If not, it adds itself to the single wait queue inside struct eventpoll, and goes to sleep.
  4. When an event occurs on a socket that is being epoll()ed, it calls the epoll callback, which adds the file descriptor to the ready list, and also wakes up any waiters that are currently waiting on that struct eventpoll.

Obviously, a lot of careful locking is needed on struct eventpoll and the various lists and wait queues, but that's an implementation detail.

The important thing to note is that at no point above there did I describe a step that loops over all file descriptors of interest. By being entirely event-based and by using a long-lasting set of fd's and a ready list, epoll can avoid ever taking O(n) time for an operation, where n is the number of file descriptors being monitored.

like image 30
elite21 Avatar answered Sep 23 '22 02:09

elite21