Can someone explain to me how event-driven IO system calls like select, poll, and epoll relate to blocking vs non-blocking IO?
I don't understand how these concepts are related, if at all.
The select system call is supported in almost all Unixes and provides a means for userland applications to watch over a group of descriptors and get information about which subset of this group is ready for reading/writing. Its particular interface is a bit clunky and the implementation in most kernels is mediocre at best.
epoll is provided only in Linux for the same purpose, but is a huge improvement over select in terms of efficiency and programming interface. Other Unixes have their own specialised calls too, such as kqueue on the BSDs and macOS.
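To make the "which subset is ready" idea concrete, here is a minimal sketch using Python's select module, which wraps these system calls. The helper names ready_with_select and ready_with_epoll are made up for illustration, and the epoll part only runs where the module exposes it (Linux).

```python
import select
import socket

def ready_with_select(socks, timeout=0.0):
    """Return the subset of socks that select() reports as readable."""
    readable, _, _ = select.select(socks, [], [], timeout)
    return readable

def ready_with_epoll(socks, timeout=0.0):
    """Same question asked through epoll (Linux only)."""
    ep = select.epoll()
    try:
        for s in socks:
            ep.register(s.fileno(), select.EPOLLIN)
        ready_fds = {fd for fd, _events in ep.poll(timeout)}
        return [s for s in socks if s.fileno() in ready_fds]
    finally:
        ep.close()

# Two connected socket pairs; only the first one has data pending.
a_send, a_recv = socket.socketpair()
b_send, b_recv = socket.socketpair()
a_send.sendall(b"hello")

ready_select = ready_with_select([a_recv, b_recv], timeout=1.0)
ready_epoll = None
if hasattr(select, "epoll"):
    ready_epoll = ready_with_epoll([a_recv, b_recv], timeout=1.0)

print(ready_select == [a_recv])

for s in (a_send, a_recv, b_send, b_recv):
    s.close()
```

Both helpers answer the same question. The difference is that select is handed the whole descriptor list on every call, while epoll keeps a registered set inside the kernel between calls.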
That said, the event-driven IO system calls do not require either blocking or non-blocking descriptors. Blocking is a behaviour that affects system calls like read, write, accept and connect. select and epoll_wait do have blocking timeouts, but that is something unrelated to the descriptors.
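A small experiment that illustrates the point about timeouts, as a sketch only: the same idle socket is passed to select first in blocking mode and then in non-blocking mode. The helper wait_readable and the 0.2 second timeout are arbitrary choices for the demo.

```python
import select
import socket
import time

def wait_readable(sock, timeout):
    """Sleep in select() for up to `timeout` seconds; True if sock became readable."""
    readable, _, _ = select.select([sock], [], [], timeout)
    return bool(readable)

left, right = socket.socketpair()  # nothing is ever written to `left`

results = {}
for blocking in (True, False):
    right.setblocking(blocking)     # flip the descriptor's blocking mode
    start = time.monotonic()
    is_ready = wait_readable(right, timeout=0.2)
    results[blocking] = (is_ready, time.monotonic() - start)

for mode, (is_ready, elapsed) in results.items():
    print(f"blocking={mode}: ready={is_ready}, waited about {elapsed:.2f}s")

left.close()
right.close()
```

In both modes select waits out its own timeout and reports nothing ready. The descriptor's blocking flag only matters once you call something like recv on it.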
Of course, using these event-driven system calls with blocking descriptors is a bit odd, because you would expect to be able to read the data immediately, without blocking, once you have been notified that it is available. Always relying on a blocking descriptor not to block after you have been notified of its readiness is a bit risky, because race conditions are possible.
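Here is one way such a race can play out, as a hedged sketch. The "competitor" is just a dup of the same socket standing in for another thread or process that shares the descriptor, and try_recv is a hypothetical helper. Because the socket is non-blocking, losing the race gives an error you can handle instead of a hang.

```python
import select
import socket

def try_recv(sock, nbytes=4096):
    """recv() on a non-blocking socket; None means it would have blocked."""
    try:
        return sock.recv(nbytes)
    except BlockingIOError:
        return None

writer, reader = socket.socketpair()
reader.setblocking(False)

# Stand-in for another thread/process holding the same descriptor.
competitor = reader.dup()
competitor.setblocking(False)

writer.sendall(b"ping")
readable, _, _ = select.select([reader], [], [], 1.0)
notified = reader in readable      # select says: readable

stolen = try_recv(competitor)      # the other reader gets there first
ours = try_recv(reader)            # nothing left for us, but no hang

print(notified, stolen, ours)

for s in (writer, reader, competitor):
    s.close()
```

With a blocking descriptor, the last recv would have sat there until more data arrived, even though select had just reported the socket as readable.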
Non-blocking, event-driven IO can make server applications vastly more efficient because a separate thread is not needed for each descriptor (connection). Compare the Apache web server, in its traditional process- or thread-per-connection configurations, to Nginx or Lighttpd in terms of performance and you'll see the benefit.
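For a feel of what "no thread per connection" looks like, here is a toy single-threaded loop built on Python's selectors module (which picks epoll, kqueue or select depending on the platform). It is a sketch under simplifying assumptions: the uppercase-echo protocol, the run_once helper and the "listener"/"client" tags are invented for the demo, and a real server would buffer partial writes and watch for writability rather than calling send directly.

```python
import selectors
import socket

def run_once(sel, timeout=0.1):
    """Handle one batch of ready events; return how many messages were echoed."""
    echoed = 0
    for key, _events in sel.select(timeout):
        sock = key.fileobj
        try:
            if key.data == "listener":
                conn, _addr = sock.accept()
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ, "client")
            else:
                data = sock.recv(4096)
                if data:
                    sock.send(data.upper())  # tiny payloads only in this demo
                    echoed += 1
                else:                        # peer closed the connection
                    sel.unregister(sock)
                    sock.close()
        except BlockingIOError:
            continue                         # spurious readiness: just retry later
    return echoed

sel = selectors.DefaultSelector()
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))              # any free local port
listener.listen()
listener.setblocking(False)
sel.register(listener, selectors.EVENT_READ, "listener")

# Three clients, all served by the one loop below.
clients = [socket.create_connection(listener.getsockname()) for _ in range(3)]
for i, c in enumerate(clients):
    c.sendall(f"client {i}".encode())

echoed = 0
for _ in range(50):                          # bounded so the demo cannot spin forever
    if echoed == len(clients):
        break
    echoed += run_once(sel)

replies = []
for c in clients:
    c.settimeout(2.0)
    replies.append(c.recv(4096))
print(replies)

for c in clients:
    c.close()
for key in list(sel.get_map().values()):
    key.fileobj.close()
sel.close()
```

All three connections are accepted and answered from one thread; the loop only ever touches a socket after the selector has reported it ready.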
They're largely unrelated, except that you may want to use non-blocking file descriptors with event-driven IO for the following reasons:
Old versions of Linux definitely have bugs in the kernel where read can block even after select indicated a socket was readable (it happened with UDP sockets and packets with bad checksums). Current versions of Linux may still have some such bugs; I'm not sure.
If there's any possibility that other processes have access to your file descriptors and will read/write to them, or if your program is multi-threaded and other threads might do so, then there is a race condition between select determining that the file descriptor is readable/writable and your program performing IO on it, which could result in blocking.
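You almost surely want to make a socket non-blocking before calling connect; otherwise you'll block until the connection is made. Use select for writing to determine when it's successfully connected, and select for errors to determine if the connection failed. Note that on most Unix systems a failed connect also marks the socket as writable, so check SO_ERROR with getsockopt to find out which of the two happened.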
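The non-blocking connect pattern from the last point could be sketched like this. connect_nonblocking is a hypothetical helper, the loopback listener and the "dead" port exist only for the demo, and the SO_ERROR check is there because platforms differ in whether a failed connect shows up as writable or in the error set.

```python
import errno
import os
import select
import socket

def connect_nonblocking(address, timeout=5.0):
    """Start a non-blocking connect, then wait in select() for the outcome.

    Returns the connected socket; raises OSError on failure or timeout.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    err = sock.connect_ex(address)           # returns immediately
    if err not in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
        sock.close()
        raise OSError(err, os.strerror(err))
    # Writable means "finished" (success or failure); also watch the error set.
    _, writable, errored = select.select([], [sock], [sock], timeout)
    if not writable and not errored:
        sock.close()
        raise TimeoutError("connect timed out")
    err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err != 0:
        sock.close()
        raise OSError(err, os.strerror(err))
    return sock

# Success case: a local listener on a free port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen()
listen_addr = listener.getsockname()
conn = connect_nonblocking(listen_addr)
connected_to = conn.getpeername()
conn.close()
listener.close()

# Failure case: a port that was just released, so nothing should be listening.
probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
probe.bind(("127.0.0.1", 0))
dead_addr = probe.getsockname()
probe.close()
try:
    connect_nonblocking(dead_addr, timeout=2.0)
    failure = None
except OSError as exc:
    failure = exc

print(connected_to == listen_addr, failure)
```

In a real event loop you would register the socket for writability alongside your other descriptors instead of calling select just for it; the helper only isolates the connect step.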