In Linux, when you make a blocking I/O call like read() or accept(), what actually happens?
My thoughts: the process gets taken out of the run queue and put into a waiting (blocked) state on some wait queue. Then, when a TCP connection is made (for accept) or the hard drive is ready for a file read, a hardware interrupt is raised which lets the waiting processes wake up and run. (In the case of a file read, how does Linux know which processes to awaken, as there could be lots of processes waiting on different files?) Or perhaps, instead of hardware interrupts, each individual process polls to check availability. Not sure, help?
The block I/O layer is the kernel subsystem in charge of managing input/output operations performed on block devices. A dedicated kernel component is needed because block devices are more complex to manage than, for example, character devices.
Non-blocking I/O under the hood

Most non-blocking frameworks use an infinite loop that constantly checks (polls) whether data has arrived from I/O. This is often called the event loop: literally a while(true) loop that, on each iteration, checks whether data is ready to read from a network socket.
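For illustration only (a sketch, not taken from any answer here; sock_fd and the 1-second timeout are placeholder choices), a minimal event loop over a single socket using poll(2) might look like this in C:

#include <poll.h>
#include <stdio.h>
#include <unistd.h>

/* Minimal event-loop sketch: repeatedly ask the kernel whether
 * sock_fd is readable, and read only when it is. */
void event_loop(int sock_fd)
{
    char buf[4096];
    struct pollfd pfd = { .fd = sock_fd, .events = POLLIN };

    for (;;) {
        int n = poll(&pfd, 1, 1000);   /* wait up to 1000 ms for readiness */
        if (n < 0) {
            perror("poll");
            break;
        }
        if (n == 0)
            continue;                  /* timed out: nothing ready yet */
        if (pfd.revents & POLLIN) {
            ssize_t r = read(sock_fd, buf, sizeof(buf));
            if (r <= 0)
                break;                 /* EOF or error */
            /* ... handle r bytes of data ... */
        }
    }
}

Real frameworks watch many descriptors at once (and typically use epoll rather than poll), but the shape of the loop is the same.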
Some I/O requests are blocking, which means control does not return to the application until the I/O operation is complete. In such cases, the CPU can be kept busy by running other threads or processes in the meantime.
Well, blocking I/O means that a given thread cannot do anything more until the I/O has fully completed (in the case of sockets this wait could be a long time). Non-blocking I/O means an I/O request is queued straight away and the function returns immediately; the actual I/O is then processed at some later point by the kernel.
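To make that concrete from user space (a sketch under assumptions: fd is some readable descriptor, and try_read is a made-up helper name), the same read() either sleeps or fails fast depending on whether O_NONBLOCK is set:

#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Sketch: switch fd to non-blocking mode, then attempt a read.
 * With O_NONBLOCK set, read() returns -1 with errno == EAGAIN
 * when no data is available, instead of putting the thread to sleep. */
ssize_t try_read(int fd, char *buf, size_t len)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;

    ssize_t r = read(fd, buf, len);
    if (r < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
        /* No data yet. A blocking read would have slept on the
         * driver's wait queue here until woken by an interrupt. */
    }
    return r;
}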
Each Linux device driver seems to be implemented slightly differently, and the preferred way seems to vary every few kernel releases as safer/faster kernel features are added, but the general pattern is:
A typical example (slightly simplified):
In the driver at initialisation:
init_waitqueue_head(&readers_wait_q);
In the read function of a driver:
if ((filp->f_flags & O_NONBLOCK) && read_avail == 0)
{
    /* caller asked not to block and there is no data yet */
    return -EAGAIN;
}
/* sleep until read_avail becomes non-zero; note the wait queue
 * is passed by name, not by address */
if (wait_event_interruptible(readers_wait_q, read_avail != 0))
{
    /* a signal interrupted the wait, return */
    return -ERESTARTSYS;
}
to_copy = min(user_max_read, read_avail);
if (copy_to_user(user_buf, read_ptr, to_copy))
    return -EFAULT;
Then the interrupt handler just issues:
wake_up_interruptible(&readers_wait_q);
Note that wait_event_interruptible() is a macro that hides a loop: it checks the condition (read_avail != 0 in this case) and, if the task is woken while the condition is still false, puts it back on the wait queue and sleeps again.
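Roughly, and simplified from what the real macro in <linux/wait.h> generates (a sketch, not the actual kernel source), that hidden loop looks like:

DEFINE_WAIT(wait);
int ret = 0;

for (;;) {
    /* add ourselves to the queue and mark the task as sleeping */
    prepare_to_wait(&readers_wait_q, &wait, TASK_INTERRUPTIBLE);
    if (read_avail != 0)            /* the caller's condition */
        break;
    if (signal_pending(current)) {  /* woken by a signal instead */
        ret = -ERESTARTSYS;
        break;
    }
    schedule();                     /* yield the CPU until woken */
}
finish_wait(&readers_wait_q, &wait);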
As mentioned, there are a number of variations on this pattern. The main one: if there is potentially a lot of work for the interrupt handler to do, it does the bare minimum itself and defers the rest to a work queue or tasklet (generally known as the "bottom half"), and it is this deferred work that wakes the waiting threads. A sketch of that variation follows.
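As an illustration of deferring to a work queue (a sketch; my_work, my_work_fn and my_irq_handler are made-up names, and the device-specific steps are elided):

#include <linux/interrupt.h>
#include <linux/workqueue.h>

static struct work_struct my_work;

/* Bottom half: runs later in process context, does the heavy
 * lifting, then wakes any readers sleeping on the wait queue. */
static void my_work_fn(struct work_struct *work)
{
    /* ... pull data out of the device, update read_avail ... */
    wake_up_interruptible(&readers_wait_q);
}

/* Top half: the interrupt handler itself does the bare minimum,
 * acknowledging the device and handing the rest to the work queue. */
static irqreturn_t my_irq_handler(int irq, void *dev_id)
{
    /* ... acknowledge/clear the interrupt on the hardware ... */
    schedule_work(&my_work);
    return IRQ_HANDLED;
}

At initialisation, alongside init_waitqueue_head(), the driver would also call INIT_WORK(&my_work, my_work_fn).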
See the Linux Device Drivers book (LDD3) for more details; the PDF is available here: http://lwn.net/Kernel/LDD3