What is the difference between io_submit and a file opened with O_ASYNC?

Tags: linux, asynchronous, io, linux-kernel, aio

I am reading this tutorial on asynchronous disk file I/O, but it doesn't make things clear; it actually leaves me more confused.

There are two different async I/O models according to the tutorial:

1. Asynchronous blocking I/O, where you open a file with O_ASYNC, then use epoll/poll/select.
2. Asynchronous I/O using glibc's AIO.

Since glibc implements AIO with a thread pool, by "AIO" in this question I mean kernel AIO, i.e. io_submit.

At least from a conceptual point of view, there seems to be no big difference -- io_submit lets you issue multiple I/O requests at once, while using read with O_ASYNC you can issue only one request at a given file position.

This guide also mentions using epoll as an alternative to Linux AIO:

epoll. Linux has limited support for using epoll as a mechanism for asynchronous I/O. For reads to a file opened in buffered mode (that is, without O_DIRECT), if the file is opened as O_NONBLOCK, then a read will return EAGAIN until the relevant part is in memory. Writes to a buffered file are usually immediate, as they are written out with another writeback thread. However, these mechanisms don’t give the level of control over I/O that direct I/O gives.

What is the issue with using epoll as an AIO alternative? Or, in other words, what problem do we need [the new interface] io_submit to solve?


asked May 05 '13 23:05

Chang

1 Answer

In my opinion, the critical point of the io_* API is the ability to achieve higher I/O throughput through two main measures:

1. Minimizing the number of system calls in the application's I/O loop. A batch of several requests can be submitted at once; at some later time, the application can come back and examine the outcomes of the individual requests in one go using io_getevents(). Importantly, io_getevents() returns information on each individual I/O transaction, rather than the vague "fd x has pending changes" bit of information that epoll_wait() returns on each invocation.
2. Letting the kernel I/O scheduler reorder requests to make better use of the hardware. The application may even pass down hints on how to reorder the requests using the aio_reqprio field in struct iocb. Necessarily, if we allow I/O requests to be reordered, we need to give the application an API to query whether particular high-priority requests have already completed (hence io_getevents()).

It can be said that io_getevents() is the really important piece of functionality, while io_submit() is a handy companion that lets you make efficient use of it.
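To make the first point concrete, here is a minimal sketch of a batched read: one io_submit() call queues several reads, and one io_getevents() call collects a separate result for each. The names batched_read, NREQ, CHUNK and the sys_* wrappers are made up for this illustration (glibc provides no wrappers for these syscalls), and error handling is kept to the bare minimum. Note also that on a file opened without O_DIRECT, as here, the kernel may carry out the reads synchronously inside io_submit(), so this shows the shape of the API rather than a guaranteed speedup.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <linux/aio_abi.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* glibc has no wrappers for kernel AIO, so invoke the syscalls directly.
 * These sys_* helpers are hypothetical names for this sketch. */
static long sys_io_setup(unsigned nr_events, aio_context_t *ctx) {
    return syscall(SYS_io_setup, nr_events, ctx);
}
static long sys_io_submit(aio_context_t ctx, long nr, struct iocb **iocbs) {
    return syscall(SYS_io_submit, ctx, nr, iocbs);
}
static long sys_io_getevents(aio_context_t ctx, long min_nr, long nr,
                             struct io_event *events, struct timespec *tmo) {
    return syscall(SYS_io_getevents, ctx, min_nr, nr, events, tmo);
}
static long sys_io_destroy(aio_context_t ctx) {
    return syscall(SYS_io_destroy, ctx);
}

#define NREQ 4      /* assumed batch size, chosen for illustration */
#define CHUNK 4096  /* assumed bytes per request */

/* Read NREQ consecutive chunks of `path` using one io_submit() and one
 * io_getevents(). results[i] receives the byte count (or a negative errno)
 * of request i. Returns 0 on success, -1 on failure. */
int batched_read(const char *path, char bufs[NREQ][CHUNK],
                 long long results[NREQ]) {
    aio_context_t ctx = 0;  /* must be zero before io_setup() */
    struct iocb cbs[NREQ];
    struct iocb *ptrs[NREQ];
    struct io_event events[NREQ];
    int rc = -1;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (sys_io_setup(NREQ, &ctx) < 0) {
        close(fd);
        return -1;
    }

    memset(cbs, 0, sizeof cbs);
    for (int i = 0; i < NREQ; i++) {
        cbs[i].aio_fildes = fd;
        cbs[i].aio_lio_opcode = IOCB_CMD_PREAD;
        cbs[i].aio_buf = (uint64_t)(uintptr_t)bufs[i];
        cbs[i].aio_nbytes = CHUNK;
        cbs[i].aio_offset = (long long)i * CHUNK;
        cbs[i].aio_data = (uint64_t)i;  /* tag echoed back in io_event.data */
        ptrs[i] = &cbs[i];
    }

    /* One syscall queues the whole batch... */
    if (sys_io_submit(ctx, NREQ, ptrs) == NREQ) {
        /* ...and one syscall waits for and returns every completion. */
        if (sys_io_getevents(ctx, NREQ, NREQ, events, NULL) == NREQ) {
            for (int i = 0; i < NREQ; i++) {
                /* Completions may arrive in any order; use the tag. */
                assert(events[i].data < NREQ);
                results[events[i].data] = events[i].res;
            }
            rc = 0;
        }
    }

    sys_io_destroy(ctx);  /* waits for any still-outstanding requests */
    close(fd);
    return rc;
}
```

A caller would pass a path plus a `char bufs[NREQ][CHUNK]` array and a `long long results[NREQ]` array, then inspect results[i] for each request. The same work done with plain pread() would cost one syscall per chunk, and a readiness notification would not say which of the requests had finished.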

answered Oct 05 '22 11:10

oakad
