How does list I/O writev internally work?

Tags: c, io, posix

The writev function takes an array of struct iovec as an input argument:

ssize_t writev(int fd, const struct iovec *iov, int iovcnt);

The input is a list of memory buffers that need to be written to a file (say). What I want to know is:

Does writev internally do this:

for (i = 0; i < iovcnt; i++) write(fd, iov[i].iov_base, iov[i].iov_len);

such that every element of iov is written to file in a separate I/O call? Or does writev write everything to file in a single I/O call?
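For concreteness, here is a minimal sketch of the kind of call being asked about. The helper name write_two_parts and its parameters are made up for illustration; only writev and struct iovec come from POSIX.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper (not part of any library): hands a header and a body
 * to the kernel with one writev call.
 * Returns the number of bytes written, or -1 on error with errno set. */
ssize_t write_two_parts(int fd, const char *header, const char *body)
{
    struct iovec iov[2];

    /* iov_base is a plain void *, so the casts drop const;
     * writev only reads from these buffers. */
    iov[0].iov_base = (void *)header;
    iov[0].iov_len  = strlen(header);
    iov[1].iov_base = (void *)body;
    iov[1].iov_len  = strlen(body);

    return writev(fd, iov, 2);
}
```

The two strings live in separate buffers, yet the caller makes a single call. The question is whether the implementation turns that into one I/O operation or several.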

jitihsk asked Feb 17 '12 22:02


2 Answers

Per the standards, the for loop you mentioned is not a valid implementation of writev, for several reasons:

  1. The loop could fail to finish writing one iov before proceeding to the next, in the event of a short write. This could be worked around by making the loop more elaborate.
  2. The loop could break atomicity for pipes: if the total write length is at most PIPE_BUF, the pipe write is required to be atomic, but the loop splits it into several writes that other writers can interleave with. This issue cannot be worked around except by copying all the iov entries into a single buffer before writing whenever the total length is at most PIPE_BUF.
  3. The loop could block in cases where a single writev call would be required to perform a partial write and return without blocking. As far as I know, this issue would be impossible to work around in the general case.
  4. Possibly other reasons I haven't thought of.
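As an illustration of point 1, the "more elaborate" loop might look roughly like the sketch below. The name writev_loop is hypothetical, and this is deliberately not a conforming writev: it still has the atomicity and blocking problems from points 2 and 3.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical emulation, NOT a conforming writev.
 * Finishes each iov entry before starting the next, retrying short writes.
 * Returns the total bytes written, or -1 if an error occurred before
 * anything was written. */
ssize_t writev_loop(int fd, const struct iovec *iov, int iovcnt)
{
    ssize_t total = 0;

    for (int i = 0; i < iovcnt; i++) {
        const char *p = iov[i].iov_base;
        size_t left = iov[i].iov_len;

        while (left > 0) {
            ssize_t n = write(fd, p, left);
            if (n < 0) {
                if (errno == EINTR)
                    continue;               /* interrupted: retry this chunk */
                return total > 0 ? total : -1;
            }
            if (n == 0)
                return total;               /* no progress: stop, don't spin */
            p += n;
            left -= (size_t)n;
            total += n;
        }
    }
    return total;
}
```

Each entry goes out through its own write call, so on a pipe another writer's data can land between two entries even when the combined length is at most PIPE_BUF.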

I'm not sure about point #3, but it definitely exists in the opposite direction, when reading. Calling read in a loop could block if a terminal has some data (shorter than the total iov length) available followed by an EOF indicator; calling readv should return immediately with a partial read in this case. However, due to a bug in Linux, readv on terminals is actually implemented as a read loop in kernelspace, and it does exhibit this blocking bug. I had to work around this bug in implementing musl's stdio:

http://git.etalabs.net/cgi-bin/gitweb.cgi?p=musl;a=commit;h=2cff36a84f268c09f4c9dc5a1340652c8e298dc0

To answer the last part of your question:

Or does writev write everything to file in a single I/O call?

In all cases, a conformant writev implementation will be a single syscall. Getting down to how it's implemented on Linux: for ordinary files and for most devices, the underlying file driver has methods that implement iov-style I/O directly, without any sort of internal loop. But the terminal driver on Linux is highly outdated and lacks the modern I/O methods, causing the kernel to fall back to a write/read loop for writev/readv when operating on a terminal.

R.. GitHub STOP HELPING ICE answered Sep 22 '22 18:09


The most direct way to find out how code works is to read the source code.

See http://www.oschina.net/code/explore/glibc-2.9/sysdeps/posix/writev.c

It simply allocates a buffer with alloca() or malloc(), copies all the vectors into it, and calls write() once.

That's how it works. Nothing mysterious.

HardySimpson answered Sep 22 '22 18:09