In scatter/gather I/O (i.e. <code>readv</code> and <code>writev</code>), Linux reads into multiple buffers and writes from multiple buffers. If, say, I have a vector of 3 buffers, I can use <code>readv</code>, or I can use a single buffer whose size is the combined size of the 3 buffers and do an <code>fread</code>. Hence, I am confused: in which cases should scatter/gather be used, and when should a single large buffer be used?

The main conveniences offered by <code>readv</code> and <code>writev</code> are: <ol> <li>They work with noncontiguous blocks of data, i.e. the buffers need not be part of one array and can be allocated separately.</li> <li>The I/O is 'atomic', i.e. if you do a <code>writev</code>, all the elements in the vector are written in one contiguous operation, and writes done by other processes will not occur in between them (the Linux man page notes an exception for pipes).</li> </ol>
For example, say your data is naturally segmented and comes from different sources:
<pre class="prettyprint"><code>struct foo *my_foo;
struct bar *my_bar;
struct baz *my_baz;

my_foo = get_my_foo();
my_bar = get_my_bar();
my_baz = get_my_baz();
</code></pre>
Now, the three 'buffers' are not one big contiguous block, but you want to write them contiguously into a file for whatever reason (say, they are fields in a file header for a file format). If you use <code>write</code>, you have to choose between: <ol> <li>Copying them into one block of memory using, say, <code>memcpy</code> (copy overhead), followed by a single <code>write</code> call. Then the write will be atomic.</li> <li>Making three separate calls to <code>write</code> (system-call overhead). Also, <code>write</code> calls from other processes can land between these writes (not atomic).</li> </ol>
If you use <code>writev</code> instead, it's all good: <ol> <li>You make exactly one system call, and no <code>memcpy</code> is needed to build a single buffer from the three.</li> <li>The three buffers are written atomically, as one block write, i.e. if other processes also write, their writes will not come in between the writes of the three vectors.</li> </ol>
So you would do something like:
<pre class="prettyprint"><code>struct iovec iov[3];
ssize_t bytes_written;

iov[0].iov_base = my_foo;
iov[0].iov_len = sizeof (struct foo);
iov[1].iov_base = my_bar;
iov[1].iov_len = sizeof (struct bar);
iov[2].iov_base = my_baz;
iov[2].iov_len = sizeof (struct baz);

bytes_written = writev (fd, iov, 3);
</code></pre>
As with <code>write</code>, check the return value: <code>writev</code> returns -1 on error and can return fewer bytes than the total requested.
Sources: <ol> <li>http://pubs.opengroup.org/onlinepubs/009604499/functions/writev.html</li> <li>http://linux.die.net/man/2/readv</li> </ol>
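The gather write described in the answer, and the matching scatter read with <code>readv</code>, can be sketched as a small round trip. This is only an illustration under stated assumptions: the field layouts of <code>struct foo</code>, <code>struct bar</code> and <code>struct baz</code>, the helper names <code>write_header</code>, <code>read_header</code> and <code>roundtrip_demo</code>, and the temporary file path are all hypothetical and not from the original answer. A short write or short read is simply treated as a failure here rather than retried.

```c
#define _XOPEN_SOURCE 700   /* assumption: needed for mkstemp() in strict -std= modes */
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>        /* struct iovec, readv(), writev() */
#include <unistd.h>

/* Hypothetical header segments standing in for the answer's structs. */
struct foo { int magic; };
struct bar { short version; short flags; };
struct baz { char name[8]; };

/* Gather: write three separately stored segments with one writev() call.
 * Returns 0 on success, -1 on error or short write. */
int write_header(int fd, const struct foo *f, const struct bar *b,
                 const struct baz *z)
{
    struct iovec iov[3];
    iov[0].iov_base = (void *)f; iov[0].iov_len = sizeof *f;
    iov[1].iov_base = (void *)b; iov[1].iov_len = sizeof *b;
    iov[2].iov_base = (void *)z; iov[2].iov_len = sizeof *z;

    size_t total = sizeof *f + sizeof *b + sizeof *z;
    ssize_t n = writev(fd, iov, 3);
    return (n == (ssize_t)total) ? 0 : -1;
}

/* Scatter: read the same bytes back into three separate objects
 * with one readv() call. Returns 0 on success, -1 otherwise. */
int read_header(int fd, struct foo *f, struct bar *b, struct baz *z)
{
    struct iovec iov[3];
    iov[0].iov_base = f; iov[0].iov_len = sizeof *f;
    iov[1].iov_base = b; iov[1].iov_len = sizeof *b;
    iov[2].iov_base = z; iov[2].iov_len = sizeof *z;

    size_t total = sizeof *f + sizeof *b + sizeof *z;
    ssize_t n = readv(fd, iov, 3);
    return (n == (ssize_t)total) ? 0 : -1;
}

/* Write the header to a temporary file, rewind, read it back,
 * and compare. Returns 0 if the round trip preserved the data. */
int roundtrip_demo(void)
{
    struct foo f = { 0x464f4f21 };
    struct bar b = { 1, 0 };
    struct baz z = { "header" };
    struct foo f2;
    struct bar b2;
    struct baz z2;

    char path[] = "/tmp/iov_demo_XXXXXX";   /* hypothetical location */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);   /* file disappears once fd is closed */

    int rc = -1;
    if (write_header(fd, &f, &b, &z) == 0 &&
        lseek(fd, 0, SEEK_SET) == 0 &&
        read_header(fd, &f2, &b2, &z2) == 0 &&
        f2.magic == f.magic &&
        b2.version == b.version && b2.flags == b.flags &&
        memcmp(z2.name, z.name, sizeof z.name) == 0)
        rc = 0;

    close(fd);
    return rc;
}
```

Calling <code>roundtrip_demo()</code> from a <code>main</code> returns 0 when the bytes read back match what was written. Neither direction needs an intermediate combined buffer or a <code>memcpy</code>, which is the case the answer argues for.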

Linux: When to use scatter/gather IO (readv, writev) vs a large buffer with fread

1 Answer


answered Oct 13 '22 01:10

ArjunShankar


Tags:

linux

io

Jimm
