
What is the proper buffer size for the 'write' function?

Tags: c, file, linux, io

I am using the low-level I/O function 'write' to write data to disk in my code (C on Linux). I first accumulate the data in a memory buffer, and then call 'write' to flush it to disk when the buffer is full. What is the best buffer size for 'write'? According to my tests, a bigger buffer is not always faster, so I am looking for guidance here.

Mickey Shine asked Mar 12 '12 06:03





2 Answers

There is probably some advantage in making each write a multiple of the filesystem block size, especially if you are updating a file in place. If you write only part of a block to a file, the OS has to read the old block, merge in the new contents, and then write it back out. This doesn't necessarily happen if you rapidly write small pieces in sequence, because the updates are applied to buffers in memory which are flushed later. Still, once in a while you could trigger some inefficiency if each write operation does not fill a whole, properly aligned block (a length that is a multiple of the block size, at an offset that is a multiple of the block size).
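As a rough illustration of sizing a buffer in whole blocks, here is a minimal sketch. It assumes that the `st_blksize` field reported by `fstat` is a reasonable stand-in for the filesystem block size (it is documented as the preferred I/O block size, which may differ). The helper names `round_up_to_block` and `preferred_block_size` are hypothetical and not part of any library.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: round `want` up to the next multiple of `block`.
 * `block` must be greater than zero. */
size_t round_up_to_block(size_t want, size_t block)
{
    assert(block > 0);
    return ((want + block - 1) / block) * block;
}

/* Hypothetical helper: return the preferred I/O block size for the file
 * behind `fd`, or `fallback` if fstat() fails or reports a non-positive
 * value. Assumption: st_blksize approximates the filesystem block size. */
size_t preferred_block_size(int fd, size_t fallback)
{
    struct stat st;

    if (fstat(fd, &st) != 0 || st.st_blksize <= 0)
        return fallback;
    return (size_t)st.st_blksize;
}
```

Given a descriptor `fd` from `open`, something like `round_up_to_block(64 * 1024, preferred_block_size(fd, 4096))` would yield a buffer size of at least 64 KiB that is a whole number of blocks. Keeping the file offset block-aligned as well is up to the caller, for example by only ever writing full buffers from offset 0.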

This issue of transfer size does not necessarily go away with mmap. If you map a file and then memcpy some data into the mapping, you make a page dirty. That page has to be flushed at some later time, and exactly when is indeterminate. If you do another memcpy which touches the same page, the page may have been flushed and be clean by then, so you make it dirty again and it gets written twice. Page-aligned copies whose sizes are multiples of the page size are the way to go.
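A minimal sketch of the page-at-a-time copy pattern might look like the following. It assumes the source data is already available as whole pages, and the function name `copy_pages_via_mmap` is hypothetical. It is one possible arrangement, not a definitive implementation, and it calls `msync` explicitly rather than leaving the flush time to the kernel.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy `npages` whole pages from `src` into the file
 * at `path` through a shared mapping. Returns 0 on success, -1 on failure. */
int copy_pages_via_mmap(const char *path, const void *src, size_t npages)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || npages == 0)
        return -1;
    size_t len = npages * (size_t)page;

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* The file must be at least as long as the mapping before it is touched. */
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    /* mmap returns a page-aligned address, so offset 0 is page-aligned. */
    char *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (map == MAP_FAILED)
        return -1;

    /* Each memcpy covers exactly one page, so each page is dirtied once. */
    for (size_t i = 0; i < npages; i++) {
        size_t off = i * (size_t)page;
        memcpy(map + off, (const char *)src + off, (size_t)page);
    }

    int rc = msync(map, len, MS_SYNC); /* flush the dirty pages now */
    if (munmap(map, len) != 0)
        rc = -1;
    return rc;
}
```

The point of the loop is only to show copies that start on a page boundary and span a whole page. A single `memcpy` of `len` bytes would satisfy the same alignment condition.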

Kaz answered Oct 15 '22 01:10



You'll want it to be a multiple of the CPU page size, in order to use memory as efficiently as possible.

But ideally you want to use mmap instead, so that you never have to deal with buffers yourself.
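For the page-size multiple mentioned above, one way to obtain such a buffer is sketched below. It assumes `sysconf(_SC_PAGESIZE)` reports the page size and uses `posix_memalign` so the buffer also starts on a page boundary. The name `alloc_page_buffer` is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: allocate a buffer of `npages` pages whose address is
 * aligned to the page size. On success, stores the length in bytes in
 * *len_out and returns the buffer (release it with free()).
 * Returns NULL on failure. */
void *alloc_page_buffer(size_t npages, size_t *len_out)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || npages == 0)
        return NULL;

    void *buf = NULL;
    size_t len = npages * (size_t)page;

    /* posix_memalign returns 0 on success and an error number otherwise. */
    if (posix_memalign(&buf, (size_t)page, len) != 0)
        return NULL;

    *len_out = len;
    return buf;
}
```

How many pages to request is a separate question that this sketch does not answer. Given the asker's observation that bigger is not always faster, measuring a few sizes on the target system is probably the only reliable guide.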

Ignacio Vazquez-Abrams answered Oct 15 '22 01:10
