
How are buffer sizes chosen?

Tags:

c

linux

I'm reading TLPI (The Linux Programming Interface), and it seems to use 1024 as a standard buffer size for file I/O operations. I'm wondering why this size is chosen. Is there a "best" buffer size?

To elaborate and hopefully get some further insight: in which situations would using 512 or 2048 bytes break something? I assume that if 1024 is safe, 512 is safe as well, just slower, because you'll have to move the data in twice as many steps. So, if my thinking is correct, the larger the buffer size, the faster the operation; but does this also raise the potential for failure?

Evan Rose, asked Jun 28 '14

2 Answers

Although I've usually seen philosophical questions like yours voted down and closed, I still like them and the discussion that they encourage. (I was trying to fit everything in the comment, but it wouldn't quite work.)

The short answer is that reasonably small buffers are safe, and they are typically chosen as the least common denominator of the structure being operated on; 1,024 in your example, since most file systems allocate blocks in multiples of one kilobyte (1,024 bytes).

The longer answer is that variable buffer sizes (usually larger than the safe default values) play a very deep role in the optimal performance of software as it interacts with hardware, the operating system, and even the type of workload. There is no single best size for all hardware and operating-system environments. Tuning buffer sizes is a cheap way to make programs work better on your system.

As such, it is also used by some dishonest software developers to pretend that their software is better. For example, consider web server software. You could claim your web server software has better performance by tuning all of its buffer sizes, then comparing it to an Apache installation with a default configuration. You can also bottleneck software externally by applying kernel tuning that matches one server's setup but causes another one to use an extra frame for each request.

Tuning in this way is also dangerous because it may make it easier to perform denial-of-service attacks such as Slowloris-style resource exhaustion. So once again, smaller is safer, although not necessarily "better", depending on how you prioritize safety versus performance.

Joseph Myers, answered Sep 16 '22

Any time Linux code refers to block size, it is almost always 1024 bytes. Linux uses 1024-byte blocks for the buffer cache, etc.

Of course, Linux filesystems often implement alternate block sizes. The default ext3 filesystem block size, for example, is 4096 bytes.

Mahonri Moriancumer, answered Sep 19 '22