
What's so special about 4kb for a buffer length?

Tags:

.net

io

buffer

When performing file IO in .NET, it seems that 95% of the examples that I see use a 4096 byte buffer. What's so special about 4kb for a buffer length? Or is it just a convention like using i for the index in a for loop?

sheikhjabootie asked Jul 05 '11 05:07



2 Answers

That is because 4K is the default cluster size for NTFS volumes up to 16 TB. So when picking a buffer size, it makes sense to allocate the buffer in multiples of the cluster size.

A cluster is the smallest unit of allocation for a file, so if a file contains only 1 byte it will consume 4K of physical disk space, and a file of 5K will result in an 8K allocation.
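
The rounding described above can be sketched as follows. This is only an illustration of the simple model in this answer: the `ClusterMath` class and `AllocatedSize` method are hypothetical names, the 4096-byte cluster size is an assumption, and real file systems may differ in the details (NTFS, for example, can keep very small files inside its own metadata).

```csharp
using System;

static class ClusterMath
{
    // Hypothetical helper: rounds a file length up to the next
    // multiple of the cluster size (both values in bytes).
    public static long AllocatedSize(long fileLength, long clusterSize)
    {
        if (clusterSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(clusterSize));
        if (fileLength <= 0)
            return 0;

        // Integer ceiling division: number of clusters needed.
        long clusters = (fileLength + clusterSize - 1) / clusterSize;
        return clusters * clusterSize;
    }

    static void Main()
    {
        const long clusterSize = 4096; // assumed 4K cluster size

        Console.WriteLine(AllocatedSize(1, clusterSize));        // 4096
        Console.WriteLine(AllocatedSize(5 * 1024, clusterSize)); // 8192
    }
}
```

A file that is already an exact multiple of the cluster size is not rounded up any further, so a 4096-byte file still occupies a single cluster in this model.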


Update: Added a code sample for getting the cluster size of a drive
using System;
using System.Runtime.InteropServices;

class Program
{
  // The native parameters are DWORDs (unsigned 32-bit), so they are declared as uint.
  // CharSet.Unicode binds to the wide-character export, GetDiskFreeSpaceW.
  [DllImport("kernel32", SetLastError = true, CharSet = CharSet.Unicode)]
  [return: MarshalAs(UnmanagedType.Bool)]
  static extern bool GetDiskFreeSpace(
    string rootPathName,
    out uint sectorsPerCluster,
    out uint bytesPerSector,
    out uint numberOfFreeClusters,
    out uint totalNumberOfClusters);

  static void Main(string[] args)
  {
    uint sectorsPerCluster;
    uint bytesPerSector;
    uint numberOfFreeClusters;
    uint totalNumberOfClusters;

    if (GetDiskFreeSpace("C:\\",
          out sectorsPerCluster,
          out bytesPerSector,
          out numberOfFreeClusters,
          out totalNumberOfClusters))
    {
      // Cluster size in bytes = sectors per cluster * bytes per sector.
      Console.WriteLine("Cluster size = {0} bytes",
        sectorsPerCluster * bytesPerSector);
    }
    else
    {
      // Report the Win32 error code in hexadecimal.
      Console.WriteLine("GetDiskFreeSpace failed: 0x{0:x}",
        Marshal.GetLastWin32Error());
    }

    // Keep the console window open until a key is pressed.
    Console.ReadKey();
  }
}
Chris Taylor answered Oct 12 '22 23:10



A few factors:

  • More often than not, 4K is the cluster size on a disk drive.
  • 4K is the most common page size on Windows, so the OS can memory map files in 4K chunks.
  • A 4K page can often be transferred from drive to OS to user process without being copied.
  • Windows caches files in RAM using 4K buffers.

Most importantly, over the years a lot of people have used 4K as their buffer length because of the factors above, so a lot of IO and OS code is optimised for 4K buffers!
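
For context, the pattern the question refers to usually looks something like the sketch below. The `BufferedCopy` class and `Copy` method are hypothetical names used only for illustration, and in-memory streams stand in for real files so that the example is self-contained. The 4096 value is the convention discussed here, not a requirement of the `Stream` API.

```csharp
using System;
using System.IO;

static class BufferedCopy
{
    // Conventional 4K buffer; any positive size would work.
    const int BufferSize = 4096;

    // Copies everything from source to destination in BufferSize chunks
    // and returns the number of bytes copied.
    public static long Copy(Stream source, Stream destination)
    {
        var buffer = new byte[BufferSize];
        long total = 0;
        int read;

        // Read returns 0 once the end of the stream is reached.
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            destination.Write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    static void Main()
    {
        // MemoryStream stands in for a FileStream in this sketch.
        var input = new MemoryStream(new byte[10000]);
        var output = new MemoryStream();

        long copied = Copy(input, output);
        Console.WriteLine("Copied {0} bytes", copied); // Copied 10000 bytes
    }
}
```

Whether 4096 is the best choice for a given workload is something to measure; the points above explain why it is a reasonable default rather than a guaranteed optimum.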

Ian Ringrose answered Oct 13 '22 00:10
