File Streaming in Delphi - Optimum Buffer Size

I'm running Delphi RAD Studio XE2. I've been playing around with file streams recently and found some interesting results that have led me to this question.

What's the optimum buffer size for a TStreamReader in Delphi? For example, I'm loading a 1 GB file of 2 million rows in the form doubleTABdoubleTABdouble. If I load it into a TStringList using the following code, I get dramatically varying results for different buffer sizes. By results I mean processing speed and RAM usage.

reader := TStreamReader.Create(fileLocation, TEncoding.UTF8, True, NumBytes);
try
  stringList.BeginUpdate;
  try
    stringList.Clear;
    // Read line by line until the stream is exhausted
    while not reader.EndOfStream do
      stringList.Add(reader.ReadLine);
  finally
    stringList.EndUpdate;
  end;
finally
  reader.Free;
end;

The optimum buffer size seems to be between 1024 and 4096. If it's set less than 1024 it seems to slow down linearly and seems to use more RAM. If it's set above 4096 it seems to slow down exponentially.

Why am I seeing these behaviours and how do I determine the optimum buffer size for the task? Additionally, what's the maximum buffer size?

Edit

I ran the following code to extract the run times using the aforementioned file size:

startTime := Now();
myStringList := TStringList.Create;
try
  myStreamReader := TStreamReader.Create(fileLocation, TEncoding.UTF8, True, numBytes);
  try
    myStringList.BeginUpdate;
    try
      myStringList.Clear;
      while not myStreamReader.EndOfStream do
        myStringList.Add(myStreamReader.ReadLine);
    finally
      myStringList.EndUpdate;
    end;
  finally
    myStreamReader.Free;
  end;
  // Elapsed time as a TDateTime (fraction of a day), taken before the list is freed
  processTime := Now() - startTime;
finally
  myStringList.Free;
end;

Example run times were extracted as:

Buffer Size 32. Done in 69s
Buffer Size 64. Done in 69s
Buffer Size 96. Done in 69s
Buffer Size 128. Done in 70s
Buffer Size 160. Done in 60s
Buffer Size 192. Done in 57s
Buffer Size 224. Done in 52s
Buffer Size 256. Done in 50s
Buffer Size 512. Done in 44s
Buffer Size 768. Done in 40s
Buffer Size 1024. Done in 39s
Buffer Size 1280. Done in 41s
Buffer Size 1536. Done in 44s
Buffer Size 1792. Done in 40s
Buffer Size 2048. Done in 39s
Buffer Size 2304. Done in 41s
Buffer Size 2560. Done in 41s
Buffer Size 2816. Done in 42s
Buffer Size 3072. Done in 43s
Buffer Size 3328. Done in 43s
Buffer Size 3584. Done in 45s
Buffer Size 3840. Done in 44s
Buffer Size 4096. Done in 45s
Buffer Size 4352. Done in 47s
Buffer Size 4608. Done in 46s
Buffer Size 4864. Done in 46s
Buffer Size 5120. Done in 48s
Buffer Size 5376. Done in 49s
Buffer Size 5632. Done in 51s
Buffer Size 5888. Done in 51s
Buffer Size 6144. Done in 52s
Buffer Size 6400. Done in 54s
Buffer Size 6656. Done in 53s
Buffer Size 6912. Done in 55s
Buffer Size 7168. Done in 55s
Buffer Size 7424. Done in 56s
Buffer Size 7680. Done in 57s
Buffer Size 7936. Done in 65s
Buffer Size 8192. Done in 62s
Buffer Size 8448. Done in 63s
Buffer Size 8704. Done in 64s
Buffer Size 8960. Done in 64s
Buffer Size 9216. Done in 66s
Buffer Size 9472. Done in 66s
Buffer Size 9728. Done in 68s
Buffer Size 9984. Done in 68s
Buffer Size 10240. Done in 69s
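
A sweep like the one above could be scripted roughly as follows. This is only a minimal sketch, not the exact harness used for the numbers listed: the names TimeLoad and Sizes are hypothetical, the file path is assumed to arrive as the first command-line argument, and TStopwatch (from System.Diagnostics) stands in for the Now() arithmetic.

```delphi
program BufferSizeBenchmark;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, System.Diagnostics;

// Hypothetical helper: loads FileName line by line through a TStreamReader
// with the given buffer size and returns the elapsed time in milliseconds.
function TimeLoad(const FileName: string; BufferSize: Integer): Int64;
var
  Reader: TStreamReader;
  Lines: TStringList;
  Watch: TStopwatch;
begin
  Watch := TStopwatch.StartNew;
  Lines := TStringList.Create;
  try
    Reader := TStreamReader.Create(FileName, TEncoding.UTF8, True, BufferSize);
    try
      Lines.BeginUpdate;
      try
        while not Reader.EndOfStream do
          Lines.Add(Reader.ReadLine);
      finally
        Lines.EndUpdate;
      end;
    finally
      Reader.Free;
    end;
    // Stop timing before the list is freed, as in the code above
    Result := Watch.ElapsedMilliseconds;
  finally
    Lines.Free;
  end;
end;

const
  // Illustrative buffer sizes only; adjust to the range being studied
  Sizes: array[0..4] of Integer = (256, 1024, 4096, 16384, 131072);
var
  Size: Integer;
begin
  // Assumption: the file to load is passed as the first argument
  for Size in Sizes do
    Writeln(Format('Buffer Size %d. Done in %d ms',
      [Size, TimeLoad(ParamStr(1), Size)]));
end.
```

Each run after the first may benefit from the operating system's file cache, so repeating the sweep or shuffling the order of sizes would give a fairer comparison.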

As for RAM usage, buffer sizes below 256 resulted in a total use of around 5 GB of RAM, and buffer sizes above 1024 in a total use of around 3.5 GB. For RAM usage with 2 KB, 4 KB and 8 KB buffers, please refer to:

this image

Trojanian asked Nov 09 '22 22:11


1 Answer

@Trojanian, the code you mention above is similar to Remy Lebeau's answer in your previous post, TStringList.LoadFromFile - Exceptions with Large Text Files. I too experimented with Remy's example. It could load larger files, but for smaller files it ran at roughly half the speed of TStrings.LoadFromFile. My own attempts to tune the buffer size didn't improve performance.

Then I found the following code example, Alternative to TStrings.LoadFromFile or TStringList.LoadFromFile. It uses a 128 KB buffer and halved the load time of my large files compared to TStrings.LoadFromFile, i.e. it is about four times faster than your code above when I use XE3.

Lars answered Nov 15 '22 05:11