 

Why is this file copying method slowing down?

Tags:

c#

file-io

I'm using the code below to copy a file from one location to another while generating a checksum on the fly. For small files the code works properly, but for big files, for example a 3.8 GB file, it behaves strangely: after about 1 GB has been copied it suddenly slows down, and then keeps slowing down more and more (for example, before the 1 GB mark I observed about 2%-4% of the file being copied per second; once the 1 GB mark is reached it takes about 4-6 seconds per % of the file).

int bytesRead = 0;
int bytesInWriteBuffer = 0;
byte[] readBuffer = new byte[1638400];
byte[] writeBuffer = new byte[4915200];
MD5 md5Handler = new MD5CryptoServiceProvider();
using (FileStream sourceStream = File.Open(filePath, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
using (FileStream destinationStream = File.Create(storageFileName))
{
    while ((bytesRead = sourceStream.Read(readBuffer, 0, readBuffer.Length)) > 0)
    {
        // hash the chunk that was just read
        md5Handler.TransformBlock(readBuffer, 0, bytesRead, null, 0);
        // collect chunks in the larger write buffer before hitting the disk
        Buffer.BlockCopy(readBuffer, 0, writeBuffer, bytesInWriteBuffer, bytesRead);
        bytesInWriteBuffer += bytesRead;
        if (bytesInWriteBuffer >= 4915200)
        {
            destinationStream.Write(writeBuffer, 0, bytesInWriteBuffer);
            bytesInWriteBuffer = 0;
            Thread.Sleep(50);
        }
    }
    // write whatever is still buffered and finalize the checksum
    if (bytesInWriteBuffer > 0)
        destinationStream.Write(writeBuffer, 0, bytesInWriteBuffer);
    md5Handler.TransformFinalBlock(readBuffer, 0, 0);
}

As was asked in a comment: there is no observable memory leak. Memory usage increases at the start of the method and then stays stable (total memory usage on the PC while the method is running, including all other applications, is 56%). The PC has 8 GB of memory in total.

The application itself is 32-bit (it takes up around 300 MB of memory) and targets .NET Framework 4.5.

As an update after testing something a comment suggested: when I start a copy, cancel it via a token and delete the file (all after the slowdown has started), and then immediately begin a second copy, the second one is as slow as the first one was at the time I cancelled it (so the slowdown starts before 1 GB there). BUT when I start the second copy only after the deletion has finished, it starts at normal speed and only slows down at 1 GB.

Also, flushing the destination stream makes no difference there.
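(For anyone reproducing this, it may be worth distinguishing the two flush overloads, since they do different things; a minimal sketch, reusing the destinationStream from the code above:)

    // Flush() only drains the FileStream's internal buffer into the OS file cache.
    destinationStream.Flush();

    // Flush(true), available since .NET 4.0, additionally asks the OS to write
    // its cached pages for this file to the physical disk.
    destinationStream.Flush(true);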

As for the slowdown: the copy runs at about 84 MB per second at first and drops to about 14 MB per second at 1 GB.

As part of this question (not sure if this is better as a comment): is it possible that this is not a C#-related problem at all but "solely" a problem of the OS's caching mechanisms? (And if so, can something be done there?)
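(If the OS write cache is the suspect, one way to take it out of the picture from C# is to open the destination with FileOptions.WriteThrough, which tells Windows to write through its cache for that handle. A minimal sketch as a test idea, not a confirmed fix; storageFileName and the buffer size are taken from the code above:)

    // Open the destination so writes bypass the Windows write cache and go
    // straight to the device. If the slowdown pattern changes, the OS cache
    // is implicated.
    FileStream destinationStream = new FileStream(
        storageFileName,
        FileMode.Create,
        FileAccess.Write,
        FileShare.None,
        4915200,                    // match the write buffer size used above
        FileOptions.WriteThrough);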

As suggested, I looked at the OS write cache and also ran a performance monitor. Results:

  • Different source hard drives and source desktops give the same result, and the slowdown occurs at the same moment
  • Write cache in the OS (destination) is disabled
  • Performance monitoring on the server where the destination lies shows nothing significant (the write queue length reaches 4 once and 2 once; write time/idle time and writes/second show nothing that suggests 100% usage of a cache or anything else)

Further tests showed the following behaviour:

  • If the copying itself is slowed down with a 200 millisecond Thread.Sleep after each write, the average copy rate is a constant 30 MB/s
  • If I instead put in a 5 second delay (Thread.Sleep) after every 500 MB or 800 MB transferred, the slowdown occurs again and the waiting changes nothing at all
  • If I change the locations so that both source and destination are on my local hard drive (normally the destination is a network folder), the rate is a constant 50 MB/s; the read time is then at 100% and is the bottleneck, while the write time stays well below 100%
  • Network transfer monitoring showed nothing unexpected
  • Windows Explorer achieves a transfer rate of 11 MB/s when copying a 3 GB file from the same source to the same destination (so despite the slowdown, the C# copying method is in total still faster than Windows Explorer)

Further behaviour:

  • According to monitoring, there was a constant stream to the destination drive (i.e. not a fast first part followed by a slowdown; the destination received the bytes at the same constant speed throughout)

To add to this: the total performance for a 3 GB file is about 37 MB/s (84 MB/s for the first GB and 14 MB/s for the rest).

Thomas asked Aug 14 '14


1 Answer

Just a guess, but I feel it's worth a try: it may be related to the file system's space allocation algorithm. At first it cannot predict the final size of the file, so it allocates some space, but after a while (1 GB in your case) the file reaches that bound. The file system then probably tries to move the neighbouring file to keep the storage contiguous. Check this out: https://superuser.com/a/274867/301925

To make sure, I'd suggest you create the file with its final size up front, as in the following code, and log the time elapsed in every step. (I don't have an environment to try it out, so correct it if it contains syntax errors.)

int bytesRead = 0;
int bytesInWriteBuffer = 0;
byte[] readBuffer = new byte[1638400];
byte[] writeBuffer = new byte[4915200];
//MD5 md5Handler = new MD5CryptoServiceProvider(); exclude for now
Stopwatch stopwatch = new Stopwatch(); // needs System.Diagnostics
long fileSize = new FileInfo(filePath).Length;
using (FileStream sourceStream = File.Open(filePath, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
{
    //md5Handler.TransformBlock(readBuffer, 0, bytesRead, null, 0); exclude it for now
    stopwatch.Start();
    FileStream destinationStream = File.Create(storageFileName);
    stopwatch.Stop();
    Console.WriteLine("Create destination stream: " + stopwatch.ElapsedMilliseconds);

    stopwatch.Restart();
    // trick to give the file its final size up front
    destinationStream.Seek(fileSize - 1, SeekOrigin.Begin);
    destinationStream.WriteByte(0);
    destinationStream.Flush();
    destinationStream.Seek(0, SeekOrigin.Begin);
    stopwatch.Stop();
    Console.WriteLine("Set initial size of destination stream: " + stopwatch.ElapsedMilliseconds);

    while (true)
    {
        stopwatch.Restart();
        bytesRead = sourceStream.Read(readBuffer, 0, readBuffer.Length);
        stopwatch.Stop();
        Console.WriteLine("Read " + bytesRead + " bytes: " + stopwatch.ElapsedMilliseconds);

        if (bytesRead <= 0)
            break;
        Buffer.BlockCopy(readBuffer, 0, writeBuffer, bytesInWriteBuffer, bytesRead);
        bytesInWriteBuffer += bytesRead;
        if (bytesInWriteBuffer >= 4915200)
        {
            stopwatch.Restart();
            destinationStream.Write(writeBuffer, 0, bytesInWriteBuffer);
            stopwatch.Stop();
            Console.WriteLine("Write " + bytesInWriteBuffer + " bytes: " + stopwatch.ElapsedMilliseconds);

            bytesInWriteBuffer = 0;
            //Thread.Sleep(50); exclude it for now
        }
    }

    // write the remainder that never filled the buffer, then release the file handle
    if (bytesInWriteBuffer > 0)
        destinationStream.Write(writeBuffer, 0, bytesInWriteBuffer);
    destinationStream.Dispose();
}
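(As a side note, not part of the answer above: if preallocation turns out to matter, FileStream.SetLength is a simpler way to reserve the space than the seek-and-write-a-byte trick:)

    FileStream destinationStream = File.Create(storageFileName);
    // Reserve the full file size in one call instead of seeking to the
    // last byte and writing it.
    destinationStream.SetLength(fileSize);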
Feyyaz answered Oct 05 '22