
FileStream.ReadAsync very slow compared to Read()

I have the following code that loops through a file and reads 1024 bytes at a time. The first loop uses FileStream.Read() and the second loop uses FileStream.ReadAsync().

private async void Button_Click(object sender, RoutedEventArgs e)
{
    await Task.Run(() => Test()).ConfigureAwait(false);
}

private async Task Test()
{
    Stopwatch sw = new Stopwatch();
    sw.Start();

    int readSize;
    int blockSize = 1024;
    byte[] data = new byte[blockSize];

    string theFile = @"C:\test.mp4";
    long totalRead = 0;

    using (FileStream fs = new FileStream(theFile, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
    {

        readSize = fs.Read(data, 0, blockSize);

        while (readSize > 0)
        {
            totalRead += readSize;
            readSize = fs.Read(data, 0, blockSize);
        }
    }

    sw.Stop();
    Console.WriteLine($"Read() Took {sw.ElapsedMilliseconds}ms and totalRead: {totalRead}");

    sw.Reset();
    sw.Start();
    totalRead = 0;
    using (FileStream fs = new FileStream(theFile, FileMode.Open, FileAccess.Read, FileShare.ReadWrite, (blockSize*2), FileOptions.Asynchronous | FileOptions.SequentialScan))
    {
        readSize = await fs.ReadAsync(data, 0, blockSize).ConfigureAwait(false);

        while (readSize > 0)
        {
            totalRead += readSize;
            readSize = await fs.ReadAsync(data, 0, blockSize).ConfigureAwait(false);
        }
    }

    sw.Stop();
    Console.WriteLine($"ReadAsync() Took {sw.ElapsedMilliseconds}ms and totalRead: {totalRead}");
}

And the result:

Read() Took 162ms and totalRead: 162835040
ReadAsync() Took 15597ms and totalRead: 162835040

ReadAsync() is about 100 times slower. Am I missing anything? The only thing I can think of is the overhead of creating and destroying a task for each ReadAsync() call, but is that overhead really so large?

UPDATE:

I've changed the above code to reflect the suggestion by @Cory. There is a slight improvement:

Read() Took 142ms and totalRead: 162835040 
ReadAsync() Took 12288ms and totalRead: 162835040

When I increase the read block size to 1MB as suggested by @Alexandru, the results are much more acceptable:

Read() Took 32ms and totalRead: 162835040
ReadAsync() Took 76ms and totalRead: 162835040

So this hints that it is indeed the overhead of the large number of tasks that causes the slowness. But if creating and destroying a task takes a mere 100µs, things still don't really add up for the slowness with a small block size.
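
For a rough sense of scale, the measured times can be divided by the number of calls each loop makes. The sketch below does only that arithmetic on the figures reported above; the class and method names (ReadOverhead, ReadCount, MicrosPerRead) are made up for illustration and are not part of the original code.

```csharp
using System;

static class ReadOverhead
{
    // Number of Read/ReadAsync calls that return data for a file of the
    // given size (the final call that returns 0 is not counted).
    public static long ReadCount(long fileBytes, int blockSize) =>
        (fileBytes + blockSize - 1) / blockSize;

    // Average wall-clock cost of one call, in microseconds.
    public static double MicrosPerRead(long elapsedMs, long reads) =>
        elapsedMs * 1000.0 / reads;

    static void Main()
    {
        const long fileBytes = 162835040; // totalRead reported above

        long smallReads = ReadCount(fileBytes, 1024);        // 1 KB blocks
        long largeReads = ReadCount(fileBytes, 1024 * 1024); // 1 MB blocks

        // Timings are the ones from the UPDATE section.
        Console.WriteLine($"1 KB blocks: {smallReads} reads");
        Console.WriteLine($"  Read():      {MicrosPerRead(142, smallReads):F1} us per call");
        Console.WriteLine($"  ReadAsync(): {MicrosPerRead(12288, smallReads):F1} us per call");

        Console.WriteLine($"1 MB blocks: {largeReads} reads");
        Console.WriteLine($"  Read():      {MicrosPerRead(32, largeReads):F1} us per call");
        Console.WriteLine($"  ReadAsync(): {MicrosPerRead(76, largeReads):F1} us per call");
    }
}
```

With 1 KB blocks the file takes about 159,000 calls, so the 12288ms run averages roughly 77µs per ReadAsync() call. With 1 MB blocks there are only 156 calls, so the same kind of per-call cost barely shows up in the total.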

SimonSays asked Sep 06 '16 15:09


2 Answers

If you're doing async reads, stick with big buffers and make sure to open the FileStream in async mode (the useAsync constructor argument or FileOptions.Asynchronous), and you should be okay. Every ReadAsync() you await that doesn't complete synchronously hands control back to the caller and later schedules a continuation to resume your method. In a UI app that continuation is posted back to the UI thread by default, so it has to wait until the UI thread is free and it competes with every other message and continuation queued there. That per-call overhead adds up when you make a large number of calls: imagine constructing a new thread and waiting for it to finish about 100K times. To reduce the number of calls, read in larger increments by increasing the buffer size. Make sure to test this code in Release mode, without the debugger attached, so that all compiler optimizations are enabled and the debugger doesn't slow things down:

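using System;
using System.Diagnostics;
using System.IO;
using System.Threading.Tasks;

class Program
{
    static void Main(string[] args)
    {
        // Block until both timings have been printed, then wait for a key press.
        DoStuff().GetAwaiter().GetResult();
        Console.ReadLine();
    }

    // Returns Task rather than void so that exceptions surface in Main.
    public static async Task DoStuff()
    {
        var filename = @"C:\Example.txt";

        // Time the synchronous read.
        var sw = new Stopwatch();
        sw.Start();
        ReadAllFile(filename);
        sw.Stop();

        Console.WriteLine("Sync: " + sw.Elapsed);

        // Time the asynchronous read of the same file.
        sw.Restart();
        await ReadAllFileAsync(filename);
        sw.Stop();

        Console.WriteLine("Async: " + sw.Elapsed);
    }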

    static void ReadAllFile(string filename)
    {
        byte[] buffer = new byte[131072];
        using (var file = new FileStream(filename, FileMode.Open, FileAccess.Read, FileShare.Read, buffer.Length, false))
            while (true)
                if (file.Read(buffer, 0, buffer.Length) <= 0)
                    break;
    }

    static async Task ReadAllFileAsync(string filename)
    {
        byte[] buffer = new byte[131072];
        using (var file = new FileStream(filename, FileMode.Open, FileAccess.Read, FileShare.Read, buffer.Length, true))
            while (true)
                if ((await file.ReadAsync(buffer, 0, buffer.Length)) <= 0)
                    break;
    }
}

Results:

Sync: 00:00:00.3092809

Async: 00:00:00.5541262

Pretty negligible...the file is about 1 GB.

Let's say I go even bigger, a 1 MB buffer, AKA new byte[1048576] (come on man, everyone has 1 MB of RAM these days):

Sync: 00:00:00.2925763

Async: 00:00:00.3402034

Then it's just a few hundredths of a second difference. If you blink, you'll miss it.
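
To repeat this comparison across several buffer sizes on your own machine, something like the following sketch could be used. It is only an illustration under a few assumptions: the names (BlockSizeSweep, ReadSync, ReadAllAsync) are invented here, the file is a 16 MB scratch file created in the temp directory rather than the 1 GB file used above, and a freshly written file is likely served from the OS cache, so the numbers mostly reflect per-call overhead rather than disk speed.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Threading.Tasks;

static class BlockSizeSweep
{
    // Reads the whole file with blocking Read() calls; returns bytes read.
    public static long ReadSync(string path, int blockSize)
    {
        var buffer = new byte[blockSize];
        long total = 0;
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read,
                                       FileShare.Read, blockSize, false))
        {
            int n;
            while ((n = fs.Read(buffer, 0, buffer.Length)) > 0)
                total += n;
        }
        return total;
    }

    // Same loop with ReadAsync() on a stream opened in async mode.
    public static async Task<long> ReadAllAsync(string path, int blockSize)
    {
        var buffer = new byte[blockSize];
        long total = 0;
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read,
                                       FileShare.Read, blockSize, true))
        {
            int n;
            while ((n = await fs.ReadAsync(buffer, 0, buffer.Length).ConfigureAwait(false)) > 0)
                total += n;
        }
        return total;
    }

    static void Main()
    {
        string path = Path.GetTempFileName(); // scratch file, deleted below
        try
        {
            File.WriteAllBytes(path, new byte[16 * 1024 * 1024]);

            foreach (int blockSize in new[] { 1024, 128 * 1024, 1024 * 1024 })
            {
                var sw = Stopwatch.StartNew();
                long syncBytes = ReadSync(path, blockSize);
                TimeSpan syncTime = sw.Elapsed;

                sw.Restart();
                long asyncBytes = ReadAllAsync(path, blockSize).GetAwaiter().GetResult();
                TimeSpan asyncTime = sw.Elapsed;

                Console.WriteLine($"{blockSize,8} bytes: sync {syncTime} ({syncBytes} read), " +
                                  $"async {asyncTime} ({asyncBytes} read)");
            }
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

As with the code above, run it in Release mode without the debugger attached; the absolute times will differ from machine to machine.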

Alexandru answered Sep 29 '22 08:09


Your method signature suggests you're doing this from a WPF app. While the blocking code will tie up the UI thread for the duration, the async code is forced to go through the UI message queue every time an asynchronous operation completes, which slows it down and makes it compete with any UI messages. You should try moving it off the UI thread like so:

void Button_Click(object sender, RoutedEventArgs e)
{
    Task.Run(() => Button_Click_Impl());
}

async Task Button_Click_Impl()
{
    // put code here.
}

Next, open the file in async mode. If you don't do this, async is emulated and will go much slower:

new FileStream(theFile, FileMode.Open, FileAccess.Read, FileShare.ReadWrite, 4096,
               FileOptions.Asynchronous | FileOptions.SequentialScan)
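
One way to confirm which mode a stream was actually opened in is the FileStream.IsAsync property. The following is a minimal sketch, not the asker's code: the AsyncOpenCheck and CountBytesAsync names, the 64 KB block size, and the 1 MB scratch file in the temp directory are all assumptions made for the example.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

static class AsyncOpenCheck
{
    // Opens the file with the given options, reports IsAsync, then reads the
    // whole file with ReadAsync() and returns the number of bytes read.
    public static async Task<long> CountBytesAsync(string path, int blockSize, FileOptions options)
    {
        var buffer = new byte[blockSize];
        long total = 0;
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read,
                                       FileShare.ReadWrite, 4096, options))
        {
            Console.WriteLine($"options = {options}, IsAsync = {fs.IsAsync}");

            int n;
            while ((n = await fs.ReadAsync(buffer, 0, buffer.Length).ConfigureAwait(false)) > 0)
                total += n;
        }
        return total;
    }

    static void Main()
    {
        string path = Path.GetTempFileName(); // scratch file, deleted below
        try
        {
            File.WriteAllBytes(path, new byte[1024 * 1024]);

            // Without FileOptions.Asynchronous, ReadAsync() still works but
            // the stream is not in async mode.
            long plain = CountBytesAsync(path, 64 * 1024, FileOptions.None)
                .GetAwaiter().GetResult();

            // With the options suggested above.
            long async = CountBytesAsync(path, 64 * 1024,
                    FileOptions.Asynchronous | FileOptions.SequentialScan)
                .GetAwaiter().GetResult();

            Console.WriteLine($"bytes read: {plain} and {async}");
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Both calls read the same bytes; only the second stream is opened in async mode, which is what IsAsync is meant to reflect.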

Finally, you may also be able to gain a little performance by using ConfigureAwait(false), which avoids switching back to the captured context after every await:

readSize = await fs.ReadAsync(data, 0, 1024).ConfigureAwait(false);

Cory Nelson answered Sep 29 '22 07:09