Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is this System.IO.Pipelines code much slower than Stream-based code?

I've written a little parsing program to compare the older System.IO.Stream and the newer System.IO.Pipelines in .NET Core. I'm expecting the pipelines code to be of equivalent speed or faster. However, it's about 40% slower.

The program is simple: it searches for a keyword in a 100Mb text file, and returns the line number of the keyword. Here is the Stream version:

public static async Task<int> GetLineNumberUsingStreamAsync(
    string file,
    string searchWord)
{
    using var fileStream = File.OpenRead(file);
    using var lines = new StreamReader(fileStream, bufferSize: 4096);

    int lineNumber = 1;
    // ReadLineAsync returns null on stream end, exiting the loop
    while (await lines.ReadLineAsync() is string line)
    {
        if (line.Contains(searchWord))
            return lineNumber;

        lineNumber++;
    }
    return -1;
}

I would expect the above stream code to be slower than the below pipelines code, because the stream code is encoding the bytes to a string in the StreamReader. The pipelines code avoids this by operating on bytes:

public static async Task<int> GetLineNumberUsingPipeAsync(string file, string searchWord)
{
    var searchBytes = Encoding.UTF8.GetBytes(searchWord);
    using var fileStream = File.OpenRead(file);
    var pipe = PipeReader.Create(fileStream, new StreamPipeReaderOptions(bufferSize: 4096));

    var lineNumber = 1;
    while (true)
    {
        var readResult = await pipe.ReadAsync().ConfigureAwait(false);
        var buffer = readResult.Buffer;

        if(TryFindBytesInBuffer(ref buffer, searchBytes, ref lineNumber))
        {
            return lineNumber;
        }

        pipe.AdvanceTo(buffer.End);

        if (readResult.IsCompleted) break;
    }

    await pipe.CompleteAsync();

    return -1;
}

Here are the associated helper methods:

/// <summary>
/// Look for `searchBytes` in `buffer`, incrementing the `lineNumber` every
/// time we find a new line.
/// </summary>
/// <returns>true if we found the searchBytes, false otherwise</returns>
static bool TryFindBytesInBuffer(
    ref ReadOnlySequence<byte> buffer,
    in ReadOnlySpan<byte> searchBytes,
    ref int lineNumber)
{
    var bufferReader = new SequenceReader<byte>(buffer);
    while (TryReadLine(ref bufferReader, out var line))
    {
        if (ContainsBytes(ref line, searchBytes))
            return true;

        lineNumber++;
    }
    return false;
}

static bool TryReadLine(
    ref SequenceReader<byte> bufferReader,
    out ReadOnlySequence<byte> line)
{
    var foundNewLine = bufferReader.TryReadTo(out line, (byte)'\n', advancePastDelimiter: true);
    if (!foundNewLine)
    {
        line = default;
        return false;
    }

    return true;
}

static bool ContainsBytes(
    ref ReadOnlySequence<byte> line,
    in ReadOnlySpan<byte> searchBytes)
{
    return new SequenceReader<byte>(line).TryReadTo(out var _, searchBytes);
}

I'm using SequenceReader<byte> above because my understanding is that it's more intelligent/faster than ReadOnlySequence<byte>; it has a fast path for when it can operate on a single Span<byte>.

Here are the benchmark results (.NET Core 3.1). Full code and BenchmarkDotNet results are available in this repo.

  • GetLineNumberWithStreamAsync - 435.6 ms while allocating 366.19 MB
  • GetLineNumberUsingPipeAsync - 619.8 ms while allocating 9.28 MB

Am I doing something wrong in the pipelines code?

Update: Evk has answered the question. After applying his fix, here are the new benchmark numbers:

  • GetLineNumberWithStreamAsync - 452.2 ms while allocating 366.19 MB
  • GetLineNumberWithPipeAsync - 203.8 ms while allocated 9.28 MB
like image 857
Will Avatar asked Oct 09 '20 16:10

Will


People also ask

What is System IO pipelines?

System. IO. Pipelines is a library that is designed to make it easier to do high-performance I/O in . NET. It's a library targeting .

What is .NET pipeline?

The ASP.NET Core request processing pipeline consists of a sequence of middleware components that are going to be called one after the other. Each middleware component can perform some operations before and after invoking the next component using the next method.

What is a pipeline C#?

Pipeline design pattern is used when we have to perform multiple tasks in sequence on data/item. An item flows through pipeline, going through multiple stages where each stage performs some task on it.


2 Answers

I believe the reason is implementaiton of SequenceReader.TryReadTo. Here is the source code of this method. It uses pretty straightforward algorithm (read to the match of first byte, then check if all subsequent bytes after that match, if not - advance 1 byte forward and repeat), and note how there are quite some methods in this implementation called "slow" (IsNextSlow, TryReadToSlow and so on), so under at least certain circumstances and in certain cases it falls back to some slow path. It also has to deal with the fact sequence might contain multiple segments, and with maintaining the position.

In your case you can avoid using SequenceReader specifically for searching the match (but leave it for actually reading lines), for example with this minor changes (this overload of TryReadTo is also more efficient in this case):

private static bool TryReadLine(ref SequenceReader<byte> bufferReader, out ReadOnlySpan<byte> line) {
    // note that both `match` and `line` are now `ReadOnlySpan` and not `ReadOnlySequence`
    var foundNewLine = bufferReader.TryReadTo(out ReadOnlySpan<byte> match, (byte) '\n', advancePastDelimiter: true);

    if (!foundNewLine) {
        line = default;
        return false;
    }

    line = match;
    return true;
}

Then:

private static bool ContainsBytes(ref ReadOnlySpan<byte> line, in ReadOnlySpan<byte> searchBytes) {
    // line is now `ReadOnlySpan` so we can use efficient `IndexOf` method
    return line.IndexOf(searchBytes) >= 0;
}

This will make your pipes code run faster than streams one.

like image 157
Evk Avatar answered Sep 19 '22 11:09

Evk


This is perhaps not exactly the explanation you look for but I hope it give some insight:

Having a glance over the two approaches you have there, it shows in the 2nd solution is computationally more complex than the other, by having two nested loops.

Digging deeper using code profiling showcases that the 2nd one (GetLineNumberUsingPipeAsync) is almost 21.5 % more CPU intensive than the one uses the Stream (please check the screenshots, ) And it is close enough to the benchmark result I got:

  • Solution#1: 683.7 ms, 365.84 MB

  • Solution#2: 777.5 ms, 9.08 MB

enter image description here

enter image description here

enter image description here

like image 24
Amin Mirzapour Avatar answered Sep 22 '22 11:09

Amin Mirzapour