Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is my ReadOnlySpan<char> is slower than my string?

Tags:

c#

.net-core

I'm playing with ReadOnlySpan and I'd like to see by myself that it's way faster than using a string but... so far, it's not. I know that I probably made a mistake in my code, but I can't find it.

static int CountCharacterWithoutSpan(string originalString, string sequence)
{
    int count = 0;

    for (int i = 0, length = originalString.Length - sequence.Length; i < length; ++i)
    {
        if (originalString.Substring(i, sequence.Length).Equals(sequence))
        {
            count++;
        }
    }

    return count;
}
        
static int CountCharacterWithSpan(ReadOnlySpan<char> originalString, string sequence)
{
    int count = 0;

    for (int i = 0, length = originalString.Length - sequence.Length; i < length; ++i)
    {
        if (originalString.Slice(i, sequence.Length).SequenceEqual(sequence))
        {
            count++;
        }
    }

    return count;
}

So basically, the goal of this code is to be able to find a string inside another one. The differences between the two is that I use Slice instead of Substring and SequenceEqual instead of Equals. However, when I run and monitor this code using a Stopwatch, CountCharacterWithSpan always take 2 to 3 times more times than CountCharacterWithoutSpan (the string test is about 80K characters).

I think the issue comes from SequenceEquals but that's the only way I found to compare the sliced ReadOnlySpan and a regular string (Equals does not work and == is faster but compare the reference, so the result is not correct)

like image 374
ssougnez Avatar asked Oct 28 '22 10:10

ssougnez


1 Answers

Contrary to what you've said in your question, the span-based version is in fact much faster than the non-span-based one.

As per the suggestion from morten-mertner in the comments, I made a slightly modified version of your second method:

public static int CountCharacterWithSpan(
    ReadOnlySpan<char> originalString, ReadOnlySpan<char> sequence)
{
    int count = 0;

    for (int i = 0, length = originalString.Length - sequence.Length; i < length; ++i)
    {
        if (originalString.Slice(i, sequence.Length).SequenceEqual(sequence))
        {
            count++;
        }
    }

    return count;
}

But as we'll see, this makes no difference. It's about as fast as your original span-based one, and both are much faster than your non-span-based one.

Here's what BenchmarkDotNet report for all three, using an 80K character originalString, and a 20 character sequence running on .NET Core 2.2, with three variations on each. In the 'Random' variations, the sequence is just random text, so it's possible to detect very early on that there's no match. In the 'Match' variations, the sequence is a substring that does actually exist somewhere in the text, but the input is still random, so most of the searches terminate very quickly, but one will be slow. And in the 'MatchAll' cases, the originalString and sequence are all just the same character over and over again, meaning that every comparison will succeed, which means the most amount of comparison work possible. (It will need to compare every character over and over.)

|                      Method |       Mean |      Error |     StdDev |
|---------------------------- |-----------:|-----------:|-----------:|
|   OriginalWithoutSpanRandom | 1,087.1 us | 11.4152 us | 10.6778 us |
|    OriginalWithoutSpanMatch | 1,098.8 us | 26.0405 us | 23.0842 us |
| OriginalWithoutSpanMatchAll | 1,164.3 us | 15.8291 us | 14.8066 us |
|      OriginalWithSpanRandom |   188.8 us |  1.3194 us |  1.2341 us |
|       OriginalWithSpanMatch |   188.3 us |  0.6132 us |  0.5736 us |
|    OriginalWithSpanMatchAll |   224.3 us |  3.0027 us |  2.8087 us |
|      ModifiedWithSpanRandom |   189.0 us |  0.9979 us |  0.9334 us |
|       ModifiedWithSpanMatch |   189.5 us |  1.1694 us |  1.0367 us |
|    ModifiedWithSpanMatchAll |   223.2 us |  1.3251 us |  1.2395 us |

Here are the results changing sequence to be 200 characters:

|                      Method |       Mean |     Error |    StdDev |
|---------------------------- |-----------:|----------:|----------:|
|   OriginalWithoutSpanRandom | 2,432.2 us | 35.777 us | 31.715 us |
|    OriginalWithoutSpanMatch | 2,476.1 us | 42.809 us | 35.747 us |
| OriginalWithoutSpanMatchAll | 2,815.6 us | 22.508 us | 19.953 us |
|      OriginalWithSpanRandom |   190.2 us |  1.531 us |  1.432 us |
|       OriginalWithSpanMatch |   189.8 us |  1.937 us |  1.717 us |
|    OriginalWithSpanMatchAll |   602.3 us |  4.662 us |  4.361 us |
|      ModifiedWithSpanRandom |   190.1 us |  2.200 us |  2.058 us |
|       ModifiedWithSpanMatch |   191.1 us |  2.860 us |  2.675 us |
|    ModifiedWithSpanMatchAll |   599.9 us |  3.696 us |  3.457 us |

And here's how it looks if we change sequence to be 2000 characters:

|                      Method |        Mean |      Error |     StdDev |
|---------------------------- |------------:|-----------:|-----------:|
|   OriginalWithoutSpanRandom | 16,819.9 us | 310.576 us | 290.513 us |
|    OriginalWithoutSpanMatch | 17,148.8 us | 231.140 us | 216.209 us |
| OriginalWithoutSpanMatchAll | 21,817.9 us | 246.378 us | 218.408 us |
|      OriginalWithSpanRandom |    184.2 us |   1.633 us |   1.528 us |
|       OriginalWithSpanMatch |    185.3 us |   1.440 us |   1.347 us |
|    OriginalWithSpanMatchAll |  4,649.7 us |  22.810 us |  20.221 us |
|      ModifiedWithSpanRandom |    185.2 us |   1.198 us |   1.120 us |
|       ModifiedWithSpanMatch |    186.7 us |   2.158 us |   2.019 us |
|    ModifiedWithSpanMatchAll |  4,651.1 us |  25.013 us |  22.173 us |

As you can see I've been unable to reproduce the result you describe in which "CountCharacterWithSpan always take 2 to 3 times more times than CountCharacterWithoutSpan". In these tests, the CountCharacterWithoutSpan is consistently very much slower than either of the ReadOnlySpan<char>-based versions. (There difference between those two is too small to measure though.)

With both of the span-based methods, the amount of work being done in each comparison is significant: you can see substantial differences between the tests where most of the string comparisons can bail out after a character or two and the ones where it has to compare every single character. (There's no meaningful difference between the Random and Match examples though - it seems that the cost difference of having all the comparisons bail out early, and having one bail out early are tiny. That's not surprising, since we're basically looking at one comparison out of 80,000 being expensive and the rest being cheap.

The thing that is absolutely clear here is that the non-span-based version is expensive. It's the call to Substring that kills it. It's particularly bad in the tests where most comparisons fail almost immediately: you allocated a 2,000 character copy of some substring of originalString, and then only look at maybe a handful of characters.

Notice that in the case where we're able to bail early, the performance of the span-based versions is pretty much independent of the length of sequence - roughly 190us in all cases. This is what you'd hope—in cases where we can determine that there's no match very early on it shouldn't really matter how long sequence is, but in your non-span-based version, the length of sequence matters a great deal even in these cases.

How many measurements are you taking in your tests? I'm wondering if perhaps you're just measuring a single run, in which case you're not really measuring how long the code takes to run: you're mostly measuring how long it takes the JIT compiler to compile it.

like image 94
Ian Griffiths Avatar answered Nov 22 '22 23:11

Ian Griffiths