
Why is str = str.Replace().Replace(); faster than str = str.Replace(); str = str.Replace()?

I was running a local test to compare the performance of Replace operations on String and StringBuilder in C#. For String, I was using the following code:

String str = "String to be tested. String to be tested. String to be tested.";
str = str.Replace("i", "in");
str = str.Replace("to", "ott");
str = str.Replace("St", "Tsr");
str = str.Replace(".", "\n");
str = str.Replace("be", "or be");
str = str.Replace("al", "xd");

Then, after noticing that String.Replace() was faster than StringBuilder.Replace(), I tested the following code against the one above:

String str = "String to be tested. String to be tested. String to be tested.";
str = str.Replace("i", "in").Replace("to", "ott").Replace("St", "Tsr").Replace(".", "\n").Replace("be", "or be").Replace("al", "xd");

This last version turned out to be around 10% to 15% faster. Any ideas why? Is assigning a value to the same variable that expensive?

diegobe. asked Dec 03 '22 19:12


2 Answers

I've made this benchmark:

namespace StringReplace
{
    using BenchmarkDotNet.Attributes;
    using BenchmarkDotNet.Running;

    public class Program
    {
        static void Main(string[] args)
        {
            BenchmarkRunner.Run<Program>();
        }

        private string str = "String to be tested. String to be tested. String to be tested.";

        [Benchmark]
        public string Test1()
        {
            var a = str;
            a = a.Replace("i", "in");
            a = a.Replace("to", "ott");
            a = a.Replace("St", "Tsr");
            a = a.Replace(".", "\n");
            a = a.Replace("be", "or be");
            a = a.Replace("al", "xd");

            return a;
        }

        [Benchmark]
        public string Test2()
        {
            var a = str;
            a = a.Replace("i", "in").Replace("to", "ott").Replace("St", "Tsr").Replace(".", "\n").Replace("be", "or be").Replace("al", "xd");

            return a;
        }
    }
}

Results:

BenchmarkDotNet=v0.10.0
OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Core(TM) i7-7700 CPU 3.60GHz, ProcessorCount=8
Frequency=3515629 Hz, Resolution=284.4441 ns, Timer=TSC
Host Runtime=Clr 4.0.30319.42000, Arch=32-bit RELEASE
GC=Concurrent Workstation
JitModules=clrjit-v4.7.2600.0
Job Runtime(s):
    Clr 4.0.30319.42000, Arch=32-bit RELEASE


 Method |      Mean |    StdDev |    Median |
------- |---------- |---------- |---------- |
  Test1 | 1.3768 us | 0.0354 us | 1.3704 us |
  Test2 | 1.3941 us | 0.0325 us | 1.3778 us |

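As you can see, the results are the same in Release mode: the means (1.3768 us vs. 1.3941 us) differ by less than one standard deviation. So I think there can be a small difference in Debug mode because of the extra assignments to the variable, but in Release mode the compiler can optimize them away.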
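As a sanity check that is independent of timing, here is a minimal sketch confirming that the two forms produce the same string, so any difference is purely in how the code is generated. The names ReplaceForms, Sequential and Chained are made up for this illustration and are not part of the benchmark above.

```csharp
using System;

public static class ReplaceForms
{
    private const string Input =
        "String to be tested. String to be tested. String to be tested.";

    // One statement per Replace call, reassigning the same local each time.
    public static string Sequential()
    {
        string a = Input;
        a = a.Replace("i", "in");
        a = a.Replace("to", "ott");
        a = a.Replace("St", "Tsr");
        a = a.Replace(".", "\n");
        a = a.Replace("be", "or be");
        a = a.Replace("al", "xd");
        return a;
    }

    // The same six calls chained in a single expression.
    public static string Chained()
    {
        return Input.Replace("i", "in").Replace("to", "ott").Replace("St", "Tsr")
                    .Replace(".", "\n").Replace("be", "or be").Replace("al", "xd");
    }

    public static void Main()
    {
        // Both forms apply the same replacements in the same order.
        Console.WriteLine(string.Equals(Sequential(), Chained(), StringComparison.Ordinal)); // prints "True"
    }
}
```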

Backs answered Dec 31 '22 14:12


Short Answer

It looks like you're compiling in a Debug configuration. Because the compiler needs to ensure each statement of source code can have a breakpoint set on it, the excerpt that assigns to the local many times is less efficient.

If you compile in a Release configuration, which optimizes code generation at the expense of not letting you set breakpoints, both excerpts compile to the same intermediate code and thus should have the same performance.

Note that whether you compile in a Debug or Release configuration isn't necessarily related to whether you start the app from Visual Studio with a debugger (F5) or not (Ctrl + F5). For more details, see my answer here.
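To make the distinction concrete, here is a minimal sketch that reports the two things separately at run time. It assumes that a Debug-style build is marked with a DebuggableAttribute whose IsJITOptimizerDisabled property is true, which is what the default Debug configuration produces; the BuildInfo class name is made up for this illustration.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

public static class BuildInfo
{
    // True when the assembly was compiled with JIT optimizations disabled
    // (the default for a Debug configuration). Assumption: the build did not
    // override the usual Debug/Release compiler settings.
    public static bool IsJitOptimizerDisabled(Assembly assembly)
    {
        var attribute = assembly.GetCustomAttribute<DebuggableAttribute>();
        return attribute != null && attribute.IsJITOptimizerDisabled;
    }

    public static void Main()
    {
        Assembly assembly = typeof(BuildInfo).Assembly;

        // How the code was compiled...
        Console.WriteLine("JIT optimizer disabled: " + IsJitOptimizerDisabled(assembly));

        // ...versus how the process was started (F5 or Ctrl + F5).
        Console.WriteLine("Debugger attached: " + Debugger.IsAttached);
    }
}
```

Any combination of the two values is possible: a Release build started with F5 has a debugger attached, and a Debug build started with Ctrl + F5 does not.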

Long Answer

C# compiles down to .NET intermediate language (IL, also called MSIL or CIL). There's a tool that ships with the .NET SDK, the IL Disassembler, which can show us this intermediate language to better understand the difference. Note that the .NET runtime (VES) is a stack machine: instead of registers, IL operates on an "operand stack" (also called the evaluation stack) onto which values are pushed and from which they are popped. The details aren't too important for this question, but know that the evaluation stack is the place where temporary values are stored.
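If the disassembler isn't at hand, a rough comparison can also be made from inside the program with reflection. This is only a sketch: IlInspector, Sequential, Chained and Describe are hypothetical names, only two of the six Replace calls are included to keep it short, and the numbers it prints depend on the build configuration, so no expected output is shown.

```csharp
using System;
using System.Reflection;

public static class IlInspector
{
    public static string Sequential()
    {
        string str = "String to be tested.";
        str = str.Replace("i", "in");
        str = str.Replace("to", "ott");
        return str;
    }

    public static string Chained()
    {
        string str = "String to be tested.";
        str = str.Replace("i", "in").Replace("to", "ott");
        return str;
    }

    // Returns the IL size in bytes and the number of declared locals of a
    // public static method on this class.
    public static (int IlBytes, int Locals) Describe(string methodName)
    {
        MethodInfo method = typeof(IlInspector).GetMethod(methodName);
        MethodBody body = method.GetMethodBody();
        return (body.GetILAsByteArray().Length, body.LocalVariables.Count);
    }

    public static void Main()
    {
        foreach (string name in new[] { nameof(Sequential), nameof(Chained) })
        {
            var (ilBytes, locals) = Describe(name);
            Console.WriteLine(name + ": " + ilBytes + " bytes of IL, " + locals + " local(s)");
        }
    }
}
```

Running this once from a Debug build and once from a Release build gives a quick way to see whether the two methods were compiled to bodies of the same size.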

Disassembling the first excerpt, which I compiled without setting the "optimize code" option (i.e., I compiled using a Debug configuration), shows code like this:

  .locals init ([0] string str)
  IL_0000:  nop
  IL_0001:  ldstr      "String to be tested. String to be tested. String t" + "o be tested."
  IL_0006:  stloc.0
  IL_0007:  ldloc.0
  IL_0008:  ldstr      "i"
  IL_000d:  ldstr      "in"
  IL_0012:  callvirt   instance string [mscorlib]System.String::Replace(string, string)
  IL_0017:  stloc.0
  IL_0018:  ldloc.0
  IL_0019:  ldstr      "to"
  IL_001e:  ldstr      "ott"
  IL_0023:  callvirt   instance string [mscorlib]System.String::Replace(string, string)

The method has one local variable, str. In brief, the excerpt:

  1. Creates the "String to be tested..." string on the evaluation stack (ldstr).
  2. Stores the string into the local (stloc.0), resulting in an empty evaluation stack.
  3. Loads that value back onto the stack from the local (ldloc.0).
  4. Calls Replace on the loaded value with two other strings, "i" and "in" (the two ldstr and the callvirt), resulting in an evaluation stack with only the resulting string.
  5. Stores the result back in the local (stloc.0), resulting in an empty evaluation stack.
  6. Loads that value back from the local (ldloc.0).
  7. Calls Replace on the loaded value with two other strings, "to" and "ott" (the two ldstr and the callvirt).

And so on and so forth.

Compare to the second excerpt, also compiled without "optimized code":

  .locals init ([0] string str)
  IL_0000:  nop
  IL_0001:  ldstr      "String to be tested. String to be tested. String t" + "o be tested."
  IL_0006:  stloc.0
  IL_0007:  ldloc.0
  IL_0008:  ldstr      "i"
  IL_000d:  ldstr      "in"
  IL_0012:  callvirt   instance string [mscorlib]System.String::Replace(string, string)
  IL_0017:  ldstr      "to"
  IL_001c:  ldstr      "ott"
  IL_0021:  callvirt   instance string [mscorlib]System.String::Replace(string, string)

After step 4, the evaluation stack has the result of the first Replace call on it. Because the C# code in this case doesn't assign this intermediate value to the str variable, the IL can avoid storing and re-loading the value, and just re-use the result that's already on the evaluation stack. This skips steps 5 and 6, leading to slightly more performant code.
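The effect of skipping steps 5 and 6 can be modeled with a toy evaluation stack. This is an illustration only, not how the runtime executes IL: the ToyStackMachine class is made up, the "replace" opcode stands in for the callvirt to String.Replace, and a shortened input string with only the first two Replace calls is used.

```csharp
using System;
using System.Collections.Generic;

public static class ToyStackMachine
{
    private const string Input = "String to be tested.";

    // Mirrors the first listing: store to and reload from local 0 between calls.
    public static readonly (string Op, string Arg)[] WithLocalRoundTrip =
    {
        ("ldstr", Input), ("stloc.0", null), ("ldloc.0", null),
        ("ldstr", "i"), ("ldstr", "in"), ("replace", null),
        ("stloc.0", null), ("ldloc.0", null),
        ("ldstr", "to"), ("ldstr", "ott"), ("replace", null),
    };

    // Mirrors the second listing: the intermediate result stays on the stack.
    public static readonly (string Op, string Arg)[] Chained =
    {
        ("ldstr", Input), ("stloc.0", null), ("ldloc.0", null),
        ("ldstr", "i"), ("ldstr", "in"), ("replace", null),
        ("ldstr", "to"), ("ldstr", "ott"), ("replace", null),
    };

    // Executes the program and reports how many instructions were run.
    public static string Run((string Op, string Arg)[] program, out int executed)
    {
        var stack = new Stack<string>();
        string local0 = null;
        executed = 0;
        foreach (var (op, arg) in program)
        {
            executed++;
            switch (op)
            {
                case "ldstr": stack.Push(arg); break;
                case "stloc.0": local0 = stack.Pop(); break;
                case "ldloc.0": stack.Push(local0); break;
                case "replace":
                    // Arguments come off the stack in reverse push order.
                    string newValue = stack.Pop();
                    string oldValue = stack.Pop();
                    string target = stack.Pop();
                    stack.Push(target.Replace(oldValue, newValue));
                    break;
                default:
                    throw new InvalidOperationException("Unknown opcode: " + op);
            }
        }
        return stack.Pop();
    }

    public static void Main()
    {
        string a = Run(WithLocalRoundTrip, out int countA);
        string b = Run(Chained, out int countB);
        // prints "11 vs 9 instructions, same result: True"
        Console.WriteLine(countA + " vs " + countB + " instructions, same result: " + (a == b));
    }
}
```

Each additional Replace statement in the first form costs one more stloc.0/ldloc.0 pair, so the gap grows with the number of statements.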

But wait, surely the compiler knows these excerpts are equivalent, right? Why doesn't it always produce the second, more efficient, set of IL instructions? Because I compiled without optimizations. The compiler thus assumes that I need to be able to set a breakpoint on each C# statement. At a breakpoint, the locals need to be in a consistent state, and the evaluation stack needs to be empty. That's why the first excerpt has steps 5 and 6 - so that the debugger can stop on a breakpoint between those steps, and I'll see that the str local has the value I would expect on that line.

If I compile these excerpts with optimizations on (e.g., using a Release configuration), then indeed the compiler produces the same code for each:

  // no .locals directive
  IL_0000:  ldstr      "String to be tested. String to be tested. String t" + "o be tested."
  IL_0005:  ldstr      "i"
  IL_000a:  ldstr      "in"
  IL_000f:  callvirt   instance string [mscorlib]System.String::Replace(string, string)
  IL_0014:  ldstr      "to"
  IL_0019:  ldstr      "ott"
  IL_001e:  callvirt   instance string [mscorlib]System.String::Replace(string, string)

Now that the compiler knows I won't be able to set breakpoints, it can forgo using a local at all, and have the entire set of operations just occur on the evaluation stack. As a result, it can skip steps 2, 3, 5, and 6, leading to further optimized code.

Joe Sewell answered Dec 31 '22 14:12