Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

String operation optimisation in C#

The following C# code takes 5 minutes to run:

int i = 1;
string fraction = "";
while (fraction.Length < 1000000)
{
    fraction += i.ToString();
    i++;
}

"Optimising it" like this causes it to run in 1.5 seconds:

int i = 1;
string fraction = "";
while (fraction.Length < 1000000)
{
    // concatenating strings is much faster for small strings
    string tmp = "";
    for (int j = 0; j < 1000; j++)
    {
        tmp += i.ToString();
        i++;
    }
    fraction += tmp;
}

EDIT: Some people suggested using StringBuilder, which is an excellent suggestion also, and this comes out at 0.06s:

int i = 1;
StringBuilder fraction = new StringBuilder();
while (fraction.Length < 1000000)
{
    fraction.Append(i);
    i++;
}

Playing around to find the optimum value of j is a topic for another time, but why exactly does this non-obvious optimisation work so well? Also, on a related topic, I've heard it said that you should never use the + operator with strings, in favour of string.Format(), is this true?

like image 614
Matthew Scharley Avatar asked Nov 11 '08 23:11

Matthew Scharley


People also ask

What is Optimisation in C?

Optimization is a program transformation technique, which tries to improve the code by making it consume less resources (i.e. CPU, Memory) and deliver high speed. In optimization, high-level general programming constructs are replaced by very efficient low-level programming codes.

Are C strings faster?

C-strings are usually faster, because they do not call malloc/new.

How many string operations does C have?

There are two ways to declare strings in C: The following example will create a string as "Scaler" where the last character must always be a null character.


2 Answers

I don't get your results at all. On my box StringBuilder wins hands down. Could you post your full test program? Here's mine, with three variants - your string concatenation optimisation, the "simple" StringBuilder one, and StringBuilder with an initial capacity. I've increased the limit as it was going too fast on my box to be usefully measurable.

using System;
using System.Diagnostics;
using System.Text;

public class Test
{
    const int Limit = 4000000;

    static void Main()
    {
        Time(Concatenation, "Concat");
        Time(SimpleStringBuilder, "StringBuilder as in post");
        Time(SimpleStringBuilderNoToString, "StringBuilder calling Append(i)");
        Time(CapacityStringBuilder, "StringBuilder with appropriate capacity");
    }

    static void Time(Action action, string name)
    {
        Stopwatch sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        Console.WriteLine("{0}: {1}ms", name, sw.ElapsedMilliseconds);
        GC.Collect();
        GC.WaitForPendingFinalizers();
    }

    static void Concatenation()
    {
        int i = 1;
        string fraction = "";
        while (fraction.Length < Limit)
        {
            // concatenating strings is much faster for small strings
            string tmp = "";
            for (int j = 0; j < 1000; j++)
            {
                tmp += i.ToString();
                i++;
            }
            fraction += tmp;            
        }
    }

    static void SimpleStringBuilder()
    {
        int i = 1;
        StringBuilder fraction = new StringBuilder();
        while (fraction.Length < Limit)
        {
            fraction.Append(i.ToString());
            i++;
        }
    }

    static void SimpleStringBuilderNoToString()
    {
        int i = 1;
        StringBuilder fraction = new StringBuilder();
        while (fraction.Length < Limit)
        {
            fraction.Append(i);
            i++;
        }
    }

    static void CapacityStringBuilder()
    {
        int i = 1;
        StringBuilder fraction = new StringBuilder(Limit + 10);
        while (fraction.Length < Limit)
        {
            fraction.Append(i);
            i++;
        }
    }
}

And the results:

Concat: 5879ms
StringBuilder as in post: 206ms
StringBuilder calling Append(i): 196ms
StringBuilder with appropriate capacity: 184ms

The reason your concatenation is faster than the very first solution is simple though - you're doing several "cheap" concatenations (where relatively little data is being copied each time) and relatively few "large" concatenations (of the whole string so far). In the original, every step would copy all of the data obtained so far, which is obviously more expensive.

like image 187
Jon Skeet Avatar answered Sep 29 '22 01:09

Jon Skeet


Use StringBuilder for concatenating more than (approximately) 5 strings (results may vary slightly). Also, give the StringBuilder's constructor a hint on the expected maximum size.

[Update]: just commenting on your edit to the question. You can also increase StringBuilder's performance if you have an approximate (or exact) idea of the final size of the concatenated strings, because this will reduce the number of memory allocations it has to perform:

// e.g. Initialise to 10MB
StringBuilder fraction = new StringBuilder(10000000);
like image 24
Mitch Wheat Avatar answered Sep 29 '22 03:09

Mitch Wheat