Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What's causing this spike in time of string concatenation?

So out of curiosity and idle boredom, I was fooling around with benchmarking Shlemiel the painter's algorithm. I started with a blank string, created another one of 1000 blank spaces, and started adding one to the other, using plain old inefficient string concatenation, timing how long it took each time.

string s1 = "";
string s2 = "";
while (s2.Length < 1000)
{
    s2 += " ";
}

while (true)
{
    Stopwatch sw = Stopwatch.StartNew();
    s1 += s2;
    sw.Stop();

    Console.WriteLine(" {0}| {1}", s1.Length, sw.ElapsedMilliseconds);
}

As expected, the longer the string got, the longer it took to concatenate (it was a much smaller impact than I expected, but that's another question for another day). What was surprising, however, was consistent spikes in the time it took. Every sixth concatenation took roughly two to three times as long as the five preceding concatenations.

 Length     | Time (ms)
 -----------------------
 32250000   | 117
 32251000   | 44
 32252000   | 31
 32253000   | 30
 32254000   | 30
 32255000   | 32
 32256000   | 129
 32257000   | 35
 32258000   | 43
 32259000   | 34
 32260000   | 30
 32261000   | 29
 32262000   | 107
 32263000   | 47
 32264000   | 29
 32265000   | 30
 32266000   | 31
 32267000   | 29
 32268000   | 110
 32269000   | 46
 32270000   | 31
 32271000   | 30
 32272000   | 30
 32273000   | 30
 32274000   | 113

These samples are from once the string started getting significantly large, but the pattern holds from the start. Largely the first thousand or so samples are too small to notice the pattern, but around the 1.8k mark it's recognizable.

My first assumption was that behind the scenes, the characters were being stored in some sort of ArrayList/vector type deal, which doubles in size once it's full, but as I thought about it more, that doesn't fit - if that were the case, spike would come in exponential periods, rather than linear.

So, in short: what the hell is going on here?

like image 411
Mike G Avatar asked Oct 02 '22 12:10

Mike G


1 Answers

Creating strings and discarding them creates garbage. Once you've used a certain amount of memory, garbage collections occurs and pauses your process. Since nothing else is going on in your process and you always make your strings the same length, GC always happens at the same time (every 6th run).

To avoid this effect on your timing, call GC.Collect before starting the timer on each run.

like image 103
Gabe Avatar answered Oct 11 '22 15:10

Gabe