So out of curiosity and idle boredom, I was fooling around with benchmarking Shlemiel the painter's algorithm. I started with a blank string, created another one of 1000 blank spaces, and started adding one to the other, using plain old inefficient string concatenation, timing how long it took each time.
string s1 = "";
string s2 = "";
while (s2.Length < 1000)
{
s2 += " ";
}
while (true)
{
Stopwatch sw = Stopwatch.StartNew();
s1 += s2;
sw.Stop();
Console.WriteLine(" {0}| {1}", s1.Length, sw.ElapsedMilliseconds);
}
As expected, the longer the string got, the longer it took to concatenate (it was a much smaller impact than I expected, but that's another question for another day). What was surprising, however, was consistent spikes in the time it took. Every sixth concatenation took roughly two to three times as long as the five preceding concatenations.
Length | Time (ms)
-----------------------
32250000 | 117
32251000 | 44
32252000 | 31
32253000 | 30
32254000 | 30
32255000 | 32
32256000 | 129
32257000 | 35
32258000 | 43
32259000 | 34
32260000 | 30
32261000 | 29
32262000 | 107
32263000 | 47
32264000 | 29
32265000 | 30
32266000 | 31
32267000 | 29
32268000 | 110
32269000 | 46
32270000 | 31
32271000 | 30
32272000 | 30
32273000 | 30
32274000 | 113
These samples are from once the string started getting significantly large, but the pattern holds from the start. Largely the first thousand or so samples are too small to notice the pattern, but around the 1.8k mark it's recognizable.
My first assumption was that behind the scenes, the characters were being stored in some sort of ArrayList/vector type deal, which doubles in size once it's full, but as I thought about it more, that doesn't fit - if that were the case, spike would come in exponential periods, rather than linear.
So, in short: what the hell is going on here?
Creating strings and discarding them creates garbage. Once you've used a certain amount of memory, garbage collections occurs and pauses your process. Since nothing else is going on in your process and you always make your strings the same length, GC always happens at the same time (every 6th run).
To avoid this effect on your timing, call GC.Collect before starting the timer on each run.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With