We have the following code at the very end of a complex image-processing algorithm; it saves the results to a text file. The function's input is a float[,] p_RangeMap that represents the output of the image processing:
StringBuilder stringBuilder = new StringBuilder(30 * 1024 * 1024);
stringBuilder.AppendLine("Row" + ms_csvSeparator + p_RangeMap.GetLength(0));
stringBuilder.AppendLine("Col" + ms_csvSeparator + p_RangeMap.GetLength(1));
Stopwatch stopwatch = Stopwatch.StartNew();
for (int y = 0; y < p_RangeMap.GetLength(0); y++)
{
    for (int x = 0; x < p_RangeMap.GetLength(1); x++)
    {
        stringBuilder.Append(p_RangeMap[y, x].ToString(CultureInfo.InvariantCulture));
        stringBuilder.Append(ms_csvSeparator);
    }
    stringBuilder.AppendLine();
}
stopwatch.Stop();
Console.WriteLine("MeasureRunTime: RangemapConvert: " + stopwatch.Elapsed);
By measuring the runtime of these lines over 6,000 iterations, we got the following chart:
The 6,000 iterations took about 24 hours to run. The horizontal axis represents the iterations, while the vertical axis shows the elapsed time of these lines in seconds. The input was exactly the same for each iteration, and the dimensions of p_RangeMap were 1312 x 3500.
The elapsed time starts at about two seconds, rises to around 13 seconds, then drops back after roughly 900 iterations; this ~900-iteration pattern repeats as a cycle. As you can see, the highest value is around 22 seconds.
Any ideas what could cause such variation in the runtime?
What could cause the periodicity?
It may be worth mentioning that other parts of the code show the same behavior, but this part was the easiest to extract from the source.
UPDATE1:
I've updated the code sample; the StringBuilder is now preallocated with a rough estimate of the file size. We also thought about garbage collection. Please consider the following:
900 iterations means around 3.5 hours and about 16 GB of processed input data (the same picture loaded from file again and again), not to mention the copies created during the image processing for various reasons. I would expect the GC to trigger far more often than once per cycle.
The 16 GB comes from: 1312 * 3500 * 4 bytes * 900 iterations ≈ 16.5 GB.
This is very interesting. Based on your update, where you are allocating about 16 GB, you might consider wrapping your float array and your StringBuilder in a class that lets you reuse the buffers rather than reallocating them on every run (i.e., reuse the same StringBuilder and array each time the whole function is called, so only one instance is created per 6,000 runs).
Here's why: your builder's capacity is about 30 million characters, so its backing buffer (roughly 60 MB, since each char is 2 bytes) is allocated on the Large Object Heap. (Any allocation over 85,000 bytes goes into the LOH.) The GC rules for the LOH are a little different. For one thing, it is only collected during a Gen 2 collection (the most expensive kind), so if you are heavily creating and tearing down large allocations, you could be forcing Gen 2 collections to happen more often. Also, the LOH is not compacted during a Gen 2 GC (because it would be too expensive), so fragmentation makes large allocations more likely to trigger collections: you need a contiguous block of roughly 60 MB every time you allocate the builder.
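The 85,000-byte threshold is easy to observe directly, because LOH objects are reported as belonging to generation 2 from the moment they are allocated. A minimal sketch (the class name LohDemo is mine):

```csharp
using System;

class LohDemo
{
    static void Main()
    {
        // An array below the 85,000-byte threshold is a normal small-object
        // allocation and starts its life in generation 0.
        byte[] small = new byte[80_000];
        Console.WriteLine("small: gen " + GC.GetGeneration(small)); // typically 0

        // An array at or above the threshold goes straight to the Large
        // Object Heap, which the runtime reports as generation 2.
        byte[] large = new byte[90_000];
        Console.WriteLine("large: gen " + GC.GetGeneration(large)); // typically 2
    }
}
```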
I would be very curious to see what your performance is like if you wrap up your float array and your StringBuilder, calling Clear() on the builder between runs to empty it without deallocating it, and keeping track of the used size of your float array.
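A minimal sketch of the reuse pattern I mean (the class name RangeMapWriter and the separator value are my assumptions; the conversion loop is taken from your question). One instance serves all 6,000 runs, so the 30 MB builder is allocated on the LOH exactly once:

```csharp
using System;
using System.Globalization;
using System.Text;

class RangeMapWriter
{
    private const string ms_csvSeparator = ";"; // assumed separator value

    // Allocated once per instance; Clear() later reuses this capacity.
    private readonly StringBuilder m_builder = new StringBuilder(30 * 1024 * 1024);

    public string Convert(float[,] p_RangeMap)
    {
        // Clear() resets Length to zero without releasing the capacity,
        // so repeated calls do not reallocate the large buffer.
        m_builder.Clear();
        m_builder.AppendLine("Row" + ms_csvSeparator + p_RangeMap.GetLength(0));
        m_builder.AppendLine("Col" + ms_csvSeparator + p_RangeMap.GetLength(1));
        for (int y = 0; y < p_RangeMap.GetLength(0); y++)
        {
            for (int x = 0; x < p_RangeMap.GetLength(1); x++)
            {
                m_builder.Append(p_RangeMap[y, x].ToString(CultureInfo.InvariantCulture));
                m_builder.Append(ms_csvSeparator);
            }
            m_builder.AppendLine();
        }
        return m_builder.ToString();
    }
}
```

Note that ToString() still materializes one large string per run; writing the builder's contents straight to a StreamWriter instead would avoid even that allocation.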
Here's an article on the LOH: http://msdn.microsoft.com/en-us/magazine/cc534993.aspx
First look:
My guess is a memory issue. The sharp drops in the graph look like the GC kicking in and doing a garbage collection.
StringBuilder is quite efficient, but I doubt it was built with this scenario in mind.
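One cheap way to test the GC hypothesis: record the collection counts around the measured section and log them next to the elapsed time. If the slow iterations coincide with jumps in the Gen 2 count, the spikes are GC-related. A sketch (the placeholder comment marks where your conversion code would go):

```csharp
using System;
using System.Diagnostics;

class GcProbe
{
    static void Main()
    {
        // Collection counts are cumulative per generation since process start,
        // so the difference across the measured section gives collections
        // that happened during it.
        int gen2Before = GC.CollectionCount(2);
        Stopwatch stopwatch = Stopwatch.StartNew();

        // ... the measured conversion code would run here ...

        stopwatch.Stop();
        int gen2During = GC.CollectionCount(2) - gen2Before;
        Console.WriteLine("Elapsed: " + stopwatch.Elapsed +
                          ", Gen2 collections during run: " + gen2During);
    }
}
```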