Preamble: I was changing a bit of code (manual Max search in an array) to some Linq.Max()
super sexy written line, which brings me to questions about performance (I often deal with big arrays). So I made a little program to test, because I only trust what I see and got this results:
The size is now of 1 elements
With the loop it took: 00:00:00.0000015
With Linq it took: 00:00:00.0000288
The loop is faster: 94,79%
-----------------------------------------
The size is now of 10 elements
With the loop it took: 00:00:00
With Linq it took: 00:00:00.0000007
The loop is faster: 100,00%
-----------------------------------------
The size is now of 100 elements
With the loop it took: 00:00:00
With Linq it took: 00:00:00.0000011
The loop is faster: 100,00%
-----------------------------------------
The size is now of 1 000 elements
With the loop it took: 00:00:00.0000003
With Linq it took: 00:00:00.0000078
The loop is faster: 96,15%
-----------------------------------------
The size is now of 10 000 elements
With the loop it took: 00:00:00.0000063
With Linq it took: 00:00:00.0000765
The loop is faster: 91,76%
-----------------------------------------
The size is now of 100 000 elements
With the loop it took: 00:00:00.0000714
With Linq it took: 00:00:00.0007602
The loop is faster: 90,61%
-----------------------------------------
The size is now of 1 000 000 elements
With the loop it took: 00:00:00.0007669
With Linq it took: 00:00:00.0081737
The loop is faster: 90,62%
-----------------------------------------
The size is now of 10 000 000 elements
With the loop it took: 00:00:00.0070811
With Linq it took: 00:00:00.0754348
The loop is faster: 90,61%
-----------------------------------------
The size is now of 100 000 000 elements
With the loop it took: 00:00:00.0788133
With Linq it took: 00:00:00.7758791
The loop is faster: 89,84%
In short, Linq is almost 10 times slower, which bothered me a bit, so I took a look to the implementation of Max()
:
public static int Max(this IEnumerable<int> source) {
if (source == null) throw Error.ArgumentNull("source");
int value = 0;
bool hasValue = false;
foreach (int x in source) {
if (hasValue) {
if (x > value) value = x;
}
else {
value = x;
hasValue = true;
}
}
if (hasValue) return value;
throw Error.NoElements();
}
As the title already ask, what in this implementation make it 10 times slower? (And it's not about the ForEach
, I've checked)
Of course, I test in Release mode.
Here's my test code (without output):
//----------------
private int[] arrDoubles;
//----------------
Stopwatch watch = new Stopwatch();
//Stop a 100 Millions to avoid memory overflow on my laptop
for (int i = 1; i <= 100000000; i = i * 10)
{
fillArray(i);
watch.Restart();
int max = Int32.MinValue; // Reset
for (int j = 0; j < arrDoubles.Length; j++)
{
max = Math.Max(arrDoubles[j], max);
}
watch.Stop();
TimeSpan loopSpan = watch.Elapsed;
watch.Restart();
max = Int32.MinValue; // Reset
max = arrDoubles.Max();
watch.Stop();
TimeSpan linqSpan = watch.Elapsed;
}
//-------------------------------------------
private void fillArray(int nbValues)
{
int Min = Int32.MinValue;
int Max = Int32.MaxValue;
Random randNum = new Random();
arrDoubles = Enumerable.Repeat(0, nbValues).Select(i => randNum.Next(Min, Max)).ToArray();
}
This is likely to be happening because accessing an array via IEnumerable<>
is much slower than accessing from it's actual array type (even when using foreach
).
The following code demonstrates this. Note how the code in max1()
and max2()
is identical; the only difference is the type of the array
parameter. Both methods are passed the same object during testing.
Try running it from a RELEASE build (and not under the debugger, which will enable debug code even for a release build):
using System;
using System.Collections.Generic;
using System.Diagnostics;
namespace Demo
{
public class Program
{
private static void Main(string[] args)
{
var array = new int[100000000];
var sw = new Stopwatch();
for (int trial = 0; trial < 8; ++trial)
{
sw.Restart();
for (int i = 0; i < 10; ++i)
max1(array);
var elapsed1 = sw.Elapsed;
Console.WriteLine("int[] took " + elapsed1);
sw.Restart();
for (int i = 0; i < 10; ++i)
max2(array);
var elapsed2 = sw.Elapsed;
Console.WriteLine("IEnumerable<int> took " + elapsed2);
Console.WriteLine("\nFirst method was {0} times faster.\n", elapsed2.TotalSeconds / elapsed1.TotalSeconds);
}
}
private static int max1(int[] array)
{
int result = int.MinValue;
foreach (int n in array)
if (n > result)
result = n;
return result;
}
private static int max2(IEnumerable<int> array)
{
int result = int.MinValue;
foreach (int n in array)
if (n > result)
result = n;
return result;
}
}
}
On my PC, the int[]
version is around 10 times faster than the IEnumerable<int>
version.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With