Very brief question. I have a large, randomly ordered string array (100K+ entries) in which I want to find the first occurrence of a desired string. I have two solutions.
From what I have read, my guess is that the 'for loop' currently gives slightly better performance (though that margin could always change), but I also find the LINQ version much more readable. On balance, which method is generally considered current best coding practice, and why?
    string matchString = "dsf897sdf78";
    int matchIndex = -1;
    for (int i = 0; i < array.Length; i++)
    {
        if (array[i] == matchString)
        {
            matchIndex = i;
            break;
        }
    }
or
    int matchIndex = array.Select((r, i) => new { value = r, index = i })
                          .Where(t => t.value == matchString)
                          .Select(s => s.index).First();
Most of the time, LINQ will be a bit slower because it introduces overhead. Do not use LINQ if you care a lot about performance. Use LINQ because you want shorter, more readable, and more maintainable code.
So your experience is that LINQ is faster and makes code harder to read and to maintain?
Yes, it's slower.
The for loop is faster than the foreach loop if the array only needs to be accessed once per iteration.
The best practice depends on what you need:
LINQ really does slow things down with all the indirection. Don't worry about it as 99% of your code does not impact end user performance.
I started with C++ and really learnt how to optimize a piece of code. LINQ is not suited to get the most out of your CPU. So if you measure a LINQ query to be a problem just ditch it. But only then.
For your code sample I'd estimate a 3x slowdown. The allocations (and subsequent GC!) and indirections through the lambdas really hurt.
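If the per-element anonymous objects and chained lambdas are the main cost, one possible middle ground is the framework's built-in array search. The sketch below is an illustration, not something measured in this thread: `IndexSearchSketch`, `FindFirstIndex`, and `FindFirstIndexWithPredicate` are hypothetical names, while `Array.IndexOf` and `Array.FindIndex` are the standard methods. Both return -1 when nothing matches, which mirrors the for loop's `matchIndex = -1` default rather than the exception thrown by `First()`.

```csharp
using System;

public static class IndexSearchSketch
{
    // Hypothetical helper: wraps Array.IndexOf, which compares strings
    // with their default (ordinal) equality, like the == in the loop.
    public static int FindFirstIndex(string[] array, string matchString)
    {
        return Array.IndexOf(array, matchString);
    }

    // Hypothetical helper: same search expressed with a predicate,
    // for cases where the match condition is more than plain equality.
    public static int FindFirstIndexWithPredicate(string[] array, string matchString)
    {
        return Array.FindIndex(array, s => s == matchString);
    }

    public static void Main()
    {
        string[] array = { "abc", "dsf897sdf78", "xyz", "dsf897sdf78" };

        Console.WriteLine(FindFirstIndex(array, "dsf897sdf78"));              // 1
        Console.WriteLine(FindFirstIndex(array, "missing"));                  // -1
        Console.WriteLine(FindFirstIndexWithPredicate(array, "dsf897sdf78")); // 1
    }
}
```

Whether this closes the gap to a hand-written loop on a given runtime is something to measure rather than assume.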
Slightly better performance? A loop will give SIGNIFICANTLY better performance!
Consider the code below. On my system for a RELEASE (not debug) build, it gives:
    Found via loop at index 999999 in 00:00:00.2782047
    Found via linq at index 999999 in 00:00:02.5864703
    Loop was 9.29700432810805 times faster than linq.
The code is deliberately set up so that the item to be found is right at the end. If it was right at the start, things would be quite different.
    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Linq;

    namespace Demo
    {
        public static class Program
        {
            private static void Main(string[] args)
            {
                string[] a = new string[1000000];

                for (int i = 0; i < a.Length; ++i)
                {
                    a[i] = "Won't be found";
                }

                string matchString = "Will be found";

                a[a.Length - 1] = "Will be found";

                const int COUNT = 100;

                var sw = Stopwatch.StartNew();
                int matchIndex = -1;

                for (int outer = 0; outer < COUNT; ++outer)
                {
                    for (int i = 0; i < a.Length; i++)
                    {
                        if (a[i] == matchString)
                        {
                            matchIndex = i;
                            break;
                        }
                    }
                }

                sw.Stop();
                Console.WriteLine("Found via loop at index " + matchIndex + " in " + sw.Elapsed);
                double loopTime = sw.Elapsed.TotalSeconds;

                sw.Restart();

                for (int outer = 0; outer < COUNT; ++outer)
                {
                    matchIndex = a.Select((r, i) => new { value = r, index = i })
                                  .Where(t => t.value == matchString)
                                  .Select(s => s.index).First();
                }

                sw.Stop();
                Console.WriteLine("Found via linq at index " + matchIndex + " in " + sw.Elapsed);
                double linqTime = sw.Elapsed.TotalSeconds;

                Console.WriteLine("Loop was {0} times faster than linq.", linqTime/loopTime);
            }
        }
    }
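The remark that a match right at the start would change things follows from LINQ's deferred execution: `First()` pulls elements through the query one at a time and stops at the first match. A minimal sketch of that behaviour, assuming the same query shape as in the question (`LazyLinqSketch` and `CountExamined` are hypothetical names added only for illustration):

```csharp
using System;
using System.Linq;

public static class LazyLinqSketch
{
    // Hypothetical helper: runs the question's query and returns how many
    // elements the first Select projected before First() returned.
    public static int CountExamined(string[] array, string matchString, out int matchIndex)
    {
        int examined = 0;

        matchIndex = array
            .Select((r, i) => { examined++; return new { value = r, index = i }; })
            .Where(t => t.value == matchString)
            .Select(s => s.index)
            .First(); // throws InvalidOperationException if nothing matches

        return examined;
    }

    public static void Main()
    {
        string[] a = Enumerable.Repeat("Won't be found", 1000).ToArray();
        int index;

        // Match at the very start: only one element is projected.
        a[0] = "Will be found";
        int examinedAtStart = CountExamined(a, "Will be found", out index);
        Console.WriteLine("index " + index + ", examined " + examinedAtStart); // index 0, examined 1

        // Match only at the very end: every element is projected.
        a[0] = "Won't be found";
        a[a.Length - 1] = "Will be found";
        int examinedAtEnd = CountExamined(a, "Will be found", out index);
        Console.WriteLine("index " + index + ", examined " + examinedAtEnd); // index 999, examined 1000
    }
}
```

So the benchmark's worst-case placement is what exposes the full per-element cost; with an early match, both approaches stop almost immediately and the fixed setup cost of the query dominates.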