What is the most efficient loop in C#?

There are a number of different ways to accomplish the same simple loop through the items of an object in C#.

This has made me wonder whether there is any reason, be it performance or ease of use, to use one over the others, or whether it just comes down to personal preference.

Take a simple list:

var myList = new List<MyObject>();

Let's assume the list is filled and we want to iterate over the items.

Method 1

foreach (var item in myList)
{
    // Do stuff
}

Method 2

myList.ForEach(ml =>
{
    // Do stuff
});

Method 3

var enumerator = myList.GetEnumerator();
while (enumerator.MoveNext())
{
    // Do stuff
}

Method 4

for (int i = 0; i < myList.Count; i++)
{
    // Do stuff
}

What I was wondering is: does each of these compile down to the same thing? Is there a clear performance advantage to using one over the others?

Or is this just down to personal preference when coding?

Have I missed any?

asked Mar 06 '13 by TheAlbear



1 Answer

The answer, the majority of the time, is that it does not matter. The number of items in the loop (even what one might consider a "large" number of items, say in the thousands) isn't going to have a noticeable impact on performance.

Of course, if you identify this as a bottleneck in your situation, by all means, address it, but you have to identify the bottleneck first.

That said, there are a number of things to take into consideration with each approach, which I'll outline here.

Let's define a few things first:

  • All of the tests were run on .NET 4.0 on a 32-bit processor.
  • TimeSpan.TicksPerSecond on my machine = 10,000,000 (see the tick-to-seconds sketch after this list)
  • All tests were performed in separate unit test sessions, not in the same one (so as not to possibly interfere with garbage collections, etc.)
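
Since the timings below are reported as raw Stopwatch ticks, here's a small conversion sketch (an addition for readability, not part of the original test code). Note that Stopwatch.ElapsedTicks is measured against Stopwatch.Frequency, which is not guaranteed to equal TimeSpan.TicksPerSecond:

// Hypothetical helper: converts Stopwatch ticks to seconds.
// Stopwatch ticks are relative to Stopwatch.Frequency, which may
// differ from TimeSpan.TicksPerSecond (10,000,000 on this machine).
public static double TicksToSeconds(long stopwatchTicks)
{
    return (double)stopwatchTicks / Stopwatch.Frequency;
}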

Here are some helpers that are needed for each test:

The MyObject class:

public class MyObject
{
    public int IntValue { get; set; }
    public double DoubleValue { get; set; }
}

A method to create a List<T> of any length of MyObject instances:

public static List<MyObject> CreateList(int items)
{
    // Validate parameters.
    if (items < 0)
        throw new ArgumentOutOfRangeException("items", items,
            "The items parameter must be a non-negative value.");

    // Return the items in a list.
    return Enumerable.Range(0, items).
        Select(i => new MyObject { IntValue = i, DoubleValue = i }).
        ToList();
}

An action to perform for each item in the list (needed because Method 2 uses a delegate, and a call needs to be made to something to measure impact):

public static void MyObjectAction(MyObject obj, TextWriter writer)
{
    // Validate parameters.
    Debug.Assert(obj != null);
    Debug.Assert(writer != null);

    // Write.
    writer.WriteLine("MyObject.IntValue: {0}, MyObject.DoubleValue: {1}",
        obj.IntValue, obj.DoubleValue);
}

A method to create a TextWriter which writes to a null Stream (basically a data sink):

public static TextWriter CreateNullTextWriter()
{
    // Create a stream writer off a null stream.
    return new StreamWriter(Stream.Null);
}

And let's fix the number of items at one million (1,000,000, which should be sufficiently high to enforce that generally, these all have about the same performance impact):

// The number of items to test.
public const int ItemsToTest = 1000000;

Let's get into the methods:

Method 1: foreach

The following code:

foreach (var item in myList)
{
    // Do stuff
}

Compiles down into the following:

using (var enumerator = myList.GetEnumerator())
{
    while (enumerator.MoveNext())
    {
        var item = enumerator.Current;

        // Do stuff.
    }
}

There's quite a bit going on there. You have the method calls (which may or may not be against the IEnumerator<T> or IEnumerator interfaces, as the compiler respects duck-typing in this case), and your // Do stuff is hoisted into that while structure.
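
To illustrate that duck-typing, the following hypothetical type works with foreach even though it implements neither IEnumerable nor IEnumerable<T>; the compiler only looks for a suitable GetEnumerator method whose return type exposes MoveNext() and Current:

// Hypothetical example: foreach binds to this type purely by pattern,
// with no IEnumerable/IEnumerator interfaces involved.
public class CountDown
{
    public Enumerator GetEnumerator() { return new Enumerator(3); }

    public class Enumerator
    {
        private int current;

        public Enumerator(int start) { current = start + 1; }

        // The compiler requires a bool MoveNext() method...
        public bool MoveNext() { return --current >= 0; }

        // ...and a Current property (of any type).
        public int Current { get { return current; } }
    }
}

// Usage: prints 3, 2, 1, 0.
// foreach (int i in new CountDown()) Console.WriteLine(i);

Because this enumerator doesn't implement IDisposable, the compiler also omits the using/Dispose call from the expansion.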

Here's the test to measure the performance:

[TestMethod]
public void TestForEachKeyword()
{
    // Create the list.
    List<MyObject> list = CreateList(ItemsToTest);

    // Create the writer.
    using (TextWriter writer = CreateNullTextWriter())
    {
        // Create the stopwatch.
        Stopwatch s = Stopwatch.StartNew();

        // Cycle through the items.
        foreach (var item in list)
        {
            // Write the values.
            MyObjectAction(item, writer);
        }

        // Write out the number of ticks.
        Debug.WriteLine("Foreach loop ticks: {0}", s.ElapsedTicks);
    }
}

The output:

Foreach loop ticks: 3210872841

Method 2: .ForEach method on List<T>

The code for the .ForEach method on List<T> looks something like this:

public void ForEach(Action<T> action)
{
    // Error handling omitted.

    // Cycle through the items, perform action.
    for (int index = 0; index < Count; ++index)
    {
        // Perform action.
        action(this[index]);
    }
}

Note that this is functionally equivalent to Method 4, with one exception: the code that is hoisted into the for loop is passed in as a delegate. This requires a dereference to get to the code that needs to be executed. While the performance of delegates has improved from .NET 3.0 on, that overhead is there.
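
Related to that delegate cost: a lambda that captures locals (like writer in the tests here) forces the compiler to allocate a closure object at the call site. Here's a minimal sketch of avoiding that allocation when the body needs no locals; the Process method and its body are hypothetical, not part of the tests above:

// A lambda that captures nothing can be cached in a static field,
// so one delegate instance is reused instead of allocating per call.
private static readonly Action<MyObject> CachedAction =
    obj => Console.WriteLine(obj.IntValue);

public static void Process(List<MyObject> list)
{
    // No closure allocation here; the cached delegate is reused.
    list.ForEach(CachedAction);
}

The compiler often performs this caching itself for non-capturing lambdas; a capturing lambda cannot be cached this way.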

However, it's negligible. The test to measure the performance:

[TestMethod]
public void TestForEachMethod()
{
    // Create the list.
    List<MyObject> list = CreateList(ItemsToTest);

    // Create the writer.
    using (TextWriter writer = CreateNullTextWriter())
    {
        // Create the stopwatch.
        Stopwatch s = Stopwatch.StartNew();

        // Cycle through the items.
        list.ForEach(i => MyObjectAction(i, writer));

        // Write out the number of ticks.
        Debug.WriteLine("ForEach method ticks: {0}", s.ElapsedTicks);
    }
}

The output:

ForEach method ticks: 3135132204

That's actually about 7.5 seconds faster than the foreach loop (3,210,872,841 − 3,135,132,204 = 75,740,637 ticks, and at 10,000,000 ticks per second that's roughly 7.57 seconds). Not completely surprising, given that it uses direct array access instead of going through IEnumerable<T>.

Remember, though, that this translates to 0.0000075740637 seconds (about 7.6 microseconds) saved per item. That's not worth chasing for small lists of items.

Method 3: while (enumerator.MoveNext())

As shown in Method 1, this is essentially what the compiler generates (with the addition of the using statement, which is good practice). You gain nothing by writing out by hand the code the compiler would otherwise generate for you.

For kicks, let's do it anyway:

[TestMethod]
public void TestEnumerator()
{
    // Create the list.
    List<MyObject> list = CreateList(ItemsToTest);

    // Create the writer.
    using (TextWriter writer = CreateNullTextWriter())
    // Get the enumerator.
    using (IEnumerator<MyObject> enumerator = list.GetEnumerator())
    {
        // Create the stopwatch.
        Stopwatch s = Stopwatch.StartNew();

        // Cycle through the items.
        while (enumerator.MoveNext())
        {
            // Write.
            MyObjectAction(enumerator.Current, writer);
        }

        // Write out the number of ticks.
        Debug.WriteLine("Enumerator loop ticks: {0}", s.ElapsedTicks);
    }
}

The output:

Enumerator loop ticks: 3241289895

Method 4: for

In this particular case, you're going to gain some speed, since the list indexer goes directly to the underlying array to perform the lookup (that's an implementation detail, by the way; nothing says List<T> can't be backed by a tree structure instead).
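
For reference, here's a simplified sketch of what that indexer does internally (not the actual BCL source, which also maintains a version counter for enumerator invalidation); _items and _size stand in for the list's private backing fields:

// Simplified sketch of List<T>'s indexer.
public T this[int index]
{
    get
    {
        // A single bounds check against the logical size...
        if ((uint)index >= (uint)_size)
            throw new ArgumentOutOfRangeException("index");

        // ...followed by a direct read from the backing array.
        return _items[index];
    }
}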

[TestMethod]
public void TestListIndexer()
{
    // Create the list.
    List<MyObject> list = CreateList(ItemsToTest);

    // Create the writer.
    using (TextWriter writer = CreateNullTextWriter())
    {
        // Create the stopwatch.
        Stopwatch s = Stopwatch.StartNew();

        // Cycle by index.
        for (int i = 0; i < list.Count; ++i)
        {
            // Get the item.
            MyObject item = list[i];

            // Perform the action.
            MyObjectAction(item, writer);
        }

        // Write out the number of ticks.
        Debug.WriteLine("List indexer loop ticks: {0}", s.ElapsedTicks);
    }
}

The output:

List indexer loop ticks: 3039649305

However, the place where this can make a difference is arrays. Loops over arrays can be unrolled by the compiler to process multiple items at a time.

Instead of doing ten iterations of one item each in a ten-item loop, the compiler can unroll this into five iterations of two items each.

However, I'm not positive this is actually happening here (I'd have to look at the IL and the JIT-compiled output of that IL to confirm).
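
To make the idea concrete, here's what an unrolled-by-two version of the array loop would look like if written by hand; this is purely an illustration of the transformation (reusing MyObjectAction, array, and writer from the surrounding tests), not a recommendation:

// Manually unrolled by a factor of two: half as many loop-condition
// checks, two items processed per iteration.
int i = 0;
for (; i < array.Length - 1; i += 2)
{
    MyObjectAction(array[i], writer);
    MyObjectAction(array[i + 1], writer);
}

// Handle the leftover item when the length is odd.
for (; i < array.Length; i++)
{
    MyObjectAction(array[i], writer);
}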

Here's the test:

[TestMethod]
public void TestArray()
{
    // Create the list.
    MyObject[] array = CreateList(ItemsToTest).ToArray();

    // Create the writer.
    using (TextWriter writer = CreateNullTextWriter())
    {
        // Create the stopwatch.
        Stopwatch s = Stopwatch.StartNew();

        // Cycle by index.
        for (int i = 0; i < array.Length; ++i)
        {
            // Get the item.
            MyObject item = array[i];

            // Perform the action.
            MyObjectAction(item, writer);
        }

        // Write out the number of ticks.
        Debug.WriteLine("Array loop ticks: {0}", s.ElapsedTicks);
    }
}

The output:

Array loop ticks: 3102911316

It should be noted that, out of the box, ReSharper offers a refactoring suggestion to change the above for statements into foreach statements. That's not to say foreach is always right, but the aim is to reduce the amount of technical debt in the code.


TL;DR

You really shouldn't be concerned with the performance of these constructs unless testing in your situation shows that you have a real bottleneck (and you'll need massive numbers of items before the differences have any impact).

Generally, you should go for what's most maintainable, in which case, Method 1 (foreach) is the way to go.

answered Sep 30 '22 by casperOne