
When to use which for?

EDIT: Additional options and a slightly extended question below.

Consider this contrived and abstract example of a class body. It demonstrates four different ways of performing a "for" iteration.

private abstract class SomeClass
{
    public abstract void someAction();
}

void Examples()
{
    List<SomeClass> someList = new List<SomeClass>();

    //A. for
    for (int i = 0; i < someList.Count(); i++)
    {
        someList[i].someAction();
    }

    //B. foreach
    foreach (SomeClass o in someList)
    {
        o.someAction();
    }

    //C. foreach extension
    someList.ForEach(o => o.someAction());

    //D. plinq
    someList.AsParallel().ForAll(o => o.someAction());

    // EDIT: Addition of some options from answers and research.

    //E. ParallelEnumerable
    ParallelEnumerable.Range(0, someList.Count) // Range takes (start, count)
        .ForAll(i => someList[i].someAction());

    //F. ForEach Parallel Extension
    Parallel.ForEach(someList, o => o.someAction());

    //G. For Parallel Extension
    Parallel.For(0, someList.Count, i => someList[i].someAction()); // upper bound is exclusive
}

My question comes in two parts. Have I missed some significant option? Which option is the best choice, considering readability but primarily performance?

Please indicate if the complexity of the SomeClass implementation, or the Count of someList, would affect this choice.

EDIT: With such a dizzying array of options, I wouldn't like my code to be spoilt by choice. To add a third part to my question: if my list could be any length, should I default to a parallel option?

As a straw man, I suspect that, over all implementations of SomeClass and all lengths of someList, option E (ParallelEnumerable) would offer the best average performance, given the prevalence of multiprocessor architectures. I haven't done any testing to prove this.

Note: Parallel.For and Parallel.ForEach require the System.Threading.Tasks namespace; AsParallel and ParallelEnumerable live in System.Linq.

Jodrell asked May 19 '11 12:05



3 Answers

You missed:

Parallel.ForEach(someList, o => o.someAction());
Parallel.For(0, someList.Count, i => someList[i].someAction());
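A note on bounds for the index-based forms: Parallel.For takes an exclusive upper bound, and ParallelEnumerable.Range takes a start and a count, so both need the full element count to visit every item. The following is a minimal sketch of that behaviour; the BoundsDemo class and its helper names are hypothetical, and a list of ints stands in for someList.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical helper class, for illustration only.
public static class BoundsDemo
{
    // Indices visited by Parallel.For(0, toExclusive, ...), sorted for display.
    public static List<int> VisitedByParallelFor(int toExclusive)
    {
        var visited = new ConcurrentBag<int>(); // thread-safe collection
        Parallel.For(0, toExclusive, i => visited.Add(i));
        return visited.OrderBy(i => i).ToList();
    }

    // Indices produced by ParallelEnumerable.Range(0, count), sorted for display.
    public static List<int> VisitedByRange(int count)
    {
        var visited = new ConcurrentBag<int>();
        ParallelEnumerable.Range(0, count).ForAll(i => visited.Add(i));
        return visited.OrderBy(i => i).ToList();
    }

    public static void Main()
    {
        var someList = new List<int> { 10, 20, 30, 40 };

        // Passing Count visits every index.
        Console.WriteLine(string.Join(",", VisitedByParallelFor(someList.Count)));     // 0,1,2,3
        Console.WriteLine(string.Join(",", VisitedByRange(someList.Count)));           // 0,1,2,3

        // Passing Count - 1 silently skips the last element.
        Console.WriteLine(string.Join(",", VisitedByParallelFor(someList.Count - 1))); // 0,1,2
    }
}
```

The results are sorted only to make the output readable; the parallel forms give no ordering guarantee for the calls themselves.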
benwasd answered Oct 19 '22 16:10



Option A only really makes sense for sequences that implement indexing and will only be performant for those that have O(1) lookup time. Generally, I would use the foreach and variants unless you have special logic.

Also note that "special logic" like for (int i = 1; i < list.Count; i++) can often be expressed with LINQ extension methods instead: foreach (var item in sequence.Skip(1)).
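To make the equivalence concrete, here is a small sketch showing that the index-based loop starting at 1 and the Skip(1) form visit the same items. The SkipDemo class and its method names are hypothetical, chosen only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, for illustration only.
public static class SkipDemo
{
    // "Special logic" with an index: start from the second element.
    public static List<string> WithFor(List<string> list)
    {
        var result = new List<string>();
        for (int i = 1; i < list.Count; i++)
        {
            result.Add(list[i]);
        }
        return result;
    }

    // The same logic with foreach and Skip; works on any IEnumerable<T>,
    // not only on collections that support indexing.
    public static List<string> WithSkip(IEnumerable<string> sequence)
    {
        var result = new List<string>();
        foreach (var item in sequence.Skip(1))
        {
            result.Add(item);
        }
        return result;
    }

    public static void Main()
    {
        var list = new List<string> { "a", "b", "c" };
        Console.WriteLine(string.Join(",", WithFor(list)));  // b,c
        Console.WriteLine(string.Join(",", WithSkip(list))); // b,c
    }
}
```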

So, generally prefer B over A.

As to C: This can be confusing for other developers if they aren't used to the functional style.

As to D: This will depend on a lot of factors. I guess for simple calculations, you don't want to do this - you will only really benefit from parallelization if the loop body takes a while to compute.
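One way to check whether a loop body is heavy enough to benefit is simply to time both forms. The sketch below does this with Stopwatch. ExpensiveWork is a hypothetical stand-in for a body that takes a while to compute, and the iteration counts are arbitrary assumptions; actual timings depend on the machine and are not predicted here.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

// Hypothetical helper class, for illustration only.
public static class ParallelCostDemo
{
    // Stand-in for a CPU-bound loop body; the constants are arbitrary.
    public static long ExpensiveWork(int seed)
    {
        long acc = seed;
        for (int i = 0; i < 200000; i++)
        {
            acc = (acc * 31 + i) % 1000003;
        }
        return acc;
    }

    public static long SumSequential(int[] inputs)
    {
        return inputs.Select(x => ExpensiveWork(x)).Sum();
    }

    public static long SumParallel(int[] inputs)
    {
        return inputs.AsParallel().Select(x => ExpensiveWork(x)).Sum();
    }

    public static void Main()
    {
        int[] inputs = Enumerable.Range(0, 500).ToArray();

        var sw = Stopwatch.StartNew();
        long sequential = SumSequential(inputs);
        Console.WriteLine("sequential: " + sw.ElapsedMilliseconds + " ms");

        sw.Restart();
        long parallel = SumParallel(inputs);
        Console.WriteLine("parallel:   " + sw.ElapsedMilliseconds + " ms");

        // Both forms compute the same total; only the elapsed time differs.
        Console.WriteLine("same result: " + (sequential == parallel));
    }
}
```

Shrinking the inner loop in ExpensiveWork to a handful of iterations is a quick way to see the case described above, where the parallel overhead is no longer paid back by the work done per item.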

Daren Thomas answered Oct 19 '22 16:10



The IL shows us that the for loop is the most efficient. There's no state machine to worry about.

for produces the following

IL_0036:  br.s        IL_0048
IL_0038:  ldloc.0     
IL_0039:  ldloc.1     
IL_003A:  callvirt    System.Collections.Generic.List<UserQuery+SomeClass>.get_Item
IL_003F:  callvirt    UserQuery+SomeClass.someAction
IL_0044:  ldloc.1     
IL_0045:  ldc.i4.1    
IL_0046:  add         
IL_0047:  stloc.1     
IL_0048:  ldloc.1     
IL_0049:  ldloc.0     
IL_004A:  call        System.Linq.Enumerable.Count
IL_004F:  blt.s       IL_0038

IL_0051: ret

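The IL produced here for foreach shows the state machine at work: the enumerator is obtained with GetEnumerator, advanced with MoveNext, read through Current, and finally disposed. The LINQ version and the ForEach produce similar output.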

IL_0035:  callvirt    System.Collections.Generic.List<UserQuery+SomeClass>.GetEnumerator
IL_003A:  stloc.3     
IL_003B:  br.s        IL_004B
IL_003D:  ldloca.s    03 
IL_003F:  call        System.Collections.Generic.List<UserQuery+SomeClass>.get_Current
IL_0044:  stloc.1     
IL_0045:  ldloc.1     
IL_0046:  callvirt    UserQuery+SomeClass.someAction
IL_004B:  ldloca.s    03 
IL_004D:  call        System.Collections.Generic.List<UserQuery+SomeClass>.MoveNext
IL_0052:  brtrue.s    IL_003D
IL_0054:  leave.s     IL_0064
IL_0056:  ldloca.s    03 
IL_0058:  constrained. System.Collections.Generic.List<>.Enumerator
IL_005E:  callvirt    System.IDisposable.Dispose
IL_0063:  endfinally  
IL_0064:  ret   

I haven't done any tests but I think it's a safe assumption.

That being said, it doesn't mean the for keyword should always be used. It all depends on your style, your team's style, or whether the piece of code you're writing really needs every CPU cycle you can get your hands on.

I don't think I would compare AsParallel() with the for, foreach or lambda equivalents. You'd use AsParallel() to split up CPU-intensive tasks or blocking operations; you wouldn't use it just for iterating over a "normal" collection.

Razor answered Oct 19 '22 14:10
