
Why is IQueryable twice as fast as IEnumerable when using Linq To Objects?

I know the difference between IQueryable and IEnumerable, and I know that collections are supported by Linq To Objects via the IEnumerable interface.

What puzzles me is that queries are executed twice as fast when the collection is converted to an IQueryable.

Let l be a populated List<int>; a Linq query over it is then about twice as fast if l is first converted to an IQueryable via l.AsQueryable().

I have written a simple test with VS2010SP1 and .NET 4.0 that demonstrates this:

private void Test()
{
  const int numTests = 1;
  const int size = 1000 * 1000;
  var l = new List<int>();
  var resTimesEnumerable = new List<long>();
  var resTimesQueryable = new List<long>();
  System.Diagnostics.Stopwatch sw = new System.Diagnostics.Stopwatch();

  for ( int x=0; x<size; x++ )
  {
    l.Add( x );
  }

  Console.WriteLine( "Testdata size: {0} numbers", size );
  Console.WriteLine( "Testdata iterations: {0}", numTests );

  for ( int n = 0; n < numTests; n++ )
  {
    sw.Restart();
    var result = from i in l.AsEnumerable() where (i % 10) == 0 && (i % 3) != 0 select i;
    result.ToList();
    sw.Stop();
    resTimesEnumerable.Add( sw.ElapsedMilliseconds );
  }
  Console.WriteLine( "TestEnumerable" );
  Console.WriteLine( "  Min: {0}", Enumerable.Min( resTimesEnumerable ) );
  Console.WriteLine( "  Max: {0}", Enumerable.Max( resTimesEnumerable ) );
  Console.WriteLine( "  Avg: {0}", Enumerable.Average( resTimesEnumerable ) );

  for ( int n = 0; n < numTests; n++ )
  {
    sw.Restart();
    var result = from i in l.AsQueryable() where (i % 10) == 0 && (i % 3) != 0 select i;
    result.ToList();
    sw.Stop();
    resTimesQueryable.Add( sw.ElapsedMilliseconds );
  }
  Console.WriteLine( "TestQuerable" );
  Console.WriteLine( "  Min: {0}", Enumerable.Min( resTimesQueryable ) );
  Console.WriteLine( "  Max: {0}", Enumerable.Max( resTimesQueryable ) );
  Console.WriteLine( "  Avg: {0}", Enumerable.Average( resTimesQueryable ) );
}

Running this test (once with numTests == 1 and once with numTests == 10) produces the following output:

Testdata size: 1000000 numbers
Testdata iterations: 1
TestEnumerable
  Min: 44
  Max: 44
  Avg: 44
TestQuerable
  Min: 37
  Max: 37
  Avg: 37

Testdata size: 1000000 numbers
Testdata iterations: 10
TestEnumerable
  Min: 22
  Max: 29
  Avg: 23,9
TestQuerable
  Min: 12
  Max: 22
  Avg: 13,9

Repeating the test but switching the order (i.e. first measuring IQueryable and then IEnumerable) gives different results!

Testdata size: 1000000 numbers
Testdata iterations: 1
TestQuerable
  Min: 75
  Max: 75
  Avg: 75
TestEnumerable
  Min: 25
  Max: 25
  Avg: 25

Testdata size: 1000000 numbers
Testdata iterations: 10
TestQuerable
  Min: 12
  Max: 28
  Avg: 14
TestEnumerable
  Min: 22
  Max: 26
  Avg: 23,4

Here are my questions:

  1. What am I doing wrong?
  2. Why is IEnumerable faster if the test is executed after the IQueryable test?
  3. Why is IQueryable faster when the number of test runs is increased?
  4. Is there a penalty involved using IQueryable instead of IEnumerable?

I ask these questions because I was wondering which one to use for my Repository Interface. Right now the repositories query collections in memory (Linq to Objects), but in the future the source might be an SQL database. If I design the repository classes with IQueryable now, I can painlessly switch to Linq to SQL later. However, if there is a performance penalty involved, then sticking to IEnumerable while there is no SQL involved seems wiser.
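One factor that may contribute to the order-dependent numbers above is one-time start-up cost: whichever query runs first also pays for JIT compilation of the code paths it touches, so it is measured at a disadvantage. The following is a minimal sketch of a variant of the test that runs each query once untimed before measuring. The names BenchmarkSketch and Measure, and the choice of a single warm-up run, are assumptions of this sketch and not part of the original test.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

static class BenchmarkSketch
{
    // Runs the query once untimed (warm-up), then returns the
    // elapsed milliseconds of 'runs' timed executions.
    public static List<long> Measure(Func<List<int>> query, int runs)
    {
        query(); // warm-up: absorbs one-time costs such as JIT compilation

        var times = new List<long>();
        var sw = new Stopwatch();
        for (int n = 0; n < runs; n++)
        {
            sw.Restart();
            query();
            sw.Stop();
            times.Add(sw.ElapsedMilliseconds);
        }
        return times;
    }

    static void Main()
    {
        List<int> l = Enumerable.Range(0, 1000 * 1000).ToList();

        // Same filter as in the original test, once per interface.
        List<long> enumerable = Measure(
            () => (from i in l.AsEnumerable() where (i % 10) == 0 && (i % 3) != 0 select i).ToList(), 10);
        List<long> queryable = Measure(
            () => (from i in l.AsQueryable() where (i % 10) == 0 && (i % 3) != 0 select i).ToList(), 10);

        Console.WriteLine("Enumerable  Min: {0}  Max: {1}  Avg: {2}",
            enumerable.Min(), enumerable.Max(), enumerable.Average());
        Console.WriteLine("Queryable   Min: {0}  Max: {1}  Avg: {2}",
            queryable.Min(), queryable.Max(), queryable.Average());
    }
}
```

If the gap between the two variants still changes with the order of the two Measure calls after a warm-up, something other than first-call cost is likely involved.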

asked Jun 14 '12 12:06 by rbu


1 Answer

Using LINQPad to examine the IL code, here's what I'm seeing:

For this code:

var l = Enumerable.Range(0,100);

var result = from i in l.AsEnumerable() where (i % 10) == 0 && (i % 3) != 0 select i;

This is generated:

IL_0001:  ldc.i4.0    
IL_0002:  ldc.i4.s    64 
IL_0004:  call        System.Linq.Enumerable.Range
IL_0009:  stloc.0     
IL_000A:  ldloc.0     
IL_000B:  call        System.Linq.Enumerable.AsEnumerable
IL_0010:  ldsfld      UserQuery.CS$<>9__CachedAnonymousMethodDelegate1
IL_0015:  brtrue.s    IL_002A
IL_0017:  ldnull      
IL_0018:  ldftn       b__0
IL_001E:  newobj      System.Func<System.Int32,System.Boolean>..ctor
IL_0023:  stsfld      UserQuery.CS$<>9__CachedAnonymousMethodDelegate1
IL_0028:  br.s        IL_002A
IL_002A:  ldsfld      UserQuery.CS$<>9__CachedAnonymousMethodDelegate1
IL_002F:  call        System.Linq.Enumerable.Where
IL_0034:  stloc.1     

b__0:
IL_0000:  ldarg.0     
IL_0001:  ldc.i4.s    0A 
IL_0003:  rem         
IL_0004:  brtrue.s    IL_0011
IL_0006:  ldarg.0     
IL_0007:  ldc.i4.3    
IL_0008:  rem         
IL_0009:  ldc.i4.0    
IL_000A:  ceq         
IL_000C:  ldc.i4.0    
IL_000D:  ceq         
IL_000F:  br.s        IL_0012
IL_0011:  ldc.i4.0    
IL_0012:  stloc.0     
IL_0013:  br.s        IL_0015
IL_0015:  ldloc.0     
IL_0016:  ret         
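Read back into C#, the listing above amounts to roughly the following: the lambda becomes an ordinary static method, and the delegate wrapping it is created once and cached in a static field. This is a hand-written sketch, not compiler output. The names CachedPredicate and IsMatch are placeholders standing in for the compiler-generated CS$<>9__CachedAnonymousMethodDelegate1 and b__0.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DelegateSketch
{
    // Placeholder for the compiler-generated static cache field.
    private static Func<int, bool> CachedPredicate;

    // Placeholder for the compiler-generated method b__0.
    private static bool IsMatch(int i)
    {
        return (i % 10) == 0 && (i % 3) != 0;
    }

    public static IEnumerable<int> Query(IEnumerable<int> source)
    {
        // Mirrors ldsfld / brtrue.s / newobj / stsfld:
        // the delegate is only created on first use.
        if (CachedPredicate == null)
        {
            CachedPredicate = new Func<int, bool>(IsMatch);
        }
        return Enumerable.Where(source.AsEnumerable(), CachedPredicate);
    }

    static void Main()
    {
        // Multiples of 10 below 100 that are not multiples of 3.
        Console.WriteLine(string.Join(", ", Query(Enumerable.Range(0, 100))));
        // prints: 10, 20, 40, 50, 70, 80
    }
}
```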

And for this code:

var l = Enumerable.Range(0,100);

var result = from i in l.AsQueryable() where (i % 10) == 0 && (i % 3) != 0 select i;

We get this:

IL_0001:  ldc.i4.0    
IL_0002:  ldc.i4.s    64 
IL_0004:  call        System.Linq.Enumerable.Range
IL_0009:  stloc.0     
IL_000A:  ldloc.0     
IL_000B:  call        System.Linq.Queryable.AsQueryable
IL_0010:  ldtoken     System.Int32
IL_0015:  call        System.Type.GetTypeFromHandle
IL_001A:  ldstr       "i"
IL_001F:  call        System.Linq.Expressions.Expression.Parameter
IL_0024:  stloc.2     
IL_0025:  ldloc.2     
IL_0026:  ldc.i4.s    0A 
IL_0028:  box         System.Int32
IL_002D:  ldtoken     System.Int32
IL_0032:  call        System.Type.GetTypeFromHandle
IL_0037:  call        System.Linq.Expressions.Expression.Constant
IL_003C:  call        System.Linq.Expressions.Expression.Modulo
IL_0041:  ldc.i4.0    
IL_0042:  box         System.Int32
IL_0047:  ldtoken     System.Int32
IL_004C:  call        System.Type.GetTypeFromHandle
IL_0051:  call        System.Linq.Expressions.Expression.Constant
IL_0056:  call        System.Linq.Expressions.Expression.Equal
IL_005B:  ldloc.2     
IL_005C:  ldc.i4.3    
IL_005D:  box         System.Int32
IL_0062:  ldtoken     System.Int32
IL_0067:  call        System.Type.GetTypeFromHandle
IL_006C:  call        System.Linq.Expressions.Expression.Constant
IL_0071:  call        System.Linq.Expressions.Expression.Modulo
IL_0076:  ldc.i4.0    
IL_0077:  box         System.Int32
IL_007C:  ldtoken     System.Int32
IL_0081:  call        System.Type.GetTypeFromHandle
IL_0086:  call        System.Linq.Expressions.Expression.Constant
IL_008B:  call        System.Linq.Expressions.Expression.NotEqual
IL_0090:  call        System.Linq.Expressions.Expression.AndAlso
IL_0095:  ldc.i4.1    
IL_0096:  newarr      System.Linq.Expressions.ParameterExpression
IL_009B:  stloc.3     
IL_009C:  ldloc.3     
IL_009D:  ldc.i4.0    
IL_009E:  ldloc.2     
IL_009F:  stelem.ref  
IL_00A0:  ldloc.3     
IL_00A1:  call        System.Linq.Expressions.Expression.Lambda
IL_00A6:  call        System.Linq.Queryable.Where
IL_00AB:  stloc.1     

So it appears the difference is that the AsQueryable version constructs an expression tree for the predicate at the call site, whereas the AsEnumerable version only loads a cached delegate to an already compiled method.
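To make the expression tree visible, the sketch below builds the same predicate by hand with the factory methods that appear in the IL (Expression.Parameter, Constant, Modulo, Equal, NotEqual, AndAlso, Lambda) and then compiles it into a delegate. The class, method and variable names are made up for illustration; normally the C# compiler emits these calls for you whenever a lambda is converted to an Expression<Func<...>>.

```csharp
using System;
using System.Linq.Expressions;

static class ExpressionTreeSketch
{
    // Hand-built equivalent of: i => (i % 10) == 0 && (i % 3) != 0
    public static Expression<Func<int, bool>> BuildPredicate()
    {
        ParameterExpression i = Expression.Parameter(typeof(int), "i");

        // (i % 10) == 0
        BinaryExpression left = Expression.Equal(
            Expression.Modulo(i, Expression.Constant(10, typeof(int))),
            Expression.Constant(0, typeof(int)));

        // (i % 3) != 0
        BinaryExpression right = Expression.NotEqual(
            Expression.Modulo(i, Expression.Constant(3, typeof(int))),
            Expression.Constant(0, typeof(int)));

        return Expression.Lambda<Func<int, bool>>(Expression.AndAlso(left, right), i);
    }

    static void Main()
    {
        Expression<Func<int, bool>> tree = BuildPredicate();

        // The tree is data that can be inspected, not executable code.
        Console.WriteLine(tree.Body.NodeType); // AndAlso

        // Turning it into a callable delegate is a separate, explicit step.
        Func<int, bool> predicate = tree.Compile();
        Console.WriteLine(predicate(20)); // True
        Console.WriteLine(predicate(30)); // False
    }
}
```

With AsEnumerable there is no such intermediate representation: Where receives the delegate directly.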

answered Dec 11 '22 20:12 by asawyer