Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Optimizing away OrderBy() when using Any()

So I have a fairly standard LINQ-to-Object setup.

var query = expensiveSrc.Where(x=> x.HasFoo)
                        .OrderBy(y => y.Bar.Count())
                        .Select(z => z.FrobberName);    

// ...

if (!condition && !query.Any())
 return; // seems to enumerate and sort entire enumerable 

// ...

foreach (var item in query)
   // ...

This enumerates everything twice. Which is bad.

var queryFiltered = expensiveSrc.Where(x=> x.HasFoo);

var query = queryFiltered.OrderBy(y => y.Bar.Count())
                         .Select(z => z.FrobberName); 

if (!condition && !queryFiltered.Any())
   return;

// ...

foreach (var item in query)
   // ...

Works, but is there a better way?

Would there be any non-insane way to "enlighten" Any() to bypass the non-required operations? I think I remember this sort of optimisation going into EduLinq.

like image 333
Fowl Avatar asked Jan 24 '12 00:01

Fowl


People also ask

How will you increase the performance of sorting in Oracle?

The SORT_AREA_SIZE parameter controls the memory available for sorting for ORDER BY queries. You should increase the size of this parameter if you frequently order by structured columns. See Also: Oracle Database Administrator's Guide for more information on setting SGA related parameters.


2 Answers

Why not just get rid of the redundant:

if (!query.Any())
 return; 

It really doesn't seem to be serving any purpose - even without it, the body of the foreach won't execute if the query yields no results. So with the Any() check in, you save nothing in the fast path, and enumerate twice in the slow path.

On the other hand, if you must know if there were any results found after the end of the loop, you might as well just use a flag:

bool itemFound = false;

foreach (var item in query)
{
    itemFound = true;
    ... // Rest of the loop body goes here.
}

if(itemFound)
{
   // ...
}

Or you could use the enumerator directly if you're really concerned about the redundant flag-setting in the loop body:

using(var erator = query.GetEnumerator())
{
    bool itemFound = erator.MoveNext();

    if(itemFound)
    {
       do
       {
           // Do something with erator.Current;
       } while(erator.MoveNext())
    }

   // Do something with itemFound
}
like image 183
Ani Avatar answered Sep 21 '22 03:09

Ani


There is not much information that can be extracted from an enumerable, so maybe it's better to turn the query into an IQueryable? This Any extension method walks down its expression tree skipping all irrelevant operations, then it turns the important branch into a delegate that can be called to obtain an optimized IQueryable. Standard Any method applied to it explicitly to avoid recursion. Not sure about corner cases, and maybe it makes sense to cache compiled queries, but with simple queries like yours it seems to work.

static class QueryableHelper {
    public static bool Any<T>(this IQueryable<T> source) {
        var e = source.Expression;
        while (e is MethodCallExpression) {
            var mce = e as MethodCallExpression;
            switch (mce.Method.Name) {
                case "Select":
                case "OrderBy":
                case "ThenBy": break;
                default: goto dun;
            }
            e = mce.Arguments.First();
        }
        dun:
        var d = Expression.Lambda<Func<IQueryable<T>>>(e).Compile();
        return Queryable.Any(d());
    }
}

Queries themselves must be modified like this:

var query = expensiveSrc.AsQueryable()
                        .Where(x=> x.HasFoo)
                        .OrderBy(y => y.Bar.Count())
                        .Select(z => z.FrobberName); 
like image 26
user1096188 Avatar answered Sep 24 '22 03:09

user1096188