Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the most efficient way to obtain an ordered range using EntityFramework?

Predicament

I have a large dataset that I need to perform a complex calculation upon a part of. For my calculation, I need to take a chunk of ordered data from a large set based on input parameters.

My method signature looks like this:

double Process(Entity e, DateTimeOffset? start, DateTimeOffset? end)

Potential Solutions

The two following methods spring to mind:

Method 1 - WHERE Clause

double result = 0d;
IEnumerable<Quote> items = from item in e.Items
                           where (!start.HasValue || item.Date >= start.Value)
                              && (!end.HasValue || item.Date <= end.Value)
                           orderby item.Date ascending
                           select item;
...
return result;

Method 2 - Skip & Take

double result = 0d;
IEnumerable<Item> items = e.Items.OrderBy(i => i.Date);
if (start.HasValue)
    items = items.SkipWhile(i => i.Date < start.Value);
if (end.HasValue)
    items = items.TakeWhile(i => i.Date <= end.Value);
...
return result;

Question

If I were just throwing this together, I'd probably just go with Method 1, but the size of my dataset and the the size of the set of datasets are both too large to ignore slight efficiency losses, and it is of vital importance that the resulting enumerable is ordered.

Which approach will generate the more efficient query? And is there a more efficient approach that I am yet to consider?

Any solutions presented can make the safe assumption that the table is well indexed.

like image 645
mirichan Avatar asked Apr 20 '15 09:04

mirichan


People also ask

Which of the given code snippet will you use to count related entities without loading them?

Single(blog=> blog.Id = yourCriteriaId). Posts. Count();

What is the difference between Entity Framework and LINQ?

Entity Framework is an object-relational mapping (ORM) framework for connecting C# code to external databases, usually SQL Server. LINQ is a query language embedded into C# and a set of extension methods in order to make it useful.


1 Answers

SkipWhile is not supported for translation to SQL. You need to throw that option away.

The best way to go about this is to create an index on the field you use to range select and then issue a query that is SARGable. where date >= start && date < end is SARGable and can make use of an index.

!start.HasValue || is not a good idea because that destroys SARGability. Build the query so that this is not needed. For example:

if(start != null) query = query.Where(...);

Make the index covering and you have optimal performance. There is no single extra row that needs to be processed.

like image 155
usr Avatar answered Nov 20 '22 01:11

usr