
What is the preferred (performant and readable) way of chaining IEnumerable<T> extension methods?

If I'm trying to filter results at multiple levels of an IEnumerable<T> object graph, is there a preferred way of chaining extension methods to do this?

I'm open to any extension method and lambda usage, but I'd prefer not to use LINQ syntax to remain consistent with the rest of the codebase.

Is it better to push the filtering to the selector of the SelectMany() method or just to chain another Where() method? Or is there a better solution?

How would I go about identifying the best option? In this test case, everything is directly available in memory. Obviously both samples below are currently producing the same correct results; I'm just looking for a reason one or the other (or another option) would be preferred.

public class Test
{
    // I want the first chapter of a book that's exactly 42 pages, written by
    // an author whose name is Adams, from a library in London.
    public Chapter TestingIEnumerableTExtensionMethods()
    {
        List<Library> libraries = GetLibraries();

        Chapter chapter = libraries
            .Where(lib => lib.City == "London")
            .SelectMany(lib => lib.Books)
            .Where(b => b.Author == "Adams")
            .SelectMany(b => b.Chapters)
            .First(c => c.NumberOfPages == 42);

        Chapter chapter2 = libraries
            .Where(lib => lib.City == "London")
            .SelectMany(lib => lib.Books.Where(b => b.Author == "Adams"))
            .SelectMany(b => b.Chapters.Where(c => c.NumberOfPages == 42))
            .First();
        // Both queries yield the same chapter; return either one.
        return chapter;
    }
}

And here's the sample object graph:

public class Library
{
    public string Name { get; set; }
    public string City { get; set; }
    public List<Book> Books { get; set; }
}

public class Book
{
    public string Name { get; set; }
    public string Author { get; set; }
    public List<Chapter> Chapters { get; set; }
}

public class Chapter
{
    public string Name { get; set; }
    public int NumberOfPages { get; set; }
}
BQ. asked Apr 09 '12 19:04


3 Answers

Which is best likely depends on the LINQ provider you're using: LINQ to SQL will behave differently from in-memory filtering (LINQ to Objects). The order of the clauses can also affect performance, depending on the data: filtering out more records earlier in the sequence means less work for the later methods, at least in a naive implementation.

For your two examples, I would guess that the performance difference is negligible, and I would favor the first since it allows each clause to be modified independently of the others.

As for determining the best option, it's the same as anything else: measure.

Telastyn answered Oct 15 '22 20:10


I'm guessing the first expression you have will be slightly but insignificantly faster. To really determine if one or the other is faster, you will need to time them, with a profiler or Stopwatch.
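As a rough sketch of the Stopwatch approach, something like the following could be used. The `ChainingTiming` class, the `BuildLibraries` sample data, and the iteration counts are all made up for illustration; only the two queries and the class shapes come from the question. Treat the numbers it prints as indicative at best, since a dedicated benchmarking tool would give more reliable results.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public class Chapter { public string Name { get; set; } public int NumberOfPages { get; set; } }
public class Book { public string Name { get; set; } public string Author { get; set; } public List<Chapter> Chapters { get; set; } }
public class Library { public string Name { get; set; } public string City { get; set; } public List<Book> Books { get; set; } }

public static class ChainingTiming
{
    // Hypothetical sample data: one London library whose only 42-page
    // chapter sits in the last book, so both queries scan every book.
    public static List<Library> BuildLibraries(int bookCount)
    {
        List<Book> books = Enumerable.Range(0, bookCount)
            .Select(i => new Book
            {
                Name = "Book " + i,
                Author = i % 2 == 0 ? "Adams" : "Other",
                Chapters = new List<Chapter> { new Chapter { Name = "Filler", NumberOfPages = 10 } }
            })
            .ToList();
        books.Add(new Book
        {
            Name = "Target",
            Author = "Adams",
            Chapters = new List<Chapter> { new Chapter { Name = "Match", NumberOfPages = 42 } }
        });
        return new List<Library> { new Library { Name = "Sample", City = "London", Books = books } };
    }

    // First form from the question: flat chain of Where/SelectMany.
    public static Chapter Chained(List<Library> libraries)
    {
        return libraries
            .Where(lib => lib.City == "London")
            .SelectMany(lib => lib.Books)
            .Where(b => b.Author == "Adams")
            .SelectMany(b => b.Chapters)
            .First(c => c.NumberOfPages == 42);
    }

    // Second form from the question: filters pushed into the selectors.
    public static Chapter Nested(List<Library> libraries)
    {
        return libraries
            .Where(lib => lib.City == "London")
            .SelectMany(lib => lib.Books.Where(b => b.Author == "Adams"))
            .SelectMany(b => b.Chapters.Where(c => c.NumberOfPages == 42))
            .First();
    }

    // Runs the query once as a warm-up (so JIT compilation is not timed),
    // then times the requested number of iterations.
    public static TimeSpan Time(Func<Chapter> query, int iterations)
    {
        query();
        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            query();
        }
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    public static void Main()
    {
        List<Library> libraries = BuildLibraries(10000);
        Console.WriteLine("Chained: " + Time(() => Chained(libraries), 1000).TotalMilliseconds + " ms");
        Console.WriteLine("Nested:  " + Time(() => Nested(libraries), 1000).TotalMilliseconds + " ms");
    }
}
```

Running each form many times and comparing totals smooths out some noise, but results will still vary with the shape of the data and the machine.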

The readability doesn't seem to be strongly affected either way. I prefer the first approach, as it has fewer levels of nesting. It all depends on your personal preference.

Kendall Frey answered Oct 15 '22 21:10


It depends on how the underlying LINQ provider works. For LINQ to Objects, both versions would require about the same amount of work in this case. But this is the most straightforward example, so beyond that it's hard to say.
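One way to check the "same amount of work" claim for LINQ to Objects is to count how often each predicate runs. The sketch below does that for the two queries from the question; the `PredicateCounting` class, the `Count` helper, and the sample data are hypothetical and exist only for this illustration. Because both forms are evaluated lazily and stop at the first match, the two counts should come out equal for this data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Chapter { public string Name { get; set; } public int NumberOfPages { get; set; } }
public class Book { public string Name { get; set; } public string Author { get; set; } public List<Chapter> Chapters { get; set; } }
public class Library { public string Name { get; set; } public string City { get; set; } public List<Book> Books { get; set; } }

public static class PredicateCounting
{
    private static int calls;

    // Hypothetical helper: records one predicate evaluation and passes the result through.
    private static bool Count(bool result)
    {
        calls++;
        return result;
    }

    // Hypothetical sample data, small enough to trace by hand.
    public static List<Library> SampleLibraries()
    {
        return new List<Library>
        {
            new Library { Name = "A", City = "Paris", Books = new List<Book>
            {
                new Book { Name = "P1", Author = "Adams", Chapters = new List<Chapter>() }
            } },
            new Library { Name = "B", City = "London", Books = new List<Book>
            {
                new Book { Name = "L1", Author = "Other", Chapters = new List<Chapter>
                {
                    new Chapter { Name = "X", NumberOfPages = 42 }
                } },
                new Book { Name = "L2", Author = "Adams", Chapters = new List<Chapter>
                {
                    new Chapter { Name = "Y", NumberOfPages = 10 },
                    new Chapter { Name = "Z", NumberOfPages = 42 },
                    new Chapter { Name = "W", NumberOfPages = 99 }
                } }
            } }
        };
    }

    // Predicate evaluations for the flat Where/SelectMany chain.
    public static int CountChained(List<Library> libraries)
    {
        calls = 0;
        libraries
            .Where(lib => Count(lib.City == "London"))
            .SelectMany(lib => lib.Books)
            .Where(b => Count(b.Author == "Adams"))
            .SelectMany(b => b.Chapters)
            .First(c => Count(c.NumberOfPages == 42));
        return calls;
    }

    // Predicate evaluations when the filters live inside the selectors.
    public static int CountNested(List<Library> libraries)
    {
        calls = 0;
        libraries
            .Where(lib => Count(lib.City == "London"))
            .SelectMany(lib => lib.Books.Where(b => Count(b.Author == "Adams")))
            .SelectMany(b => b.Chapters.Where(c => Count(c.NumberOfPages == 42)))
            .First();
        return calls;
    }

    public static void Main()
    {
        List<Library> libraries = SampleLibraries();
        Console.WriteLine("Chained predicate calls: " + CountChained(libraries));
        Console.WriteLine("Nested predicate calls:  " + CountNested(libraries));
    }
}
```

Equal counts only show that the predicates run the same number of times; the nested form still allocates an extra iterator per parent element, which is the kind of small difference only measurement would reveal.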

Rex M answered Oct 15 '22 22:10