Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Performance of FirstOrDefault() [duplicate]

Tags:

c#

.net

linq

Looking at a section of the webapp I work on today with a performance profiler. I thought that a Union was causing some delays, but found other surprising results instead.

One of the causes of slowdown appeared to be FirstOrDefault.

It was a very simple LINQ query that looked like this:

foreach(Report r in reports)
    IDTOStudy study = studies.FirstOrDefault(s => s.StudyID == r.StudyID);

I created a small method to duplicate the behaviour I figured FirstOrDefault was doing.

private IDTOStudy GetMatchingStudy(Report report, IList<IDTOStudy> studies)
{
    foreach (var study in studies)
    if (study.StudyID == report.StudyID)
        return study;

    return null;
}

This method replaced the FirstOrDefault to look like this:

foreach(Report r in reports)
    IDTOStudy study = GetMatchingStudy(r, studies);

Looking at the new code running with the performance profiler showed FirstOrDefault to take twice as long to complete as my new method. This was a shock to see.

I must be doing something incorrect with the FirstOrDefault() query. What is it?

Does FirstOrDefault() complete the entire query and then take the first element?

How can I speed this up and use FirstOrDefault()?

Edit 1:

One additional point I noticed is that the profiler says I'm maxing out my CPU on both of these implementations. That's also something I don't care for and didn't expect. The additional method I added didn't reduce that spike, just cut its duration in half.

Edit 3:

Putting the studies into a dictionary has improved the run time tremendously. It's definitely going to be how the committed code looks. Doesn't answer the question on FirstOrDefault though.

Edit 2:

Here's the sample code requested in a simple console app. My run still shows that in most cases FirstOrDefault takes longer.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Reactive.Linq;
using System.Reactive.Concurrency;
using System.Diagnostics;

namespace TestCode
{
    public class Program
    {
        public List<IntHolder> list;

        public static void Main(string[] args)
        {
            var prog = new Program();
            prog.list = new List<IntHolder>();

            prog.Add50000Items();
            prog.list.Add(new IntHolder() { Num = 12345 });
            prog.Add50000Items();

            var stopwatch = new Stopwatch();
            stopwatch.Start();
            prog.list.FirstOrDefault(n => n.Num == 12345);
            stopwatch.Stop();

            Console.WriteLine("First run took: " + stopwatch.ElapsedTicks);
            var lookingFor = new IntHolder() { Num = 12345 };

            stopwatch.Reset();
            stopwatch.Start();
            prog.GetMatching(lookingFor);
            stopwatch.Stop();
            Console.WriteLine("Second run took: " + stopwatch.ElapsedTicks);
            Console.ReadLine();
        }

        public void Add50000Items()
        {
            var rand = new Random();

            for (int i = 0; i < 50000; i++)
                list.Add(new IntHolder() { Num = rand.Next(100000) });
        }

        public IntHolder GetMatching(IntHolder num)
        {
            foreach (var number in list)
                if (number.Num == num.Num)
                    return number;

            return null;
        }
    }

    public class IntHolder
    {
        public int Num { get; set; }
    }
}
like image 764
Chris Avatar asked Apr 18 '13 20:04

Chris


People also ask

Is FirstOrDefault slow?

The query is executed when you read from the expression. In this case the FirstOrDefault will execute the whole expression tree, so it's not the FirstOrDefault thats slow.

Is FirstOrDefault faster than first?

Performance. First() and FirstOrDefault() are faster than Single() and SingleOrDefault() because First find the first() match only whereas Single() search element in a whole collection of elements.

Which is faster SingleOrDefault or FirstOrDefault?

FirstOrDefault usually perform faster as compared SingleOrDefault, since these iterate the collection until they find the first match. While SingleOrDefault iterate the whole collection to find one single match.

What does FirstOrDefault return if empty?

FirstOrDefault<TSource>(IEnumerable<TSource>) Returns the first element of a sequence, or a default value if the sequence contains no elements.


1 Answers

What i think is happening (although it would be good to get a little extra info on your specific scenario, my assumption is that this is a DB scenario based on your DTO classes) is the following:

foreach(Report r in reports)
    IDTOStudy study = studies.FirstOrDefault(s => s.StudyID == r.StudyID); //Database query happens here for each report


//The whole studies table is loaded into memory which means you only do one DB query and the actual firstordefault stuff is done in memory which is quicker than going over the network
private IDTOStudy GetMatchingStudy(Report report, IList<IDTOStudy> studies)
{
    foreach (var study in studies)
    if (study.StudyID == report.StudyID)
        return study;

    return null;
}

What this means is that in the second example you have optimized for database roundtrips (which is a good idea).

You can prove this theory by checking out the database queries happening behind the scenes with something like SQL Profiler

like image 164
Not loved Avatar answered Oct 15 '22 14:10

Not loved