Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Efficient LINQ to Entities query

I have an entity collection of Readings. Each Reading is linked to an entity called Meter. (And each Meter holds multiple readings). each Reading holds a field for meter id (int) and a field for time.

Here is some simplified code to demonstrate it:

public class Reading
{
    int Id;
    int meterId;
    DateTime time;
}

public class Meter
{
    int id;
    ICollection<Readings> readings;    
}

Given a specific period and list of meterids, what would be the most efficient way to get for each Meter the first and last reading in that time period?

I am able to iterate through all meters and for each meter to obatin first and last reading for the period, but I was wandering if there is a more efficient way to acheive this.

And a bonus question: same question, but with multiple periods of time to get data for, instead of just one period.

like image 976
omer schleifer Avatar asked May 27 '13 06:05

omer schleifer


People also ask

Are LINQ queries faster?

Most of the times, LINQ will be a bit slower because it introduces overhead. Do not use LINQ if you care much about performance. Use LINQ because you want shorter better readable and maintainable code. So your experience is that LINQ is faster and makes code harder to read and to maintain?

Which entities can LINQ use to perform queries?

LINQ to Entities queries are comprised of LINQ standard query operators (such as Select, Where, and GroupBy) and expressions (x > 10, Contact. LastName, and so on). LINQ operators are not defined by a class, but rather are methods on a class.


2 Answers

I am not exactly sure how you want this data, but you could project it into an anonymous type:

var metersFirstAndLastReading = meters.Select(m => new 
    {
        Meter = m,
        FirstReading = m.readings.OrderBy(r => r.time).First(),
        LastReading = m.readings.OrderBy(r => r.time).Last()
    });

You can then read your result list like this (this example is just meant as an illustration):

foreach(var currentReading in metersFirstAndLastReading)
{
    string printReadings = String.Format("Meter id {0}, First = {1}, Last = {2}", 
                               currentReading.Meter.id.ToString(),
                               currentReading.FirstReading.time.ToString(),
                               currentReading.LastReading.time.ToString());

    // Do something...
}

Another option would be to create properties in Meter which dynamically return the first and last readings:

public class Meter
{
    public int id;
    public List<Reading> readings;

    public Reading FirstReading 
    {
        get
        {
            return readings.OrderBy(r => r.time).First();
        }
    }

    public Reading LastReading
    {
        get
        {
            return readings.OrderBy(r => r.time).Last();
        }
    }
}

EDIT: I misunderstood the question a little.

Here is the implementation to determine the first and last readings for a meter including a date range (assuming meterIdList is an ICollection<int> of IDs and begin and end is the specified date range)

var metersFirstAndLastReading = meters
    .Where(m => meterIdList.Contains(m.id))
    .Select(m => new 
    {
        Meter = m,
        FirstReading = m.readings
                        .Where(r => r.time >= begin && r.time <= end)
                        .OrderBy(r => r.time)
                        .FirstOrDefault(),
        LastReading = m.readings
                        .Where(r => r.time >= begin && r.time <= end)
                        .OrderByDescending(r => r.time)
                        .FirstOrDefault()
    });

You won't be able to use properties now (as you need to supply parameters) so methods will work just fine as an alternative:

public class Meter
{
    public int id;
    public List<Reading> readings;

    public Reading GetFirstReading(DateTime begin, DateTime end)
    {
        var filteredReadings = readings.Where(r => r.time >= begin && r.time <= end);

        if(!HasReadings(begin, end))
        {
            throw new ArgumentOutOfRangeException("No readings available during this period");
        }

        return filteredReadings.OrderBy(r => r.time).First();
    }

    public Reading GetLastReading(DateTime begin, DateTime end)
    {
        var filteredReadings = readings.Where(r => r.time >= begin && r.time <= end);

        if(!HasReadings(begin, end))
        {
            throw new ArgumentOutOfRangeException("No readings available during this period");
        }

        return filteredReadings.OrderBy(r => r.time).Last();
    }

    public bool HasReadings(DateTime begin, DateTime end)
    {
        return readings.Any(r => r.time >= begin && r.time <= end);
    }
}
like image 104
Dave New Avatar answered Oct 27 '22 02:10

Dave New


I have a very similar data model where this code is used to get the oldest readings, i just changed it to also include the newest.

I use query syntax to do something like this:

var query = from reading in db.Readings
            group reading by reading.meterId
            into readingsPerMeter
            let oldestReadingPerMeter = readingsPerMeter.Min(g => g.time)
            let newestReadingPerMeter = readingsPerMeter.Max(g => g.time)
            from reading in readingsPerMeter
            where reading.time == oldestReadingPerMeter || reading.time == newestReadingPerMeter 
            select reading; //returns IQueryable<Reading> 

That would result in a only the newest and oldest reading for each meter.

The reason i think this is efficient is because its one lookup to the DB to get all the readings for each meter, instead of several lookups for each meter. We have ~40000 meters with ~30mil readings. i just tested the lookup on our data it took about 10s

The sql preformed is a crossjoin between two sub selects for each of the min and max dates.

UPDATE:

Since this is queryable you should be able to supply a period after, like this:

query.Where(r=>r.time > someTime1 && r.time < someTime2)

Or put it into the original query, i just like it seperated like this. The query isnt executed yet since we havent performed an action that fetches the data yet.

like image 1
Jim Wolff Avatar answered Oct 27 '22 01:10

Jim Wolff