
LINQ aggregate and group by periods of time

I'm trying to understand how LINQ can be used to group data by intervals of time and then, ideally, aggregate each group.

I've found numerous examples that use explicit date ranges, but I'm trying to group by periods such as 5 minutes, 1 hour, or 1 day.

For example, I have a class that wraps a DateTime with a value:

public class Sample
{
     public DateTime timestamp;
     public double value;
}

These observations are contained as a series in a List collection:

List<Sample> series;

So, to group by hourly periods of time and aggregate value by average, I'm trying to do something like:

var grouped = from s in series
              group s by new TimeSpan(1, 0, 0) into g
              select new { timestamp = g.Key, value = g.Average(s => s.value) };

This is fundamentally flawed, as it groups by the TimeSpan itself, so every sample gets the same key. I can't understand how to use the TimeSpan (or any data type representing an interval) in the query.

Jason Sturges asked Jan 13 '12 19:01


5 Answers

You could round the timestamp down to the previous boundary (i.e. to the closest 5-minute boundary in the past) and use that as your grouping key:

var groups = series.GroupBy(x =>
{
    var stamp = x.timestamp;
    // Step back to the previous 5-minute boundary.
    stamp = stamp.AddMinutes(-(stamp.Minute % 5));
    // Drop the seconds, milliseconds and any sub-millisecond ticks.
    stamp = stamp.AddTicks(-(stamp.Ticks % TimeSpan.TicksPerMinute));
    return stamp;
})
.Select(g => new { TimeStamp = g.Key, Value = g.Average(s => s.value) })
.ToList();

The code above achieves that by grouping on a modified timestamp, which sets the minutes to the previous 5-minute boundary and removes the seconds and milliseconds. The same approach can of course be used for other time periods, e.g. hours and days.
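For other period lengths, the same idea can be sketched with tick arithmetic instead of per-field adjustments. This is only an illustration, not part of the original answer: the `TimeBuckets` class and its `Floor` and `AverageBy` methods are hypothetical names, and buckets are assumed to be counted from `DateTime.MinValue`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Same shape as the Sample class in the question.
public class Sample
{
    public DateTime timestamp;
    public double value;
}

// Hypothetical helper class, named here for illustration only.
public static class TimeBuckets
{
    // Rounds dt down to a multiple of interval, counted in ticks
    // from DateTime.MinValue. Sub-millisecond ticks are dropped as well.
    public static DateTime Floor(DateTime dt, TimeSpan interval)
    {
        if (interval <= TimeSpan.Zero)
            throw new ArgumentOutOfRangeException(nameof(interval), "Interval must be positive.");
        return new DateTime(dt.Ticks - dt.Ticks % interval.Ticks, dt.Kind);
    }

    // Groups samples by bucket start and averages the values in each bucket.
    public static List<(DateTime Start, double Average)> AverageBy(
        IEnumerable<Sample> series, TimeSpan interval)
    {
        return series
            .GroupBy(s => Floor(s.timestamp, interval))
            .OrderBy(g => g.Key)
            .Select(g => (Start: g.Key, Average: g.Average(s => s.value)))
            .ToList();
    }
}
```

With this sketch, `TimeBuckets.AverageBy(series, TimeSpan.FromMinutes(5))`, `TimeSpan.FromHours(1)` or `TimeSpan.FromDays(1)` would cover the three periods mentioned in the question.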

Edit:

Based on this made-up sample input:

var series = new List<Sample>();
series.Add(new Sample() { timestamp = DateTime.Now.AddMinutes(3) });
series.Add(new Sample() { timestamp = DateTime.Now.AddMinutes(4) });
series.Add(new Sample() { timestamp = DateTime.Now.AddMinutes(5) });
series.Add(new Sample() { timestamp = DateTime.Now.AddMinutes(6) });
series.Add(new Sample() { timestamp = DateTime.Now.AddMinutes(7) });
series.Add(new Sample() { timestamp = DateTime.Now.AddMinutes(15) });

3 groups were produced for me, one with grouping timestamp 3:05, one with 3:10 and one with 3:20 pm (your results may vary based on current time).
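To make that result reproducible regardless of the current time, here is a sketch that swaps DateTime.Now for a fixed base time. The base time of 3:05 pm and the `FixedInputDemo` class name are assumptions chosen for illustration; the rounding follows the 5-minute approach described in this answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Same shape as the Sample class in the question.
public class Sample
{
    public DateTime timestamp;
    public double value;
}

// Hypothetical demo class, named here for illustration only.
public static class FixedInputDemo
{
    // Returns the 5-minute group keys for six samples offset from a fixed base time.
    public static List<DateTime> GroupKeys()
    {
        // Assumed base time; it only stands in for DateTime.Now.
        var start = new DateTime(2012, 1, 13, 15, 5, 0);

        // Same minute offsets as in the sample input above.
        var series = new[] { 3, 4, 5, 6, 7, 15 }
            .Select(m => new Sample { timestamp = start.AddMinutes(m) })
            .ToList();

        return series
            .GroupBy(x =>
            {
                var stamp = x.timestamp;
                stamp = stamp.AddMinutes(-(stamp.Minute % 5));
                stamp = stamp.AddMilliseconds(-stamp.Millisecond - 1000 * stamp.Second);
                return stamp;
            })
            .Select(g => g.Key)
            .ToList();
    }
}
```

With that base time the six samples fall at 3:08, 3:09, 3:10, 3:11, 3:12 and 3:20 pm, so the keys are 3:05, 3:10 and 3:20 pm.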

BrokenGlass answered Oct 19 '22 08:10


I'm very late to the game on this one, but I came across this while searching for something else, and I thought I had a better way.

series.GroupBy (s => s.timestamp.Ticks / TimeSpan.FromHours(1).Ticks)
        .Select (s => new {
            series = s
            ,timestamp = s.First ().timestamp
            ,average = s.Average (x => x.value )
        }).Dump();

Here is a sample LINQPad program so you can validate and test:

void Main()
{
    List<Sample> series = new List<Sample>();

    Random random = new Random(DateTime.Now.Millisecond);
    for (DateTime i = DateTime.Now.AddDays(-5); i < DateTime.Now; i += TimeSpan.FromMinutes(1))
    {
        series.Add(new UserQuery.Sample(){ timestamp = i, value = random.NextDouble() * 100 });
    }
    //series.Dump();
    series.GroupBy (s => s.timestamp.Ticks / TimeSpan.FromHours(1).Ticks)
        .Select (s => new {
            series = s
            ,timestamp = s.First ().timestamp
            ,average = s.Average (x => x.value )
        }).Dump();
}

// Define other methods and classes here
public class Sample
{
     public DateTime timestamp;
     public double value;
}
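
Note that the grouping key above is a long (the bucket index), and `s.First().timestamp` is the first sample in the bucket rather than the start of the hour. If you want the start of the hour instead, the key can be multiplied back into ticks. A minimal sketch, assuming plain LINQ to Objects; the `HourlyAverages` class name is made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Same shape as the Sample class in the question.
public class Sample
{
    public DateTime timestamp;
    public double value;
}

// Hypothetical helper class, named here for illustration only.
public static class HourlyAverages
{
    public static List<(DateTime Start, double Average)> Compute(IEnumerable<Sample> series)
    {
        long ticksPerBucket = TimeSpan.FromHours(1).Ticks;

        return series
            // Integer division yields the index of the hour bucket.
            .GroupBy(s => s.timestamp.Ticks / ticksPerBucket)
            // Multiplying the index back gives the first tick of that hour.
            .Select(g => (Start: new DateTime(g.Key * ticksPerBucket),
                          Average: g.Average(x => x.value)))
            .ToList();
    }
}
```

Swapping `TimeSpan.FromHours(1)` for another interval changes the bucket size without touching the rest of the query.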

Duane McKinney answered Oct 19 '22 09:10


To group by hour, you need to group by the timestamp truncated to the hour, which can be done like this:

var groups = from s in series
             let groupKey = new DateTime(s.timestamp.Year, s.timestamp.Month, s.timestamp.Day, s.timestamp.Hour, 0, 0)
             group s by groupKey into g
             select new
             {
                 TimeStamp = g.Key,
                 Value = g.Average(a => a.value)
             };
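
The same query shape should work for daily periods by using the Date property as the key, since it keeps the date and sets the time to midnight. A sketch under that assumption, for LINQ to Objects; the `DailyAverages` class name is hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Same shape as the Sample class in the question.
public class Sample
{
    public DateTime timestamp;
    public double value;
}

// Hypothetical helper class, named here for illustration only.
public static class DailyAverages
{
    public static List<(DateTime Day, double Average)> Compute(IEnumerable<Sample> series)
    {
        var query = from s in series
                    // DateTime.Date drops the time-of-day part.
                    group s by s.timestamp.Date into g
                    select (Day: g.Key, Average: g.Average(a => a.value));

        return query.ToList();
    }
}
```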

Michael answered Oct 19 '22 07:10


I'd suggest using new DateTime() to avoid any issues with sub-millisecond differences:

var versionsGroupedByRoundedTimeAndAuthor = db.Versions.GroupBy(g =>
    new
    {
        UserID = g.Author.ID,
        Time = RoundUp(g.Timestamp, TimeSpan.FromMinutes(2))
    });

With

private DateTime RoundUp(DateTime dt, TimeSpan d)
{
    return new DateTime(((dt.Ticks + d.Ticks - 1) / d.Ticks) * d.Ticks);
}

N.B. Here I am grouping by Author.ID as well as the rounded timestamp.

The RoundUp function is taken from @dtb's answer here: https://stackoverflow.com/a/7029464/661584

To read about how equality down to the millisecond doesn't always mean equality, see "Why does this unit test fail when testing DateTime equality?"
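
To illustrate the sub-millisecond point, here is a small sketch. The `TickPrecisionDemo` class and its `SameToMillisecondButNotEqual` method are hypothetical names; RoundUp is copied from above and made static so the snippet stands alone.

```csharp
using System;

// Hypothetical demo class, named here for illustration only.
public static class TickPrecisionDemo
{
    // Same rounding as RoundUp above: next multiple of d, in ticks.
    public static DateTime RoundUp(DateTime dt, TimeSpan d)
    {
        return new DateTime(((dt.Ticks + d.Ticks - 1) / d.Ticks) * d.Ticks);
    }

    // Two values that agree down to the millisecond yet compare as different.
    public static bool SameToMillisecondButNotEqual()
    {
        var a = new DateTime(2012, 1, 13, 19, 1, 0, 123);
        var b = a.AddTicks(1); // one tick (100 ns) later

        bool sameVisibleParts = a.Second == b.Second && a.Millisecond == b.Millisecond;
        return sameVisibleParts && a != b;
    }
}
```

Used directly as grouping keys, two such values would land in different groups, whereas after RoundUp with a 2-minute interval both become 19:02:00 and group together.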

MemeDeveloper answered Oct 19 '22 07:10


I improved on BrokenGlass's answer by making it more generic and adding safeguards. With his current answer, if you choose an interval of 9 minutes, it will not do what you'd expect. The same goes for any number that does not divide 60 evenly. For this example, I'm using 9 and starting at midnight (0:00).

  • Everything from 0:00 to 0:08.999 will be put into a group of 0:00 as you'd expect. It will keep doing this until you get to the grouping that starts at 0:54.
  • At 0:54, it will only group things from 0:54 to 0:59.999 instead of going up to 01:03.999.

For me, this is a massive issue.

I'm not sure how to fix that, but you can add safeguards.

Changes:

  1. Any minute interval for which 60 % [interval] equals 0 is acceptable. The if statements below enforce this.
  2. Hour intervals work as well.

double minIntervalAsDouble = Convert.ToDouble(minInterval);
if (minIntervalAsDouble <= 0)
{
    string message = "minInterval must be a positive number, exiting";
    Log.getInstance().Info(message);
    throw new Exception(message);
}
else if (minIntervalAsDouble < 60.0 && 60.0 % minIntervalAsDouble != 0)
{
    string message = "60 must be divisible by minInterval...exiting";
    Log.getInstance().Info(message);
    throw new Exception(message);
}
else if (minIntervalAsDouble >= 60.0 && (24.0 % (minIntervalAsDouble / 60.0)) != 0 && (24.0 % (minIntervalAsDouble / 60.0) != 24.0))
{
    //hour part must be divisible...
    string message = "If minInterval is greater than 60, 24 must be divisible by minInterval/60 (hour value)...exiting";
    Log.getInstance().Info(message);
    throw new Exception(message);
}
var groups = datas.GroupBy(x =>
{
    if (minInterval < 60)
    {
        var stamp = x.Created;
        stamp = stamp.AddMinutes(-(stamp.Minute % minInterval));
        stamp = stamp.AddMilliseconds(-stamp.Millisecond);
        stamp = stamp.AddSeconds(-stamp.Second);
        return stamp;
    }
    else
    {
        var stamp = x.Created;
        int hourValue = minInterval / 60;
        stamp = stamp.AddHours(-(stamp.Hour % hourValue));
        stamp = stamp.AddMilliseconds(-stamp.Millisecond);
        stamp = stamp.AddSeconds(-stamp.Second);
        stamp = stamp.AddMinutes(-stamp.Minute);
        return stamp;
    }
}).Select(o => new
{
    o.Key,
    min = o.Min(f=>f.Created),
    max = o.Max(f=>f.Created),
    o
}).ToList();
    

Put whatever you'd like in the select statement! I put in min/max because that made it easier to test.
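
As for the 9-minute problem described above, one possible way around it (a sketch only, not something I have tested against the code in this answer) is to floor the whole tick count rather than the minute field, so that every bucket has the same length. The `UniformBuckets` class and `BucketStart` method are hypothetical names, and buckets are assumed to be counted from DateTime.MinValue.

```csharp
using System;

// Hypothetical helper class, named here for illustration only.
public static class UniformBuckets
{
    // Returns the start of the minInterval-minute bucket containing dt.
    // Buckets are counted from DateTime.MinValue, so they do not reset every hour.
    public static DateTime BucketStart(DateTime dt, int minInterval)
    {
        if (minInterval <= 0)
            throw new ArgumentOutOfRangeException(nameof(minInterval), "minInterval must be positive.");

        long bucketTicks = TimeSpan.FromMinutes(minInterval).Ticks;
        // Integer division discards the remainder within the bucket.
        return new DateTime(dt.Ticks / bucketTicks * bucketTicks, dt.Kind);
    }
}
```

With an interval of 9, a timestamp of 1:02 would then fall into the bucket that starts at 0:54, and the next bucket would start at 1:03. For intervals that do not divide a day evenly (7 minutes, say), the buckets would still all be the same length but would no longer line up with midnight.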

Migit answered Oct 19 '22 08:10