
SQL group by frequency within a date range

I have a requirement to write a stored procedure that accepts a start date, end date and a frequency (day, week, month, quarter, year) and outputs a result set based on those parameters. Obviously, the simple part is the query by date range, but how do you group by frequency?

So if I have a set of raw data like this:

Date            Count
---------------------
11/15/2011          6
12/16/2011          9
12/17/2011          2
12/18/2011          1
12/18/2011          4

And I call my stored proc like this:

sp_Report '1/1/2011', '12/31/2011', 'week'

I would expect results like this:

WeekOf          Count
---------------------
11/19/2011          6
12/17/2011         11
12/24/2011          5

There are a couple of questions here:

1) How do I determine the date for the end of the week (week ending on Sunday)?

2) How do I group by that WeekOf date range?

Scott asked Dec 27 '11 00:12




1 Answer

The following script presents the output in a unified way: for every frequency it shows each period's start and end dates together with the total count for that period.

That choice also determines how the grouping values are calculated. There are three distinct patterns: one for the 'day' frequency, another for 'week', and a third shared by all the other frequency types.

The first is the simplest: both PeriodStart and PeriodEnd are just the Date value itself.

For weeks, I'm using a fairly well-known trick: the first day of the week is derived from the given date by subtracting a value that is one less than its weekday number. The end of the week is found similarly, by adding 6 to the same expression, which is why the code uses 7 - DATEPART(WEEKDAY, Date).
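The week arithmetic can be checked on a single value. This is only a sketch: @d is a hypothetical sample variable, and the SET DATEFIRST 7 line is an assumption that makes Sunday day 1 (the us_english default), because DATEPART(WEEKDAY, ...) depends on that session setting.

```sql
-- Sketch only: @d is a hypothetical sample value, not a column from the query.
SET DATEFIRST 7;  -- assumption: Sunday is weekday 1

DECLARE @d datetime = '20111216';  -- a Friday

SELECT
  WeekdayNumber = DATEPART(WEEKDAY, @d),                        -- 6 under DATEFIRST 7
  WeekStart     = DATEADD(DAY, 1 - DATEPART(WEEKDAY, @d), @d),  -- Sunday,   2011-12-11
  WeekEnd       = DATEADD(DAY, 7 - DATEPART(WEEKDAY, @d), @d);  -- Saturday, 2011-12-17
```

With SET DATEFIRST 1 the same two expressions would instead give Monday-to-Sunday weeks, which may be closer to a "week ending on Sunday" requirement.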

Months, quarters and years are grouped as follows. The whole number of units (months, quarters or years) between the zero date and the given date is added back to the zero date, which gives the beginning of the period. The zero date, written as 0 in the code, is 1900-01-01. The end is found in much the same way, except that the number added is one greater than the difference. That produces the beginning of the next period, and subtracting one day from it gives the correct end date.
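As a rough illustration, the same expressions can be run against one sample value. The variable @d is hypothetical and stands in for the Date column; the commented results are the date parts of the returned datetime values.

```sql
-- Sketch only: @d is a hypothetical sample value standing in for the Date column.
DECLARE @d datetime = '20111115';

SELECT
  MonthsSinceZero = DATEDIFF(MONTH, 0, @d),                                            -- 1342
  MonthStart      = DATEADD(MONTH,   DATEDIFF(MONTH,   0, @d), 0),                     -- 2011-11-01
  MonthEnd        = DATEADD(DAY, -1, DATEADD(MONTH,   DATEDIFF(MONTH,   0, @d) + 1, 0)), -- 2011-11-30
  QuarterStart    = DATEADD(QUARTER, DATEDIFF(QUARTER, 0, @d), 0),                     -- 2011-10-01
  QuarterEnd      = DATEADD(DAY, -1, DATEADD(QUARTER, DATEDIFF(QUARTER, 0, @d) + 1, 0)), -- 2011-12-31
  YearStart       = DATEADD(YEAR,    DATEDIFF(YEAR,    0, @d), 0),                     -- 2011-01-01
  YearEnd         = DATEADD(DAY, -1, DATEADD(YEAR,    DATEDIFF(YEAR,    0, @d) + 1, 0)); -- 2011-12-31
```

Because DATEDIFF counts unit boundaries crossed, every date inside the same month, quarter or year produces the same start and end values, which is what makes them usable as grouping keys.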

SELECT
  PeriodStart,
  PeriodEnd,
  Count = SUM(Count)
FROM (
  SELECT
    -- First day of the period that contains Date
    PeriodStart = CASE @Frequency
      WHEN 'day'     THEN Date
      WHEN 'week'    THEN DATEADD(DAY, 1 - DATEPART(WEEKDAY, Date), Date)
      WHEN 'month'   THEN DATEADD(MONTH,   DATEDIFF(MONTH,   0, Date), 0)
      WHEN 'quarter' THEN DATEADD(QUARTER, DATEDIFF(QUARTER, 0, Date), 0)
      WHEN 'year'    THEN DATEADD(YEAR,    DATEDIFF(YEAR,    0, Date), 0)
    END,
    -- Last day of the same period (start of the next period minus one day)
    PeriodEnd   = CASE @Frequency
      WHEN 'day'     THEN Date
      WHEN 'week'    THEN DATEADD(DAY, 7 - DATEPART(WEEKDAY, Date), Date)
      WHEN 'month'   THEN DATEADD(DAY, -1, DATEADD(MONTH,   DATEDIFF(MONTH,   0, Date) + 1, 0))
      WHEN 'quarter' THEN DATEADD(DAY, -1, DATEADD(QUARTER, DATEDIFF(QUARTER, 0, Date) + 1, 0))
      WHEN 'year'    THEN DATEADD(DAY, -1, DATEADD(YEAR,    DATEDIFF(YEAR,    0, Date) + 1, 0))
    END,
    Count
  FROM atable
  WHERE Date BETWEEN @DateStart AND @DateEnd
) s
GROUP BY
  PeriodStart,
  PeriodEnd

  • EXEC spReport '1/1/2011', '12/31/2011', 'day':

    PeriodStart PeriodEnd  Count
    ----------- ---------- -----
    2011-11-15  2011-11-15 6
    2011-12-16  2011-12-16 9
    2011-12-17  2011-12-17 2
    2011-12-18  2011-12-18 5
    
  • EXEC spReport '1/1/2011', '12/31/2011', 'week':

    PeriodStart PeriodEnd  Count
    ----------- ---------- -----
    2011-11-13  2011-11-19 6
    2011-12-11  2011-12-17 11
    2011-12-18  2011-12-24 5
    
  • EXEC spReport '1/1/2011', '12/31/2011', 'month':

    PeriodStart PeriodEnd  Count
    ----------- ---------- -----
    2011-11-01  2011-11-30 6
    2011-12-01  2011-12-31 16
    
  • EXEC spReport '1/1/2011', '12/31/2011', 'quarter':

    PeriodStart PeriodEnd  Count
    ----------- ---------- -----
    2011-10-01  2011-12-31 22
    
  • EXEC spReport '1/1/2011', '12/31/2011', 'year':

    PeriodStart PeriodEnd  Count
    ----------- ---------- -----
    2011-01-01  2011-12-31 22
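The EXEC calls above presuppose that the query has been wrapped in a procedure, which the answer does not show. One possible wrapper is sketched below. The procedure name is taken from the EXEC calls; the parameter types, the dbo schema, the column types of atable, the validation step and the ORDER BY are all assumptions added for illustration.

```sql
-- Sketch only. Assumptions: dbo.atable has columns [Date] (datetime) and [Count] (int);
-- the parameter types, validation and ORDER BY are illustrative additions.
CREATE PROCEDURE dbo.spReport
  @DateStart datetime,
  @DateEnd   datetime,
  @Frequency varchar(10)
AS
BEGIN
  SET NOCOUNT ON;

  -- An unrecognised frequency would make both CASE expressions return NULL
  -- and collapse every row into one group, so reject it up front.
  IF @Frequency IS NULL
     OR @Frequency NOT IN ('day', 'week', 'month', 'quarter', 'year')
  BEGIN
    RAISERROR('Unknown frequency: %s', 16, 1, @Frequency);
    RETURN;
  END;

  SELECT
    PeriodStart,
    PeriodEnd,
    [Count] = SUM([Count])
  FROM (
    SELECT
      PeriodStart = CASE @Frequency
        WHEN 'day'     THEN [Date]
        WHEN 'week'    THEN DATEADD(DAY, 1 - DATEPART(WEEKDAY, [Date]), [Date])
        WHEN 'month'   THEN DATEADD(MONTH,   DATEDIFF(MONTH,   0, [Date]), 0)
        WHEN 'quarter' THEN DATEADD(QUARTER, DATEDIFF(QUARTER, 0, [Date]), 0)
        WHEN 'year'    THEN DATEADD(YEAR,    DATEDIFF(YEAR,    0, [Date]), 0)
      END,
      PeriodEnd   = CASE @Frequency
        WHEN 'day'     THEN [Date]
        WHEN 'week'    THEN DATEADD(DAY, 7 - DATEPART(WEEKDAY, [Date]), [Date])
        WHEN 'month'   THEN DATEADD(DAY, -1, DATEADD(MONTH,   DATEDIFF(MONTH,   0, [Date]) + 1, 0))
        WHEN 'quarter' THEN DATEADD(DAY, -1, DATEADD(QUARTER, DATEDIFF(QUARTER, 0, [Date]) + 1, 0))
        WHEN 'year'    THEN DATEADD(DAY, -1, DATEADD(YEAR,    DATEDIFF(YEAR,    0, [Date]) + 1, 0))
      END,
      [Count]
    FROM dbo.atable
    WHERE [Date] BETWEEN @DateStart AND @DateEnd
  ) s
  GROUP BY PeriodStart, PeriodEnd
  ORDER BY PeriodStart;  -- addition: without it the row order is not guaranteed
END;
```

Under these assumptions the procedure would be called as EXEC dbo.spReport '20110101', '20111231', 'week'; the week boundaries it returns still depend on the session's DATEFIRST setting.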
    

Note: From MSDN:

Avoid the use of the sp_ prefix when naming procedures. This prefix is used by SQL Server to designate system procedures. Using the prefix can cause application code to break if there is a system procedure with the same name. For more information, see Designing Stored Procedures (Database Engine).

Andriy M answered Oct 21 '22 05:10