Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to fill missing dates by groups in a table in sql

Tags:

I want to know how to use loops to fill in missing dates with value zero based on the start/end dates by groups in sql so that i have consecutive time series in each group. I have two questions.

  1. how to loop for each group?
  2. How to use start/end dates for each group to dynamically fill in missing dates?

My input and expected output are listed as below.

Input: I have a table A like

date     value      grp_no 8/06/12    1         1 8/08/12    1         1 8/09/12    0         1 8/07/12    2         2 8/08/12    1         2 8/12/12    3         2 

Also I have a table B which can be used to left join with A to fill in missing dates.

date ... 8/05/12 8/06/12 8/07/12 8/08/12 8/09/12 8/10/12 8/11/12 8/12/12 8/13/12 ... 

How can I use A and B to generate the following output in sql?

Output:

date     value      grp_no 8/06/12    1         1   8/07/12    0         1 8/08/12    1         1 8/09/12    0         1 8/07/12    2         2 8/08/12    1         2 8/09/12    0         2 8/10/12    0         2 8/11/12    0         2 8/12/12    3         2 

Please send me your code and suggestion. Thank you so much in advance!!!

like image 395
user2824423 Avatar asked Sep 29 '13 06:09

user2824423


2 Answers

You can do it like this without loops

SELECT p.date, COALESCE(a.value, 0) value, p.grp_no   FROM (   SELECT grp_no, date     FROM   (     SELECT grp_no, MIN(date) min_date, MAX(date) max_date       FROM tableA      GROUP BY grp_no   ) q CROSS JOIN tableb b     WHERE b.date BETWEEN q.min_date AND q.max_date ) p LEFT JOIN TableA a     ON p.grp_no = a.grp_no     AND p.date = a.date 

The innermost subquery grabs min and max dates per group. Then cross join with TableB produces all possible dates within the min-max range per group. And finally outer select uses outer join with TableA and fills value column with 0 for dates that are missing in TableA.

Output:

 |       DATE | VALUE | GRP_NO | |------------|-------|--------| | 2012-08-06 |     1 |      1 | | 2012-08-07 |     0 |      1 | | 2012-08-08 |     1 |      1 | | 2012-08-09 |     0 |      1 | | 2012-08-07 |     2 |      2 | | 2012-08-08 |     1 |      2 | | 2012-08-09 |     0 |      2 | | 2012-08-10 |     0 |      2 | | 2012-08-11 |     0 |      2 | | 2012-08-12 |     3 |      2 | 

Here is SQLFiddle demo

like image 150
peterm Avatar answered Oct 13 '22 18:10

peterm


The following query does a union with tableA and tableB. It then uses group by to merge the rows from tableA and tableB so that all of the dates from tableB are in the result. If a date is not in tableA, then the row has 0 for value and grp_no. Otherwise, the row has the actual values for value and grp_no.

select    dat,    sum(val),    sum(grp) from    (       select          date as dat,          value as val,          grp_no as grp       from          tableA    union       select          date,          0,          0       from          tableB       where          date >= date '2012-08-06' and          date <= date '2012-08-13'    ) group by    dat order by    dat 

I find this query to be easier for me to understand. It also runs faster. It takes 16 seconds whereas a similar right join query takes 32 seconds.

This solution only works with numerical data.

This solution assumes a fixed date range. With some extra work this query can be adapted to limit the date range to what is found in tableA.

like image 23
Nathan Avatar answered Oct 13 '22 20:10

Nathan