I want to know how to use loops to fill in missing dates with value zero based on the start/end dates by groups in sql so that i have consecutive time series in each group. I have two questions.
My input and expected output are listed as below.
Input: I have a table A like
date value grp_no 8/06/12 1 1 8/08/12 1 1 8/09/12 0 1 8/07/12 2 2 8/08/12 1 2 8/12/12 3 2
Also I have a table B which can be used to left join with A to fill in missing dates.
date ... 8/05/12 8/06/12 8/07/12 8/08/12 8/09/12 8/10/12 8/11/12 8/12/12 8/13/12 ...
How can I use A and B to generate the following output in sql?
Output:
date value grp_no 8/06/12 1 1 8/07/12 0 1 8/08/12 1 1 8/09/12 0 1 8/07/12 2 2 8/08/12 1 2 8/09/12 0 2 8/10/12 0 2 8/11/12 0 2 8/12/12 3 2
Please send me your code and suggestion. Thank you so much in advance!!!
You can do it like this without loops
SELECT p.date, COALESCE(a.value, 0) value, p.grp_no FROM ( SELECT grp_no, date FROM ( SELECT grp_no, MIN(date) min_date, MAX(date) max_date FROM tableA GROUP BY grp_no ) q CROSS JOIN tableb b WHERE b.date BETWEEN q.min_date AND q.max_date ) p LEFT JOIN TableA a ON p.grp_no = a.grp_no AND p.date = a.date
The innermost subquery grabs min and max dates per group. Then cross join with TableB
produces all possible dates within the min-max range per group. And finally outer select uses outer join with TableA
and fills value
column with 0
for dates that are missing in TableA
.
Output:
| DATE | VALUE | GRP_NO | |------------|-------|--------| | 2012-08-06 | 1 | 1 | | 2012-08-07 | 0 | 1 | | 2012-08-08 | 1 | 1 | | 2012-08-09 | 0 | 1 | | 2012-08-07 | 2 | 2 | | 2012-08-08 | 1 | 2 | | 2012-08-09 | 0 | 2 | | 2012-08-10 | 0 | 2 | | 2012-08-11 | 0 | 2 | | 2012-08-12 | 3 | 2 |
Here is SQLFiddle demo
The following query does a union
with tableA
and tableB
. It then uses group by to merge the rows from tableA
and tableB
so that all of the dates from tableB
are in the result. If a date is not in tableA
, then the row has 0 for value
and grp_no
. Otherwise, the row has the actual values for value
and grp_no
.
select dat, sum(val), sum(grp) from ( select date as dat, value as val, grp_no as grp from tableA union select date, 0, 0 from tableB where date >= date '2012-08-06' and date <= date '2012-08-13' ) group by dat order by dat
I find this query to be easier for me to understand. It also runs faster. It takes 16 seconds whereas a similar right join
query takes 32 seconds.
This solution only works with numerical data.
This solution assumes a fixed date range. With some extra work this query can be adapted to limit the date range to what is found in tableA
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With