I have a table with (among other things) dates in a field.
I need to get a list of all dates that are more recent than the oldest date, older than the most recent date, and are completely missing from the table.
So, if the table contained:
2012-01-02
2012-01-02
2012-01-03
2012-01-05
2012-01-05
2012-01-07
2012-01-08
I want a query that returns:
2012-01-04
2012-01-06
Something like this (assuming your table is named your_table
and the date column is named the_date
):
with date_range as (
select min(the_date) as oldest,
max(the_date) as recent,
max(the_date) - min(the_date) as total_days
from your_table
),
all_dates as (
select oldest + level - 1 as a_date
from date_range
connect by level <= (select total_days from date_range)
)
select ad.a_date
from all_dates ad
left join your_table yt on ad.a_date = yt.the_date
where yt.the_date is null
order by ad.a_date;
Edit:
the WITH
clause is called a "common table expression" and is equivalent to a derived table ("inline view").
It's similar to
select *
from (
.....
) all_dates
join your_table ...
The second CTE simply creates a list of dates "on-the-fly" using a undocumented feature of Oracle's connect by
implementation.
Re-using a select (like I did with calculating the first and last date) is a bit easier (and IMHO more readable) than using derived tables.
Edit 2:
This can be done with a recursive CTE as well:
with date_range as (
select min(the_date) as oldest,
max(the_date) as recent,
max(the_date) - min(the_date) as total_days
from your_table
),
all_dates (a_date, lvl) as (
select oldest as a_date, 1 as lvl
from date_range
union all
select (select oldest from date_range) + lvl, lvl + 1
from all_dates
where lvl < (select total_days from date_range)
)
select ad.a_date, lvl
from all_dates ad
left join your_table yt on ad.a_date = yt.the_date
where yt.the_date is null
order by ad.a_date;
Which should work in all DBMS supporting recursive CTEs (PostgreSQL and Firebird - being more standard compliant - do need the recursive
keyword though).
Note the hack select (select oldest from date_range) + lvl, lvl + 1
in the recursive part. This should not be necessary, but Oracle still has some bugs with regards to DATEs in a recursive CTE. In PostgreSQL the following works without problems:
....
all_dates (a_date, lvl) as (
select oldest as a_date, 0 as lvl
from date_range
union all
select a_date + 1, lvl + 1
from all_dates
where lvl < (select total_days from date_range)
)
....
I'd opt for this variant, because it's more efficient:
with all_dates_wo_boundary_values as
( select oldest + level the_date
from ( select min(the_date) oldest
, max(the_date) recent
from your_table
)
connect by level <= recent - oldest - 1
)
select the_date
from all_dates_wo_boundary_values
minus
select the_date
from your_table
And here is some proof.
First the setup:
SQL> create table your_table (the_date)
2 as
3 select date '2012-01-02' from dual union all
4 select date '2012-01-02' from dual union all
5 select date '2012-01-03' from dual union all
6 select date '2012-01-05' from dual union all
7 select date '2012-01-05' from dual union all
8 select date '2012-01-07' from dual union all
9 select date '2012-01-08' from dual
10 /
Table created.
SQL> exec dbms_stats.gather_table_stats(user,'your_table')
PL/SQL procedure successfully completed.
SQL> alter session set statistics_level = all
2 /
Session altered.
Horse's query:
SQL> with date_range as
2 ( select min(the_date) as oldest
3 , max(the_date) as recent
4 , max(the_date) - min(the_date) as total_days
5 from your_table
6 )
7 , all_dates as
8 ( select ( select oldest from date_range) + level as a_date
9 from dual
10 connect by level <= (select total_days from date_range)
11 )
12 select ad.a_date
13 from all_dates ad
14 left join your_table yt on ad.a_date = yt.the_date
15 where yt.the_date is null
16 order by ad.a_date
17 /
A_DATE
-------------------
04-01-2012 00:00:00
06-01-2012 00:00:00
2 rows selected.
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
2 /
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------
SQL_ID gaqx49vb9gz9k, child number 0
-------------------------------------
with date_range as ( select min(the_date) as oldest , max(the_date) as recent , max(the_date) - min(the_date) as total_d
ays from your_table )
, all_dates as ( select ( select oldest from date_range) + level as a_date from dual connect by level <= (select total_days from
date_range) ) select
ad.a_date from all_dates ad left join your_table yt on ad.a_date = yt.the_date where yt.the_date is null order by ad.a_date
Plan hash value: 1419150012
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| 1 | TEMP TABLE TRANSFORMATION | | 1 | | 2 |00:00:00.01 | 22 | 1 | 1 | | | |
| 2 | LOAD AS SELECT | | 1 | | 1 |00:00:00.01 | 7 | 0 | 1 | 262K| 262K| 262K (0)|
| 3 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 3 | 0 | 0 | | | |
| 4 | TABLE ACCESS FULL | YOUR_TABLE | 1 | 7 | 7 |00:00:00.01 | 3 | 0 | 0 | | | |
| 5 | SORT ORDER BY | | 1 | 1 | 2 |00:00:00.01 | 12 | 1 | 0 | 2048 | 2048 | 2048 (0)|
|* 6 | FILTER | | 1 | | 2 |00:00:00.01 | 12 | 1 | 0 | | | |
|* 7 | HASH JOIN OUTER | | 1 | 1 | 7 |00:00:00.01 | 12 | 1 | 0 | 1048K| 1048K| 707K (0)|
| 8 | VIEW | | 1 | 1 | 6 |00:00:00.01 | 9 | 1 | 0 | | | |
| 9 | CONNECT BY WITHOUT FILTERING| | 1 | | 6 |00:00:00.01 | 3 | 0 | 0 | | | |
| 10 | FAST DUAL | | 1 | 1 | 1 |00:00:00.01 | 0 | 0 | 0 | | | |
| 11 | VIEW | | 1 | 1 | 1 |00:00:00.01 | 3 | 0 | 0 | | | |
| 12 | TABLE ACCESS FULL | SYS_TEMP_0FD9D660C_81240964 | 1 | 1 | 1 |00:00:00.01 | 3 | 0 | 0 | | | |
| 13 | TABLE ACCESS FULL | YOUR_TABLE | 1 | 7 | 7 |00:00:00.01 | 3 | 0 | 0 | | | |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
6 - filter("YT"."THE_DATE" IS NULL)
7 - access("YT"."THE_DATE"=INTERNAL_FUNCTION("AD"."A_DATE"))
32 rows selected.
And my suggestion:
SQL> with all_dates_wo_boundary_values as
2 ( select oldest + level the_date
3 from ( select min(the_date) oldest
4 , max(the_date) recent
5 from your_table
6 )
7 connect by level <= recent - oldest - 1
8 )
9 select the_date
10 from all_dates_wo_boundary_values
11 minus
12 select the_date
13 from your_table
14 /
THE_DATE
-------------------
04-01-2012 00:00:00
06-01-2012 00:00:00
2 rows selected.
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
2 /
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 7aavxmzkj7zq7, child number 0
-------------------------------------
with all_dates_wo_boundary_values as ( select oldest + level the_date from ( select min(the_date) oldest
, max(the_date) recent from your_table ) connect by level <= recent - oldest - 1 ) select
the_date from all_dates_wo_boundary_values minus select the_date from your_table
Plan hash value: 2293301832
-----------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------
| 1 | MINUS | | 1 | | 2 |00:00:00.01 | 6 | | | |
| 2 | SORT UNIQUE | | 1 | 1 | 5 |00:00:00.01 | 3 | 9216 | 9216 | 8192 (0)|
| 3 | VIEW | | 1 | 1 | 5 |00:00:00.01 | 3 | | | |
| 4 | CONNECT BY WITHOUT FILTERING| | 1 | | 5 |00:00:00.01 | 3 | | | |
| 5 | VIEW | | 1 | 1 | 1 |00:00:00.01 | 3 | | | |
| 6 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 3 | | | |
| 7 | TABLE ACCESS FULL | YOUR_TABLE | 1 | 7 | 7 |00:00:00.01 | 3 | | | |
| 8 | SORT UNIQUE | | 1 | 7 | 5 |00:00:00.01 | 3 | 9216 | 9216 | 8192 (0)|
| 9 | TABLE ACCESS FULL | YOUR_TABLE | 1 | 7 | 7 |00:00:00.01 | 3 | | | |
-----------------------------------------------------------------------------------------------------------------------------------
22 rows selected.
Regards,
Rob.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With