SQL Date Range Query - Table Comparison

I have two SQL Server tables containing the following information:

Table t_venues:

venue_id is unique

venue_id  |  start_date  |  end_date
       1  |  01/01/2014  |  02/01/2014
       2  |  05/01/2014  |  05/01/2014
       3  |  09/01/2014  |  15/01/2014
       4  |  20/01/2014  |  30/01/2014

Table t_venueuser:

venue_id is not unique

venue_id  |  start_date  |  end_date
       1  |  02/01/2014  |  02/01/2014
       2  |  05/01/2014  |  05/01/2014
       3  |  09/01/2014  |  10/01/2014
       4  |  23/01/2014  |  25/01/2014

From these two tables I need to find the dates that haven't been selected for each range, so the output would look like this:

venue_id  |  start_date  |  end_date
       1  |  01/01/2014  |  01/01/2014
       3  |  11/01/2014  |  15/01/2014
       4  |  20/01/2014  |  22/01/2014
       4  |  26/01/2014  |  30/01/2014

I can compare the two tables with EXCEPT and get the date ranges from t_venues to appear in my query, but I can't get the query to produce the non-selected dates. Any help would be appreciated.

asked Sep 30 '22 at 23:09 by samhankin


2 Answers

Calendar Table!

Another perfect candidate for a calendar table. If you can't be bothered to search for one, here's one I made earlier.

Setup Data

DECLARE @t_venues table (
   venue_id   int
 , start_date date
 , end_date   date
);

INSERT INTO @t_venues (venue_id, start_date, end_date)
  VALUES (1, '2014-01-01', '2014-01-02')
       , (2, '2014-01-05', '2014-01-05')
       , (3, '2014-01-09', '2014-01-15')
       , (4, '2014-01-20', '2014-01-30')
;

DECLARE @t_venueuser table (
   venue_id   int
 , start_date date
 , end_date   date
);

INSERT INTO @t_venueuser (venue_id, start_date, end_date)
  VALUES (1, '2014-01-02', '2014-01-02')
       , (2, '2014-01-05', '2014-01-05')
       , (3, '2014-01-09', '2014-01-10')
       , (4, '2014-01-23', '2014-01-25')
;

The Query

SELECT t_venues.venue_id
     , calendar.the_date
     , CASE WHEN t_venueuser.venue_id IS NULL THEN 1 ELSE 0 END As is_available
FROM   dbo.calendar /* see: http://gvee.co.uk/files/sql/dbo.numbers%20&%20dbo.calendar.sql for an example */
 INNER
  JOIN @t_venues As t_venues
    ON t_venues.start_date <= calendar.the_date
   AND t_venues.end_date   >= calendar.the_date
 LEFT
  JOIN @t_venueuser As t_venueuser
    ON t_venueuser.venue_id = t_venues.venue_id
   AND t_venueuser.start_date <= calendar.the_date
   AND t_venueuser.end_date   >= calendar.the_date
ORDER
    BY t_venues.venue_id
     , calendar.the_date
;

The Result

venue_id    the_date                is_available
----------- ----------------------- ------------
1           2014-01-01 00:00:00.000 1
1           2014-01-02 00:00:00.000 0
2           2014-01-05 00:00:00.000 0
3           2014-01-09 00:00:00.000 0
3           2014-01-10 00:00:00.000 0
3           2014-01-11 00:00:00.000 1
3           2014-01-12 00:00:00.000 1
3           2014-01-13 00:00:00.000 1
3           2014-01-14 00:00:00.000 1
3           2014-01-15 00:00:00.000 1
4           2014-01-20 00:00:00.000 1
4           2014-01-21 00:00:00.000 1
4           2014-01-22 00:00:00.000 1
4           2014-01-23 00:00:00.000 0
4           2014-01-24 00:00:00.000 0
4           2014-01-25 00:00:00.000 0
4           2014-01-26 00:00:00.000 1
4           2014-01-27 00:00:00.000 1
4           2014-01-28 00:00:00.000 1
4           2014-01-29 00:00:00.000 1
4           2014-01-30 00:00:00.000 1

(21 row(s) affected)

The Explanation

Our calendar table contains an entry for every date.
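If you don't have a calendar table yet, a minimal sketch of one might look like the following. This is not the linked script: the single date column, the 2014-only range, and the use of sys.all_objects as a row source (assumed to hold at least 365 rows, which is normally the case) are all assumptions made for illustration. It will fail if dbo.calendar already exists.

```sql
-- Hypothetical minimal calendar: one row per day of 2014
CREATE TABLE dbo.calendar (
   the_date date NOT NULL PRIMARY KEY
);

INSERT INTO dbo.calendar (the_date)
SELECT DATEADD(day, n.i, CAST('20140101' AS date))
FROM  (
        -- Number the rows 0, 1, 2, ... using any sufficiently large catalog view
        SELECT CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS int) - 1 As i
        FROM   sys.all_objects
      ) As n
WHERE  n.i < 365 /* 2014 is not a leap year */
;
```

A production calendar table would usually cover many years and carry extra columns (weekday, holiday flags and so on), but the_date alone is enough for the joins used here.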

We join our t_venues (as an aside, if you have the choice, lose the t_ prefix!) to return every day between our start_date and end_date. Example output for venue_id=4 for just this join:

venue_id    the_date
----------- -----------------------
4           2014-01-20 00:00:00.000
4           2014-01-21 00:00:00.000
4           2014-01-22 00:00:00.000
4           2014-01-23 00:00:00.000
4           2014-01-24 00:00:00.000
4           2014-01-25 00:00:00.000
4           2014-01-26 00:00:00.000
4           2014-01-27 00:00:00.000
4           2014-01-28 00:00:00.000
4           2014-01-29 00:00:00.000
4           2014-01-30 00:00:00.000

(11 row(s) affected)

Now that we have one row per day, we [outer] join our t_venueuser table. We join this in much the same manner as before, but with one added twist: we need to join on the venue_id too!

Running this for venue_id=4 gives this result:

venue_id    the_date                t_venueuser_venue_id
----------- ----------------------- --------------------
4           2014-01-20 00:00:00.000 NULL
4           2014-01-21 00:00:00.000 NULL
4           2014-01-22 00:00:00.000 NULL
4           2014-01-23 00:00:00.000 4
4           2014-01-24 00:00:00.000 4
4           2014-01-25 00:00:00.000 4
4           2014-01-26 00:00:00.000 NULL
4           2014-01-27 00:00:00.000 NULL
4           2014-01-28 00:00:00.000 NULL
4           2014-01-29 00:00:00.000 NULL
4           2014-01-30 00:00:00.000 NULL

(11 row(s) affected)

See how we have a NULL value for rows where there is no t_venueuser record. Genius, no? ;-)

So in my first query I gave you a quick CASE expression that shows availability (1=available, 0=not available). This is for illustration only, but could be useful to you.

You can then either wrap the query in a derived table and filter on this calculated column, or simply add WHERE t_venueuser.venue_id IS NULL to the original query, which has the same effect.
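As a rough sketch of the "wrap it up" option, the query can be nested in a derived table and filtered on is_available. The alias availability is just a name picked for this example, and the snippet assumes it runs in the same batch as the Setup Data declarations, with a dbo.calendar table that has a the_date column.

```sql
-- Only the days that nobody has selected
SELECT venue_id
     , the_date
FROM   (
        SELECT t_venues.venue_id
             , calendar.the_date
             , CASE WHEN t_venueuser.venue_id IS NULL THEN 1 ELSE 0 END As is_available
        FROM   dbo.calendar
         INNER
          JOIN @t_venues As t_venues
            ON t_venues.start_date <= calendar.the_date
           AND t_venues.end_date   >= calendar.the_date
         LEFT
          JOIN @t_venueuser As t_venueuser
            ON t_venueuser.venue_id = t_venues.venue_id
           AND t_venueuser.start_date <= calendar.the_date
           AND t_venueuser.end_date   >= calendar.the_date
       ) As availability /* hypothetical alias */
WHERE  is_available = 1
ORDER
    BY venue_id
     , the_date
;
```

Against the sample data this should keep the 14 rows flagged is_available = 1 in the result above. It still returns one row per day rather than the start/end ranges asked for in the question.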

answered Oct 03 '22 at 02:10 by gvee


This is a complete hack, but it gives the results you require. I've only tested it on the data you provided, so there may well be gotchas with larger sets.

In general, what you are looking at solving here is a variation of the gaps and islands problem. Briefly, this is a sequence where some items are missing: the missing items are referred to as gaps and the existing items are referred to as islands. If you would like to understand this issue in general, check a few of these articles:

  • Simple talk article
  • blogs.MSDN article
  • SO answers tagged gaps-and-islands
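To make the idea concrete, here is a small self-contained sketch of one common way to find islands. It is not the approach used in the code below. The five sample days are made up for illustration (venue 4 with 23 to 25 January missing), and free_days, numbered and island_key are names chosen for this example. Subtracting a per-venue row number from each date gives the same value for every day in a consecutive run, so grouping on that value yields one row per island.

```sql
WITH free_days (venue_id, the_date) AS (
    -- Hypothetical input: the days that are still free
    SELECT v.venue_id, CAST(v.the_date AS date)
    FROM  (VALUES (4, '2014-01-20'), (4, '2014-01-21'), (4, '2014-01-22')
                , (4, '2014-01-26'), (4, '2014-01-27')) AS v (venue_id, the_date)
),
numbered AS (
    -- Consecutive days share the same (date minus row number) value
    SELECT venue_id
         , the_date
         , DATEADD(day, -1 * CAST(ROW_NUMBER() OVER (PARTITION BY venue_id ORDER BY the_date) AS int), the_date) AS island_key
    FROM   free_days
)
SELECT venue_id
     , MIN(the_date) AS start_date
     , MAX(the_date) AS end_date
FROM   numbered
GROUP  BY venue_id, island_key
ORDER  BY venue_id, start_date;
```

For this input the query returns two rows for venue 4: 2014-01-20 to 2014-01-22 and 2014-01-26 to 2014-01-27.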

Code:

;WITH dates AS
(
    -- One row per venue per day that no t_venueuser range covers
    SELECT  vdates.venue_id,
            vdates.vdate
    FROM  ( SELECT DATEADD(d, sv.number, v.start_date) AS vdate
                 , v.venue_id
            FROM t_venues v
            INNER JOIN master..spt_values sv   -- type 'P' supplies the numbers 0 to 2047
                ON sv.type = 'P'
               AND sv.number BETWEEN 0 AND DATEDIFF(d, v.start_date, v.end_date)) vdates
    LEFT JOIN t_venueuser vu
        ON vdates.vdate >= vu.start_date
       AND vdates.vdate <= vu.end_date
       AND vdates.venue_id = vu.venue_id
    WHERE vu.venue_id IS NULL
)
SELECT venue_id, [1] AS StartDate, [2] AS EndDate
FROM   (SELECT venue_id, rDate, ROW_NUMBER() OVER (PARTITION BY venue_id, DateType ORDER BY rDate) AS rType, DateType AS dType
        FROM(   -- Range starts: free days whose previous day is not free for the same venue
                SELECT d1.venue_id
                      ,d1.vdate AS rDate
                      ,'1' AS DateType
                FROM dates AS d1
                LEFT JOIN dates AS d0
                    ON d0.venue_id = d1.venue_id
                   AND d0.vdate = DATEADD(d, -1, d1.vdate)
                WHERE d0.vdate IS NULL
                UNION ALL
                -- Range ends: free days whose next day is not free for the same venue
                SELECT d1.venue_id
                      ,d1.vdate
                      ,'2'
                FROM dates AS d1
                LEFT JOIN dates AS d2
                    ON d2.venue_id = d1.venue_id
                   AND d2.vdate = DATEADD(d, 1, d1.vdate)
                WHERE d2.vdate IS NULL
            ) res
        ) src
PIVOT   (MIN (rDate)
        FOR dType IN
        ( [1], [2] )
        ) AS pvt
ORDER BY venue_id, StartDate

Results:

venue_id    StartDate   EndDate
1           2014-01-01  2014-01-01
3           2014-01-11  2014-01-15
4           2014-01-20  2014-01-22
4           2014-01-26  2014-01-30

answered Oct 03 '22 at 01:10 by Mack