I have a problem with SQL. I have the following table:
declare @t table (START_DATE datetime,
END_DATE datetime,
GROSS_SALES_PRICE decimal(10,2)
);
insert into @t
values ('2014-08-06 00:00:00.000', '2014-10-06 23:59:59.000', 29.99),
('2014-09-06 00:00:00.000', '2014-09-09 23:59:59.000', 32.99),
('2014-09-10 00:00:00.000', '2014-09-30 23:59:59.000', 32.99),
('2014-10-07 00:00:00.000', '2049-12-31 23:59:59.000', 34.99)
I would like to separate the dates which overlaps. For example I have in the first row START_DATE 2014-08-06 and END_DATE 2014-10-06. We can see that the dates from the second and the third row are inside this period of time from first row.
So I would like to separate them as follows:
declare @t2 table (START_DATE datetime,
END_DATE datetime,
GROSS_SALES_PRICE decimal(10,2)
);
insert into @t2
values ('2014-08-06 00:00:00.000', '2014-09-05 23:59:59.000', 29.99),
('2014-09-06 00:00:00.000', '2014-09-09 23:59:59.000', 32.99),
('2014-09-10 00:00:00.000', '2014-09-30 23:59:59.000', 32.99),
('2014-10-01 00:00:00.000', '2014-10-06 23:59:59.000', 29.99),
('2014-10-07 00:00:00.000', '2049-12-31 23:59:59.000', 34.99)
So the second and the third rows remained unchanged. The first row should have new END_DATE. We also have new row. The GROSS_SALES_PRICE should remain as it is in internal period. Thanks for help. I am using SQL Server 2014
A calendar/dates table can simplify this, but we can also use a query to generate a temporary dates table using a common table expression.
From there, we can solve this as a gaps and islands style problem. Using the dates table and using outer apply()
to get the latest values for start_date
and gross_sales_price
we can identify the groups we want to re-aggregate by using two row_number()
s. The first just ordered by date
, less the other that is partitioned by the value we have as the latest start_date
and ordered by date
.
Then you can dump the results of the common table expression src
to a temporary table and do your inserts/deletes using that or you can use merge
using src
.
/* -- dates --*/
declare @fromdate datetime, @thrudate datetime;
select @fromdate = min(start_date), @thrudate = max(end_date) from #t;
;with n as (select n from (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) t(n))
, dates as (
select top (datediff(day, @fromdate, @thrudate)+1)
[Date]=convert(datetime,dateadd(day,row_number() over(order by (select 1))-1,@fromdate))
, [End_Date]=convert(datetime,dateadd(millisecond,-3,dateadd(day,row_number() over(order by (select 1)),@fromdate)))
from n as deka cross join n as hecto cross join n as kilo
cross join n as tenK cross join n as hundredK
order by [Date]
)
/* -- islands -- */
, cte as (
select
start_date = d.date
, end_date = d.end_date
, x.gross_sales_price
, grp = row_number() over (order by d.date)
- row_number() over (partition by x.start_date order by d.date)
from dates d
outer apply (
select top 1 l.start_date, l.gross_sales_price
from #t l
where d.date >= l.start_date
and d.date <= l.end_date
order by l.start_date desc
) x
)
/* -- aggregated islands -- */
, src as (
select
start_date = min(start_date)
, end_date = max(end_date)
, gross_sales_price
from cte
group by gross_sales_price, grp
)
/* -- merge -- */
merge #t with (holdlock) as target
using src as source
on target.start_date = source.start_date
and target.end_date = source.end_date
and target.gross_sales_price = source.gross_sales_price
when not matched by target
then insert (start_date, end_date, gross_sales_price)
values (start_date, end_date, gross_sales_price)
when not matched by source
then delete
output $action, inserted.*, deleted.*;
/* -- results -- */
select
start_date
, end_date
, gross_sales_price
from #t
order by start_date
rextester demo: http://rextester.com/MFXCQQ90933
merge
output (you do not need to output this, just showing for the demo):
+---------+---------------------+---------------------+-------------------+---------------------+---------------------+-------------------+
| $action | START_DATE | END_DATE | GROSS_SALES_PRICE | START_DATE | END_DATE | GROSS_SALES_PRICE |
+---------+---------------------+---------------------+-------------------+---------------------+---------------------+-------------------+
| INSERT | 2014-10-01 00:00:00 | 2014-10-06 23:59:59 | 29.99 | NULL | NULL | NULL |
| INSERT | 2014-08-06 00:00:00 | 2014-09-05 23:59:59 | 29.99 | NULL | NULL | NULL |
| DELETE | NULL | NULL | NULL | 2014-08-06 00:00:00 | 2014-10-06 23:59:59 | 29.99 |
+---------+---------------------+---------------------+-------------------+---------------------+---------------------+-------------------+
results:
+-------------------------+-------------------------+-------------------+
| start_date | end_date | gross_sales_price |
+-------------------------+-------------------------+-------------------+
| 2014-08-06 00:00:00.000 | 2014-09-05 23:59:59.997 | 29.99 |
| 2014-09-06 00:00:00.000 | 2014-09-09 23:59:59.997 | 32.99 |
| 2014-09-10 00:00:00.000 | 2014-09-30 23:59:59.997 | 32.99 |
| 2014-10-01 00:00:00.000 | 2014-10-06 23:59:59.997 | 29.99 |
| 2014-10-07 00:00:00.000 | 2049-12-31 23:59:59.997 | 34.99 |
+-------------------------+-------------------------+-------------------+
calendar and numbers tables reference:
merge
reference:
MERGE
Statement - Aaron BertrandMerge
- Dan GuzmanMERGE
Bug - Paul Whitemerge
statement - Aaron BertrandMERGE
, please read this! - Aaron BertrandMerge
Statement (LCK_M_RS_U locks) - Kendra Littlemerge
statements the right way - David SteinIn addition to using datetime2
type instead of datetime
, I'd recommend you to use [Closed; Open)
intervals instead of [Closed; Closed]
. In other words, use 2014-08-06 00:00:00.000, 2014-09-06 00:00:00.000
instead of 2014-08-06 00:00:00.000, 2014-09-05 23:59:59.000
. Specifically, because 59.999
will be rounded to 00.000
for the datetime
type, but will not for datetime2(3)
. You don't want to depend on such internal details of the data types.
Also, [Closed; Open)
intervals are much easier to deal with in the queries as you'll see below.
The main idea is to put all start and end dates (boundaries) together in one list with a flag that indicates whether it is a beginning or end of the interval. When a running total of the flag turns into zero, it means that all overlapping intervals have ended.
Sample data
I extended your sample data with several cases of overlapping intervals.
declare @t table
(START_DATE datetime2(0),
END_DATE datetime2(0),
GROSS_SALES_PRICE decimal(10,2)
);
insert into @t
values
-- |------| 11
('2001-01-01 00:00:00', '2001-01-10 00:00:00', 11),
-- |------| 10
-- |------| 20
('2010-01-01 00:00:00', '2010-01-10 00:00:00', 10),
('2010-01-05 00:00:00', '2010-01-20 00:00:00', 20),
-- |----------| 30
-- |------| 40
('2010-02-01 00:00:00', '2010-02-20 00:00:00', 30),
('2010-02-05 00:00:00', '2010-02-20 00:00:00', 40),
-- |----------| 50
-- |----------| 60
('2010-03-01 00:00:00', '2010-03-20 00:00:00', 50),
('2010-03-01 00:00:00', '2010-03-20 00:00:00', 60),
-- |----------| 70
-- |------| 80
('2010-04-01 00:00:00', '2010-04-20 00:00:00', 70),
('2010-04-05 00:00:00', '2010-04-15 00:00:00', 80),
-- |-----------------------------| 29.99
-- |---------| 32.99
-- |---------| 32.99
-- |----------| 34.99
('2014-08-06 00:00:00', '2014-10-07 00:00:00', 29.99),
('2014-09-06 00:00:00', '2014-09-10 00:00:00', 32.99),
('2014-09-10 00:00:00', '2014-10-01 00:00:00', 32.99),
('2014-10-07 00:00:00', '2050-01-01 00:00:00', 34.99);
Query
WITH
CTE_Boundaries
AS
(
SELECT
START_DATE AS dt
,+1 AS Flag
,GROSS_SALES_PRICE AS Price
FROM @T
UNION ALL
SELECT
END_DATE AS dt
,-1 AS Flag
,GROSS_SALES_PRICE AS Price
FROM @T
)
,CTE_Intervals
AS
(
SELECT
dt
,Flag
,Price
,SUM(Flag) OVER (ORDER BY dt, Flag ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS SumFlag
,LEAD(dt) OVER (ORDER BY dt, Flag) AS NextDate
,LEAD(Price) OVER (ORDER BY dt, Flag) AS NextPrice
FROM CTE_Boundaries
)
SELECT
dt AS StartDate
,NextDate AS EndDate
,CASE WHEN Flag = 1 THEN Price ELSE NextPrice END AS Price
FROM CTE_Intervals
WHERE
SumFlag > 0
AND dt <> NextDate
ORDER BY StartDate
;
Result
+---------------------+---------------------+-------+
| StartDate | EndDate | Price |
+---------------------+---------------------+-------+
| 2001-01-01 00:00:00 | 2001-01-10 00:00:00 | 11.00 |
| 2010-01-01 00:00:00 | 2010-01-05 00:00:00 | 10.00 |
| 2010-01-05 00:00:00 | 2010-01-10 00:00:00 | 20.00 |
| 2010-01-10 00:00:00 | 2010-01-20 00:00:00 | 20.00 |
| 2010-02-01 00:00:00 | 2010-02-05 00:00:00 | 30.00 |
| 2010-02-05 00:00:00 | 2010-02-20 00:00:00 | 40.00 |
| 2010-03-01 00:00:00 | 2010-03-20 00:00:00 | 60.00 |
| 2010-04-01 00:00:00 | 2010-04-05 00:00:00 | 70.00 |
| 2010-04-05 00:00:00 | 2010-04-15 00:00:00 | 80.00 |
| 2010-04-15 00:00:00 | 2010-04-20 00:00:00 | 70.00 |
this is your sample data:
| 2014-08-06 00:00:00 | 2014-09-06 00:00:00 | 29.99 |
| 2014-09-06 00:00:00 | 2014-09-10 00:00:00 | 32.99 |
| 2014-09-10 00:00:00 | 2014-10-01 00:00:00 | 32.99 |
| 2014-10-01 00:00:00 | 2014-10-07 00:00:00 | 29.99 |
| 2014-10-07 00:00:00 | 2050-01-01 00:00:00 | 34.99 |
+---------------------+---------------------+-------+
Intermediary result of CTE_Intervals
Examine these to understand how the query works
+---------------------+------+-------+---------+---------------------+-----------+
| dt | Flag | Price | SumFlag | NextDate | NextPrice |
+---------------------+------+-------+---------+---------------------+-----------+
| 2001-01-01 00:00:00 | 1 | 11.00 | 1 | 2001-01-10 00:00:00 | 11.00 |
| 2001-01-10 00:00:00 | -1 | 11.00 | 0 | 2010-01-01 00:00:00 | 10.00 |
| 2010-01-01 00:00:00 | 1 | 10.00 | 1 | 2010-01-05 00:00:00 | 20.00 |
| 2010-01-05 00:00:00 | 1 | 20.00 | 2 | 2010-01-10 00:00:00 | 10.00 |
| 2010-01-10 00:00:00 | -1 | 10.00 | 1 | 2010-01-20 00:00:00 | 20.00 |
| 2010-01-20 00:00:00 | -1 | 20.00 | 0 | 2010-02-01 00:00:00 | 30.00 |
| 2010-02-01 00:00:00 | 1 | 30.00 | 1 | 2010-02-05 00:00:00 | 40.00 |
| 2010-02-05 00:00:00 | 1 | 40.00 | 2 | 2010-02-20 00:00:00 | 30.00 |
| 2010-02-20 00:00:00 | -1 | 30.00 | 1 | 2010-02-20 00:00:00 | 40.00 |
| 2010-02-20 00:00:00 | -1 | 40.00 | 0 | 2010-03-01 00:00:00 | 50.00 |
| 2010-03-01 00:00:00 | 1 | 50.00 | 1 | 2010-03-01 00:00:00 | 60.00 |
| 2010-03-01 00:00:00 | 1 | 60.00 | 2 | 2010-03-20 00:00:00 | 50.00 |
| 2010-03-20 00:00:00 | -1 | 50.00 | 1 | 2010-03-20 00:00:00 | 60.00 |
| 2010-03-20 00:00:00 | -1 | 60.00 | 0 | 2010-04-01 00:00:00 | 70.00 |
| 2010-04-01 00:00:00 | 1 | 70.00 | 1 | 2010-04-05 00:00:00 | 80.00 |
| 2010-04-05 00:00:00 | 1 | 80.00 | 2 | 2010-04-15 00:00:00 | 80.00 |
| 2010-04-15 00:00:00 | -1 | 80.00 | 1 | 2010-04-20 00:00:00 | 70.00 |
| 2010-04-20 00:00:00 | -1 | 70.00 | 0 | 2014-08-06 00:00:00 | 29.99 |
| 2014-08-06 00:00:00 | 1 | 29.99 | 1 | 2014-09-06 00:00:00 | 32.99 |
| 2014-09-06 00:00:00 | 1 | 32.99 | 2 | 2014-09-10 00:00:00 | 32.99 |
| 2014-09-10 00:00:00 | -1 | 32.99 | 1 | 2014-09-10 00:00:00 | 32.99 |
| 2014-09-10 00:00:00 | 1 | 32.99 | 2 | 2014-10-01 00:00:00 | 32.99 |
| 2014-10-01 00:00:00 | -1 | 32.99 | 1 | 2014-10-07 00:00:00 | 29.99 |
| 2014-10-07 00:00:00 | -1 | 29.99 | 0 | 2014-10-07 00:00:00 | 34.99 |
| 2014-10-07 00:00:00 | 1 | 34.99 | 1 | 2050-01-01 00:00:00 | 34.99 |
| 2050-01-01 00:00:00 | -1 | 34.99 | 0 | NULL | NULL |
+---------------------+------+-------+---------+---------------------+-----------+
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With