I have the following rows:
CREATE TABLE #TEMP (id int, name varchar(255), startdate datetime, enddate datetime)
INSERT INTO #TEMP VALUES(1, 'John', '2011-01-11 00:00:00.000','2011-01-11 00:01:10.000')
INSERT INTO #TEMP VALUES(2, 'John', '2011-01-11 00:00:20.000','2011-01-11 00:01:05.000')
INSERT INTO #TEMP VALUES(3, 'John', '2011-01-11 00:01:40.000','2011-01-11 00:01:50.000')
INSERT INTO #TEMP VALUES(4, 'Adam', '2011-01-11 00:00:40.000','2011-01-11 00:01:20.000')
INSERT INTO #TEMP VALUES(5, 'Adam', '2011-01-11 00:00:45.000','2011-01-11 00:01:15.000')
SELECT * FROM #TEMP
DROP TABLE #TEMP
I am trying to remove records that have dates contained within other dates to obtain the following:
John 2011-01-11 00:00:00.000 2011-01-11 00:01:10.000
John 2011-01-11 00:01:40.000 2011-01-11 00:01:50.000
Adam 2011-01-11 00:00:40.000 2011-01-11 00:01:20.000
Any suggestions on how to achieve this for a table of about 100K rows?
This gives the desired result:
DELETE T1 FROM #TEMP T1
WHERE EXISTS(
SELECT NULL FROM #TEMP T2
WHERE t1.id <> t2.id
AND t1.name = t2.name
AND t1.startdate >= t1.startdate
AND t1.enddate <= t1.enddate
)
http://msdn.microsoft.com/en-us/library/ms188336.aspx
Edit: I've just noticed that there's one problem. If there are duplicates (same start- and enddate), both would be deleted (none with John's approach, even with only one equal date). So you need to take this into account:
DELETE T1 FROM #TEMP T1
WHERE EXISTS(
SELECT NULL FROM #TEMP T2
WHERE t1.id <> t2.id
AND t1.name = t2.name
AND t1.startdate > t2.startdate
AND t1.enddate < t2.enddate
OR t1.id <> t2.id
AND t1.name = t2.name
AND t1.startdate = t2.startdate
AND t1.enddate < t2.enddate
OR t1.id <> t2.id
AND t1.name = t2.name
AND t1.startdate > t2.startdate
AND t1.enddate = t2.enddate
OR t1.id > t2.id
AND t1.name = t2.name
AND t1.startdate = t2.startdate
AND t1.enddate = t2.enddate
)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With