I have a table that has records in this format:
SPECIAL_ID | OTHER_ID | NAME | TIMESTAMP
I need to create a new table based off of this, but only get records that are at least 25 minutes apart from the previous one. The intervals are not consistent. So, if I start with record 1, I need to get the very next record that is at least 25 minutes from record 1's timestamp. This will be record 2 in the new table. And then the next record that is 25 minutes after the previous one that was extracted. So, here is an example of info:
AAA | 1 | WHATEVER2 | 2016-11-20 00:00:00
BCD | 2 | WHATEVER00 | 2016-11-20 00:02:00
AAA | 3 | WHATEVER01 | 2016-11-20 00:09:00
AAA | 4 | WHATEVER55 | 2016-11-20 00:20:00
XYZ | 5 | WHATEVER | 2016-11-20 00:24:00
AAA | 6 | WHATEVER11 | 2016-11-20 00:45:00
QRS | 7 | WHATEVER | 2016-11-20 00:46:00
QRS | 8 | WHATEVER12 | 2016-11-20 00:59:00
AAA | 9 | WHATEVER12 | 2016-11-20 01:02:00
AAA |10 | WHATEVER12 | 2016-11-20 01:17:00
And what I am trying to do:
AAA | 1 | WHATEVER2 | 2016-11-20 00:00:00
AAA | 6 | WHATEVER11 | 2016-11-20 00:45:00
AAA |10 | WHATEVER12 | 2016-11-20 01:17:00
I managed to use a cursor and tested it on a small set of records. It worked....but I have millions of records that need to be analyzed in this way. A cursor is just asking for trouble, it seems.
Is there a better way to do this?
I am stuck with SQL server 2008, so lead() and lag() is out of the question.
Any help greatly appreciated.
This could be a possible (computationally expensive) solution for Sql Server 2008:
In the first step, the first records are determined. At the second step, the query determines the nearest 25-minutes records for each existing record. Subsequently (step3), the data records are reduced to the one with the lowest OTHER_ID. Of course, this only works if the field with the OTHER_ID unique and increases synchronously with the time. For millions of records, the fields used in the query should be indexed and the search restricted.
-- test script
SET dateformat ymd
;WITH testdata AS (
SELECT 'AAA' AS SPECIAL_ID,
1 AS OTHER_ID,
'WHATEVER2' AS NAME ,
CONVERT(DATETIME, '2016-11-20 00:00:00') AS [TIMESTAMP]
UNION SELECT 'BCD' , 2 , 'WHATEVER00' , CONVERT(DATETIME, '2016-11-20 00:02:00')
UNION SELECT 'AAA' , 3 , 'WHATEVER01' , CONVERT(DATETIME, '2016-11-20 00:02:01')
UNION SELECT 'AAA' , 4 , 'WHATEVER55' , CONVERT(DATETIME, '2016-11-20 00:20:00')
UNION SELECT 'XYZ' , 5 , 'WHATEVER' , CONVERT(DATETIME, '2016-11-20 00:24:00')
UNION SELECT 'AAA' , 6 , 'WHATEVER11' , CONVERT(DATETIME, '2016-11-20 00:45:00')
UNION SELECT 'QRS' , 7 , 'WHATEVER' , CONVERT(DATETIME, '2016-11-20 00:46:00')
UNION SELECT 'QRS' , 8 , 'WHATEVER12' , CONVERT(DATETIME, '2016-11-20 00:59:00')
UNION SELECT 'AAA' , 9 , 'WHATEVER12' , CONVERT(DATETIME, '2016-11-20 01:02:00')
UNION SELECT 'AAA' ,10 , 'WHATEVER12' , CONVERT(DATETIME, '2016-11-20 01:17:00')
UNION SELECT 'QRS' ,11 , 'WHATEVER13' , CONVERT(DATETIME, '2016-11-20 01:30:00')
), firstRecord AS (
SELECT SPECIAL_ID, MIN(OTHER_ID) AS OTHER_ID
FROM testdata
GROUP BY SPECIAL_ID
), nextRecord1 AS (
SELECT I1.SPECIAL_ID, I1.OTHER_ID AS OTHER_ID, MIN(I2.OTHER_ID) AS next_OTHER_ID
FROM testdata I1
INNER JOIN testdata I2
ON I1.SPECIAL_ID = I2.SPECIAL_ID
AND I1.OTHER_ID < I2.OTHER_ID
AND I2.[TIMESTAMP] >= DATEADD(minute, 25, I1.[TIMESTAMP])
GROUP BY I1.SPECIAL_ID, I1.OTHER_ID
), nextRecord2 AS (
SELECT SPECIAL_ID, MIN(OTHER_ID) AS OTHER_ID, next_OTHER_ID
FROM nextRecord1
GROUP BY SPECIAL_ID, next_OTHER_ID
)
SELECT T2.*
FROM firstRecord T1
INNER JOIN testdata T2
ON T1.OTHER_ID = T2.OTHER_ID
UNION
SELECT T2.*
FROM nextRecord2 T1
INNER JOIN testdata T2
ON T1.next_OTHER_ID = T2.OTHER_ID
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With