I am trying to find an elegant solution in the form of a SQL query for the following problem.
New records will be inserted in the Log table. I need to detect any new records (inserted in the last hour) whose URL I haven't seen before and generate an alert (e.g. when the number of such records is > 0).
ID, Url, DOB
1, site1.com/page1, "5/06/2012 20:01"
2, site2.com/page2, "5/06/2012 21:20"
3, site1.com/page1, "6/06/2012 10:05"
If "now" is 6/06/2012 10:40, I see that 1 new record (id=3) was inserted, but I don't want to generate an alert because we have seen this URL before (id=1).
If the table also contains the row 4, site3.com/pageX, "6/06/2012 10:08", then I want to generate an alert (return count=1) because this row was inserted in the last hour and we haven't seen its URL before.
What is the best way to implement this? Ideally without nested queries.
I think this is what you are after. It retrieves the new entries from the last hour, where "new" means that the same URL had not been visited before the last hour:
SELECT *
FROM Log
WHERE DOB >= DATEADD(HOUR, -1, CURRENT_TIMESTAMP)
AND NOT EXISTS
( SELECT 1
FROM Log T1
WHERE T1.URL = Log.URL
AND T1.DOB < DATEADD(HOUR, -1, CURRENT_TIMESTAMP)
)
Working example on SQL Fiddle
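For readers without access to the fiddle, here is a minimal sketch that reproduces the question's sample rows and runs the same NOT EXISTS check against them. The table variable `@Log`, the variables `@now` and `@cutoff`, and the column types are assumptions made for illustration (SQL Server 2008 or later); "now" is pinned to the question's 6/06/2012 10:40 instead of CURRENT_TIMESTAMP so the result is repeatable.

```sql
-- Hypothetical setup: the four sample rows from the question (dates read as day/month/year).
DECLARE @Log TABLE (ID int PRIMARY KEY, Url varchar(200) NOT NULL, DOB datetime NOT NULL);

INSERT INTO @Log (ID, Url, DOB) VALUES
    (1, 'site1.com/page1', '2012-06-05T20:01:00'),
    (2, 'site2.com/page2', '2012-06-05T21:20:00'),
    (3, 'site1.com/page1', '2012-06-06T10:05:00'),
    (4, 'site3.com/pageX', '2012-06-06T10:08:00');

-- Assumed fixed "now" taken from the question, and the start of the one-hour window.
DECLARE @now datetime = '2012-06-06T10:40:00';
DECLARE @cutoff datetime = DATEADD(HOUR, -1, @now);

-- Rows inside the window whose URL has no earlier row before the window.
SELECT L.ID, L.Url, L.DOB
FROM @Log AS L
WHERE L.DOB >= @cutoff
  AND NOT EXISTS
      ( SELECT 1
        FROM @Log AS T1
        WHERE T1.Url = L.Url
          AND T1.DOB < @cutoff
      );
```

With this data only the row with ID 4 should come back: ID 3 falls inside the window, but its URL was already seen by ID 1 before the cutoff.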
EDIT
Just seen a comment that you only need a count:
SELECT COUNT(*)
FROM Log
WHERE DOB >= DATEADD(HOUR, -1, CURRENT_TIMESTAMP)
AND NOT EXISTS
( SELECT 1
FROM Log T1
WHERE T1.URL = Log.URL
AND T1.DOB < DATEADD(HOUR, -1, CURRENT_TIMESTAMP)
)
EDIT 2
I am not sure why this has to be a single SELECT; however, the closest I can get to one is this (it still needs a derived table, because a window function cannot be used directly in the WHERE clause):
SELECT COUNT(*)
FROM ( SELECT *, MIN(DOB) OVER(PARTITION BY URL) [FirstViewed]
FROM Log
) Log
WHERE FirstViewed >= DATEADD(HOUR, -1, CURRENT_TIMESTAMP)
Note that this will still return 2 if the same new page has been visited twice in the last hour, because COUNT(*) counts rows rather than distinct URLs.
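If each new URL should be counted only once, one possible variation is to count distinct URLs instead of rows. This is a sketch under the same assumptions as the query above (a `Log` table with `URL` and `DOB` columns); the aliases `NewUrlCount` and `L` are introduced here for illustration.

```sql
-- Count each first-seen URL once, however many times it was hit in the last hour.
SELECT COUNT(DISTINCT URL) AS NewUrlCount
FROM ( SELECT URL,
              MIN(DOB) OVER (PARTITION BY URL) AS FirstViewed  -- earliest hit per URL
       FROM Log
     ) AS L
WHERE FirstViewed >= DATEADD(HOUR, -1, CURRENT_TIMESTAMP);
```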
http://sqlfiddle.com/#!3/5a8bc/1
Here is an alternative approach: first group the rows by URL to find the candidate URLs, then keep only those seen in the last hour.
SELECT x1.*
FROM
(SELECT URL,
COUNT(ID) AS urlcount,
MAX(DOB) AS uniqueurl
FROM Log
GROUP BY URL HAVING count(ID) = 1
OR MIN(DOB) > dateadd(HOUR ,-1 , CURRENT_TIMESTAMP)) AS x1
WHERE x1.uniqueurl > dateadd(HOUR ,-1 , CURRENT_TIMESTAMP);
http://sqlfiddle.com/#!3/250e0/45/0
I cannot tell whether this performs acceptably without looking at the execution plan, but I think the sort operation involved in the GROUP BY could be a bottleneck.
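As a side note, the grouped query appears to reduce to a single condition: a URL with exactly one row has MIN(DOB) equal to MAX(DOB), so both branches of the HAVING clause, combined with the outer filter, amount to "the earliest hit is within the last hour". Under that reading, a sketch without any nested query could look like the following; the aliases `HitsInLastHour` and `FirstViewed` are illustrative names, not part of the original answer.

```sql
-- One row per URL whose earliest hit falls inside the last hour.
SELECT URL,
       COUNT(*) AS HitsInLastHour,  -- all rows of such a URL are inside the window
       MIN(DOB) AS FirstViewed
FROM Log
GROUP BY URL
HAVING MIN(DOB) > DATEADD(HOUR, -1, CURRENT_TIMESTAMP);
```

An alert can then be raised whenever this returns at least one row.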