Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL Server - Select all top of the hour records

Tags:

sql

sql-server

I have a large table with records created every second and want to select only those records that were created at the top of each hour for the last 2 months. So we would get 24 selected records for every day over the last 60 days

The table structure is Dateandtime, Value1, Value2, etc

Many Thanks

like image 542
user2119912 Avatar asked May 11 '13 16:05

user2119912


2 Answers

You could group by on the date part (cast(col1 as date)) and the hour part (datepart(hh, col1). Then pick the minimum date for each hour, and filter on that:

select  *
from    YourTable yt
join    (
        select  min(dateandtime) as dt
        from    YourTable
        where   datediff(day, dateandtime, getdate()) <= 60
        group by
                cast(dateandtime as date)
        ,       datepart(hh, dateandtime)
        ) filter
on      filter.dt = yt.dateandtime

Alternatively, you can group on a date format that only includes the date and the hour. For example, convert(varchar(13), getdate(), 120) returns 2013-05-11 18.

        ...
        group by
                convert(varchar(13), getdate(), 120)
        ) filter
        ...
like image 161
Andomar Avatar answered Sep 22 '22 04:09

Andomar


For clarity's sake, I would probably use a two-step, CTE-based approach (this works in SQL Server 2005 and newer - you didn't clearly specify which version of SQL Server you're using, so I'm just hoping you're not on an ancient version like 2000 anymore):

-- define a "base" CTE to get the hour component of your "DateAndTime" 
-- column and make it accessible under its own name
;WITH BaseCTE AS
(
    SELECT  
        ID, DateAndTime,
        Value1, Value2, 
        HourPart = DATEPART(HOUR, DateAndTime)
    FROM dbo.YourTable
    WHERE DateAndTime >= @SomeThresholdDateHere
), 
-- define a second CTE which "partitions" the data by this "HourPart",
-- and number all rows for each partition starting at 1. So each "last"
-- event for each hour is the one with the RN = 1 value
HourlyCTE AS 
(
    SELECT ID, DateAndTime, Value1, Value2, 
        RN = ROW_NUMBER() OVER(PARTITION BY HourPart ORDER BY DateAndTime DESC)
    FROM BaseCTE
)
SELECT *
FROM HourlyCTE
WHERE RN=1

Also: I wasn't sure what exactly you mean by "top of the hour" - the row that's been created right at the beginning of each hour (e.g. at 04:00:00) - or rather the last row created in that hour's time span? If you mean the first one for each hour - then you'd need to change the ORDER BY DateAndTime DESC to ORDER BY DateAndTime ASC

like image 31
marc_s Avatar answered Sep 20 '22 04:09

marc_s