I'm new to this site but bear with me.
I'm trying to GROUP BY some data using SQL Server.
Here's the data:
Computer VisitDate
ComputerA 2012-04-28 09:00:00
ComputerA 2012-04-28 09:05:00
ComputerA 2012-04-28 09:10:00
ComputerB 2012-04-28 09:30:00
ComputerB 2012-04-28 09:32:00
ComputerB 2012-04-28 09:44:00
ComputerB 2012-04-28 09:56:00
ComputerB 2012-04-28 10:25:00
ComputerA 2012-04-28 12:25:00
ComputerC 2012-04-28 12:30:00
ComputerC 2012-04-28 12:35:00
ComputerC 2012-04-28 12:45:00
ComputerC 2012-04-28 12:55:00
What I'm trying to achieve is to group the data by Computer, but also to start a new group whenever more than 1 hour passes between a computer's consecutive visits. Here is the result I'm trying to get:
Computer VisitDate
ComputerA 2012-04-28 09:00:00
ComputerB 2012-04-28 09:30:00
ComputerA 2012-04-28 12:25:00
ComputerC 2012-04-28 12:30:00
So Computer A is shown twice because it visited at 09:10:00 and then visited again at 12:25:00, which means a difference of over 1 hour.
It's easy to 'GROUP BY Computer', but for the time-gap part I wouldn't know where to start. Any help on this problem would be much appreciated.
You cannot do this with a simple GROUP BY. GROUP BY puts rows together when they share the same value for a column or expression - e.g. you could group by computer name - but each row is evaluated on its own, so you cannot add logic like "start a new group when the gap to the previous visit is greater than one hour" directly to the grouping.
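As a small illustration of that limitation, here is a sketch using a hypothetical table variable @Visit filled with a few rows from the question (the multi-row VALUES syntax assumes SQL Server 2008 or later):

```sql
-- Hypothetical table variable holding a few rows from the question.
DECLARE @Visit TABLE (Computer varchar(20), VisitDate datetime);

INSERT INTO @Visit (Computer, VisitDate) VALUES
    ('ComputerA', '2012-04-28T09:00:00'),
    ('ComputerA', '2012-04-28T09:10:00'),
    ('ComputerB', '2012-04-28T09:30:00'),
    ('ComputerA', '2012-04-28T12:25:00');

-- One row per computer: the 12:25 visit is folded into ComputerA's
-- single group, because GROUP BY never compares neighbouring rows.
SELECT Computer, MIN(VisitDate) AS FirstVisit
FROM @Visit
GROUP BY Computer;
```

The second ComputerA visit block at 12:25 never shows up as its own row, which is exactly the part of the question that needs more than GROUP BY.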
What you can do - provided you're on SQL Server 2005 or newer (you didn't mention the version in your question) - is use CTEs (Common Table Expressions). Those let you build the query up in named steps, each one working on the result of the previous one.
Here, I'm doing several things - first I'm "partitioning" the data by ComputerName, ordering by VisitDate, and using ROW_NUMBER() to get a sequential number within each partition. The second CTE then determines the "first" entry for each computer - the one with row number = 1 - and the third determines the difference in VisitDate between each entry and that first entry. From that third CTE, I finally select the entries that either have row number = 1 (the first for each "partition") or a difference of 60 minutes or more. Note that the difference is measured against each computer's first visit, not the visit immediately before it: that gives the requested result for the sample data, but it also returns every later visit once a computer is more than an hour past its first one.
Here's the code:
;WITH Computers AS
(
    -- Number each computer's visits in chronological order
    SELECT
        ComputerName, VisitDate,
        RN = ROW_NUMBER() OVER(PARTITION BY ComputerName ORDER BY VisitDate)
    FROM
        dbo.YourComputerTable
),
FirstComputers AS
(
    -- The earliest visit of each computer
    SELECT ComputerName, VisitDate
    FROM Computers
    WHERE RN = 1
),
SelectedComputers AS
(
    -- Minutes between each visit and that computer's first visit
    SELECT
        c.ComputerName, c.VisitDate, c.RN,
        DiffToFirst = ABS(DATEDIFF(MINUTE, c.VisitDate, fc.VisitDate))
    FROM Computers c
    INNER JOIN FirstComputers fc ON c.ComputerName = fc.ComputerName
)
SELECT *
FROM SelectedComputers
WHERE RN = 1 OR DiffToFirst >= 60
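If the comparison should be against the immediately preceding visit rather than the first one, a self-join on the row number is one option that should also work on SQL Server 2005 and later. This is only a sketch: dbo.YourComputerTable, ComputerName and VisitDate are the same placeholder names as in the query above, and the aliases c and p are made up for illustration.

```sql
-- Sketch: compare every visit with the visit just before it.
-- Table and column names are the placeholders used above, not a real schema.
;WITH Computers AS
(
    SELECT
        ComputerName, VisitDate,
        RN = ROW_NUMBER() OVER (PARTITION BY ComputerName ORDER BY VisitDate)
    FROM dbo.YourComputerTable
)
SELECT c.ComputerName, c.VisitDate
FROM Computers c
LEFT JOIN Computers p
    ON  p.ComputerName = c.ComputerName
    AND p.RN = c.RN - 1                            -- p is the visit right before c
WHERE p.RN IS NULL                                 -- c is the computer's first visit
   OR c.VisitDate > DATEADD(HOUR, 1, p.VisitDate)  -- gap of more than one hour
ORDER BY c.VisitDate;
```

The LEFT JOIN keeps the first visit of each computer (which has no predecessor), and the DATEADD comparison keeps any visit that starts a new block after a gap of more than one hour.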
If you've upgraded to SQL Server 2012, you can use LAG for this.
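-- @Visit is a table variable holding the sample rows (declared earlier
-- in the same batch); substitute your own table name.
with Lagged as (
    select
        Computer,
        VisitDate,
        LAG(VisitDate,1) over (
            partition by Computer
            order by VisitDate
        ) as LastVisit
    from @Visit
)
select
    Computer,
    VisitDate
from Lagged
where LastVisit is null
   or VisitDate > dateadd(hour,1,LastVisit);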
SQL Fiddle here.
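The LAG query returns only the first row of each block of visits. If you also want to aggregate per block (which is closer to an actual GROUP BY), one possible extension is to turn the "new block" test into a 0/1 flag, take a running sum of that flag to number the blocks, and group on that number. The sketch below assumes SQL Server 2012 or later and declares @Visit itself with the rows from the question; the names IsStart, SessionNo, FirstVisit and Visits are made up for illustration.

```sql
-- Sample rows copied from the question; @Visit mirrors the table
-- variable name used in the LAG query above.
DECLARE @Visit TABLE (Computer varchar(20), VisitDate datetime);

INSERT INTO @Visit (Computer, VisitDate) VALUES
    ('ComputerA', '2012-04-28T09:00:00'),
    ('ComputerA', '2012-04-28T09:05:00'),
    ('ComputerA', '2012-04-28T09:10:00'),
    ('ComputerB', '2012-04-28T09:30:00'),
    ('ComputerB', '2012-04-28T09:32:00'),
    ('ComputerB', '2012-04-28T09:44:00'),
    ('ComputerB', '2012-04-28T09:56:00'),
    ('ComputerB', '2012-04-28T10:25:00'),
    ('ComputerA', '2012-04-28T12:25:00'),
    ('ComputerC', '2012-04-28T12:30:00'),
    ('ComputerC', '2012-04-28T12:35:00'),
    ('ComputerC', '2012-04-28T12:45:00'),
    ('ComputerC', '2012-04-28T12:55:00');

WITH Lagged AS (
    -- Previous visit of the same computer (NULL for its first visit)
    SELECT Computer, VisitDate,
           LAG(VisitDate) OVER (PARTITION BY Computer ORDER BY VisitDate) AS LastVisit
    FROM @Visit
),
Flagged AS (
    -- 1 when this visit starts a new block, 0 otherwise
    SELECT Computer, VisitDate,
           CASE WHEN LastVisit IS NULL
                  OR VisitDate > DATEADD(HOUR, 1, LastVisit)
                THEN 1 ELSE 0 END AS IsStart
    FROM Lagged
),
Numbered AS (
    -- Running sum of the flags gives each block its own number per computer
    SELECT Computer, VisitDate,
           SUM(IsStart) OVER (PARTITION BY Computer
                              ORDER BY VisitDate
                              ROWS UNBOUNDED PRECEDING) AS SessionNo
    FROM Flagged
)
SELECT Computer,
       MIN(VisitDate) AS FirstVisit,
       MAX(VisitDate) AS LastVisit,
       COUNT(*)       AS Visits
FROM Numbered
GROUP BY Computer, SessionNo
ORDER BY FirstVisit;
```

With the sample data this should give four rows, one per block: ComputerA starting 09:00 (3 visits), ComputerB starting 09:30 (5 visits), ComputerA starting 12:25 (1 visit) and ComputerC starting 12:30 (4 visits).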