Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Get list with start and end values from table of datetimes

Currently i have a table built up like this way

DeviceID      Timestamp            Value
----------------------------------------
Device1       1.1.2011 10:00:00    3
Device1       1.1.2011 10:00:01    4
Device1       1.1.2011 10:00:02    4
Device1       1.1.2011 10:00:04    3
Device1       1.1.2011 10:00:05    4
Device1       1.1.2011 14:23:14    8
Device1       1.1.2011 14:23:15    7
Device1       1.1.2011 14:23:17    4
Device1       1.1.2011 14:23:18    2

As you can see there are coming in some values from a device with a given timestamp (the column type is datetime).

The problem is that the device can be started and stopped at any point and there is no direct information within the data that a start or stop had occured. But from the list of given timestamps it is quite easy to tell when a start and stop had occured, due to the fact that whenever the timestamps of two rows is within five seconds, they belong to the same measurement.

Now i'd like to get out of this data a list like this:

DeviceID      Started              Ended
Device1       1.1.2011 10:00:00    1.1.2011 10:00:05
Device1       1.1.2011 14:23:14    1.1.2011 14:23:18

So any ideas how to do this in a fast way? All i can think about is using some kind of cursor and compare each datetime pair by hand. But i think this will get really slow cause we have to inspect each value in each row.

So is there any better SQL solution which won't work with cursors?

Update

Currently i tested all the given answers. And by reading they all look good and had some interesting approaches. Unfortunately all of them (so far) failed on the real data. The biggest problem seems to be the mass of the data (currently their are round about 3.5 millions entries in the table). Performing the given query only on a small subset leads to the expected results, but rolling the query onto the whole table just leads to a very bad performance.

I have to further test and examine if i can chunkify the data and only pass a part of the data to one of these given algorithms to get this thing rolling. But maybe one of you has another smart idea to get the results a little bit faster.

Update (More informations about structure)

Okay, these informations might help, too: Currently there are round about 3.5 million entries in the table. And here are the given column types and indizes:

  • _ID
    • int
    • Primary Key
    • Grouped Index
    • didn't mentioned this column in my example, cause it isn't needed for this query
  • DeviceID
    • int
    • not null
    • Index
  • Timestamp
    • datetime
    • not null
    • Index
  • Value
    • several not indexed columns of different types (int, real, tinyint)
    • all can be null

Maybe this helps to improve your already (or new) solutions to the given problem.

like image 228
Oliver Avatar asked May 16 '11 13:05

Oliver


1 Answers

-- Table var to store the gaps
declare @T table
(
  DeviceID varchar(10),
  PrevPeriodEnd datetime,
  NextPeriodStart datetime
)

-- Get the gaps
;with cte as 
(
  select *,
    row_number() over(partition by DeviceID order by Timestamp) as rn
  from data
)
insert into @T
select
  C1.DeviceID,
  C1.Timestamp as PrevPeriodEnd,
  C2.Timestamp as NextPeriodStart
from cte as C1
  inner join cte as C2
    on C1.rn = C2.rn-1 and
       C1.DeviceID = C2.DeviceID and
       datediff(s, C1.Timestamp, C2.Timestamp) > 5

-- Build islands from gaps in @T
;with cte1 as
(
  -- Add first and last timestamp to gaps
  select DeviceID, PrevPeriodEnd, NextPeriodStart
  from @T
  union all
  select DeviceID, max(TimeStamp) as PrevPeriodEnd, null as NextPeriodStart
  from data
  group by DeviceID
  union all
  select DeviceID, null as PrevPeriodEnd, min(TimeStamp) as PrevPeriodEnd
  from data
  group by DeviceID
),
cte2 as
(
  select *,
    row_number() over(partition by DeviceID order by PrevPeriodEnd) as rn
  from cte1
)
select
  C1.DeviceID,
  C1.NextPeriodStart as PeriodStart,
  C2.PrevPeriodEnd as PeriodEnd
from cte2 as C1
  inner join cte2 as C2
    on C1.DeviceID = C2.DeviceID and
       C1.rn = C2.rn-1
order by C1.DeviceID, C1.NextPeriodStart       
like image 179
Mikael Eriksson Avatar answered Nov 01 '22 05:11

Mikael Eriksson