
How to consolidate blocks of time?

I have a derived table with a list of relative seconds to a foreign key (ID):

CREATE TABLE Times (
    ID INT
  , TimeFrom INT
  , TimeTo INT
);

The table contains mostly non-overlapping data, but there are occasions where one record's TimeFrom falls before another record's TimeTo, so the two intervals overlap:

+----+----------+--------+
| ID | TimeFrom | TimeTo |
+----+----------+--------+
| 10 | 10       | 30     |
| 10 | 50       | 70     |
| 10 | 60       | 150    |
| 10 | 75       | 150    |
| .. | ...      | ...    |
+----+----------+--------+

The result set is meant to be a flattened linear idle report, but with too many of these overlaps I end up with negative time in use. For example, if the window above for ID = 10 was 150 seconds long, and I summed the differences of relative seconds to subtract from the window size, I'd wind up with 150-(20+20+90+75)=-55. I tried this approach, and it is what led me to realize there were overlaps that needed to be flattened.
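The arithmetic above can be checked with a short script. This is only a sketch in Python rather than T-SQL; the names `rows`, `window`, `naive_busy` and `true_busy` are illustrative, and it assumes each row covers the half-open range [TimeFrom, TimeTo).

```python
# Sample rows for ID = 10 from the question: (TimeFrom, TimeTo).
rows = [(10, 30), (50, 70), (60, 150), (75, 150)]
window = 150  # window length in seconds, as in the question

# Naive approach: sum every interval length, so overlapping seconds
# are counted more than once.
naive_busy = sum(t_to - t_from for t_from, t_to in rows)

# Overlap-safe check: count each covered second exactly once.
covered = set()
for t_from, t_to in rows:
    covered.update(range(t_from, t_to))
true_busy = len(covered)

print(window - naive_busy)  # -55
print(window - true_busy)   # 30
```

Counting individual seconds is fine as a correctness check on small data, but it scales with the window size, which is the performance concern raised in the question.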

So, what I'm looking for is a solution to flatten the overlaps into one set of times:

+----+----------+--------+
| ID | TimeFrom | TimeTo |
+----+----------+--------+
| 10 | 10       | 30     |
| 10 | 50       | 150    |
| .. | ...      | ...    |
+----+----------+--------+
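For reference, the flattening requested here is the classic "merge overlapping intervals" problem. A minimal sketch in Python (not T-SQL) follows, which can be handy for validating the output of any SQL solution. The function name `flatten` is hypothetical, and the sketch assumes that intervals which merely touch (TimeFrom equal to the previous TimeTo) should be merged as well.

```python
from itertools import groupby
from operator import itemgetter


def flatten(rows):
    """Merge overlapping (ID, TimeFrom, TimeTo) rows within each ID."""
    merged = []
    # Sorting tuples orders them by ID, then TimeFrom, then TimeTo.
    for key, group in groupby(sorted(rows), key=itemgetter(0)):
        cur_from = cur_to = None
        for _, t_from, t_to in group:
            if cur_from is None:
                # First row for this ID opens the first block.
                cur_from, cur_to = t_from, t_to
            elif t_from > cur_to:
                # Gap found: close the current block and open a new one.
                merged.append((key, cur_from, cur_to))
                cur_from, cur_to = t_from, t_to
            else:
                # Overlap: extend the block to the furthest end seen.
                cur_to = max(cur_to, t_to)
        merged.append((key, cur_from, cur_to))
    return merged


sample = [(10, 10, 30), (10, 50, 70), (10, 60, 150), (10, 75, 150)]
print(flatten(sample))  # [(10, 10, 30), (10, 50, 150)]
```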

Considerations: Performance is very important here, as this is part of a larger query that performs well on its own, and I'd rather not impact its performance much if I can help it.

In response to a comment about checking "which seconds have an interval": I have tried that for the end result, and I am looking for something with better performance. Adapted to my example:

SELECT SUM(C.N)
FROM (
    SELECT A.N, ROW_NUMBER()OVER(ORDER BY A.N) RowID
    FROM 
          (SELECT TOP 60 1 N FROM master..spt_values) A
        , (SELECT TOP 720 1 N FROM master..spt_values) B
    ) C
WHERE EXISTS (
        SELECT 1
        FROM Times SE
        WHERE SE.ID = 10
            AND SE.TimeFrom <= C.RowID
            AND SE.TimeTo >= C.RowID
            AND EXISTS (
                SELECT 1
                FROM Times2 D
                WHERE ID = SE.ID
                    AND D.TimeFrom <= C.RowID
                    AND D.TimeTo >= C.RowID
            )
        GROUP BY SE.ID
    )

The problem I have with this solution is that I get a Row Count Spool out of the EXISTS query in the query plan, with a number of executions equal to COUNT(C.*). I left the real numbers in that query to illustrate why it is best to avoid this approach: even though the Row Count Spool reduces the cost of the query by quite a bit, its execution count increases the cost of the query as a whole by quite a bit as well.

Further Edit: The end goal is to put this in a procedure, so table variables and temp tables are also options.

Jaaz Cole asked Jul 24 '14 21:07


2 Answers

OK. I'm still trying to do this with just one SELECT, but this works:

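-- Working copy; GroupId numbers each run of overlapping rows within an ID.
DECLARE @tmp TABLE (ID INT, GroupId INT, TimeFrom INT, TimeTo INT)

INSERT INTO @tmp
    SELECT ID, 0, TimeFrom, TimeTo 
    FROM Times
    ORDER BY ID, TimeFrom

DECLARE @timeTo INT, @id INT, @groupId INT

SET @groupId = 0

-- Running-variable update: this assumes rows are visited in
-- (ID, TimeFrom) order, which SQL Server does not guarantee.
UPDATE @tmp
SET 
    -- New ID: restart at 0. Gap after the running end: start the next group.
    @groupId = CASE WHEN ID != @id THEN 0 
                    WHEN TimeFrom > @timeTo THEN @groupId + 1 
                    ELSE @groupId END,
    GroupId = @groupId,
    -- Keep the furthest end seen so far for this ID, so a short row nested
    -- inside a longer one cannot pull the running end backwards.
    @timeTo = CASE WHEN ID = @id AND @timeTo > TimeTo THEN @timeTo 
                   ELSE TimeTo END,
    @id = ID    

-- Collapse each group to a single flattened row.
SELECT ID, MIN(TimeFrom) AS TimeFrom, MAX(TimeTo) AS TimeTo
FROM @tmp 
GROUP BY ID, GroupId
ORDER BY ID, MIN(TimeFrom)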
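One possible single-statement variant uses window functions instead of running variables. The sketch below is not the answerer's code: it runs the query through Python's built-in `sqlite3` module purely so it can be tried without a server, and it assumes an engine with window frame support (SQLite 3.25 or later here; on the SQL Server side, `ROWS BETWEEN` frames need version 2012 or later). The CTE names `Marked` and `Grouped` and the column `PrevMaxTo` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Times (ID INT, TimeFrom INT, TimeTo INT)")
conn.executemany(
    "INSERT INTO Times VALUES (?, ?, ?)",
    [(10, 10, 30), (10, 50, 70), (10, 60, 150), (10, 75, 150)],
)

FLATTEN_SQL = """
WITH Marked AS (
    -- Furthest TimeTo among all earlier rows of the same ID
    -- (NULL for the first row of each ID).
    SELECT ID, TimeFrom, TimeTo,
           MAX(TimeTo) OVER (
               PARTITION BY ID
               ORDER BY TimeFrom, TimeTo
               ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
           ) AS PrevMaxTo
    FROM Times
),
Grouped AS (
    -- A row starts a new group when it begins after everything before it.
    SELECT ID, TimeFrom, TimeTo,
           SUM(CASE WHEN PrevMaxTo IS NULL OR TimeFrom > PrevMaxTo
                    THEN 1 ELSE 0 END) OVER (
               PARTITION BY ID
               ORDER BY TimeFrom, TimeTo
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS GroupId
    FROM Marked
)
SELECT ID, MIN(TimeFrom) AS TimeFrom, MAX(TimeTo) AS TimeTo
FROM Grouped
GROUP BY ID, GroupId
ORDER BY ID, MIN(TimeFrom)
"""

flattened = conn.execute(FLATTEN_SQL).fetchall()
print(flattened)  # [(10, 10, 30), (10, 50, 150)]
```

Because the ordering is stated inside each `OVER` clause, this form does not depend on the order in which an UPDATE happens to visit rows. Whether it performs better on the real data would need to be measured.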
Alireza answered Sep 17 '22 13:09


Left join each row (LHS) to its overlapping successor row (RHS) with the same ID value, where one exists.

Then, for each row in the result set of LHS LEFT JOIN RHS, the contribution to the elapsed time for the ID is:

isnull(RHS.TimeFrom,LHS.TimeTo) - LHS.TimeFrom as TimeElapsed

Summing these by ID should give you the correct answer.

Note that:
- where there isn't an overlapping successor row the calculation is simply
LHS.TimeTo - LHS.TimeFrom
- where there is an overlapping successor row the calculation will net to
(RHS.TimeFrom - LHS.TimeFrom) + (RHS.TimeTo - RHS.TimeFrom)
which simplifies to
RHS.TimeTo - LHS.TimeFrom
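The calculation described above can be sketched as follows. This is Python rather than a T-SQL join, and it rests on one reading of the answer: "successor" is taken to mean the next row of the same ID when ordered by TimeFrom, and it "overlaps" when its TimeFrom is not past the current row's TimeTo. The function name `elapsed_by_id` is hypothetical.

```python
def elapsed_by_id(rows):
    """Sum isnull(RHS.TimeFrom, LHS.TimeTo) - LHS.TimeFrom per ID."""
    totals = {}
    ordered = sorted(rows)  # by ID, then TimeFrom, then TimeTo
    for i, (row_id, t_from, t_to) in enumerate(ordered):
        nxt = ordered[i + 1] if i + 1 < len(ordered) else None
        # RHS exists only if the next row shares the ID and overlaps.
        overlaps = nxt is not None and nxt[0] == row_id and nxt[1] <= t_to
        end = nxt[1] if overlaps else t_to
        totals[row_id] = totals.get(row_id, 0) + (end - t_from)
    return totals


sample = [(10, 10, 30), (10, 50, 70), (10, 60, 150), (10, 75, 150)]
print(elapsed_by_id(sample))  # {10: 120}
```

On the sample data this matches the flattened total (20 + 100 = 120). Under the same reading, a row nested entirely inside an earlier one, such as (50, 150), (60, 70), (75, 150), yields 95 instead of 100, so the approach appears to rely on each interval ending no earlier than its predecessor.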

Pieter Geerkens answered Sep 21 '22 13:09