Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Merge adjacent rows in SQL?

I'm doing some reporting based on the blocks of time employees work. In some cases, the data contains two separate records for what really is a single block of time.

Here's a basic version of the table and some sample records:

EmployeeID
StartTime
EndTime

Data:

EmpID      Start         End
----------------------------
#1001   10:00 AM    12:00 PM
#1001    4:00 PM     5:30 PM
#1001    5:30 PM     8:00 PM

In the example, the last two records are contiguous in time. I'd like to write a query that combines any adjacent records so the result set is this:

EmpID      Start         End
----------------------------
#1001   10:00 AM    12:00 PM
#1001    4:00 PM     8:00 PM

Ideally, it should also be able to handle more than 2 adjacent records, but that is not required.

like image 428
poke Avatar asked Oct 05 '22 20:10

poke


2 Answers

This article provides quite a few possible solutions to your question

http://www.sqlmag.com/blog/puzzled-by-t-sql-blog-15/tsql/solutions-to-packing-date-and-time-intervals-puzzle-136851

This one seems like the most straight forward:

WITH StartTimes AS
(
  SELECT DISTINCT username, starttime
  FROM dbo.Sessions AS S1
  WHERE NOT EXISTS
    (SELECT * FROM dbo.Sessions AS S2
     WHERE S2.username = S1.username
       AND S2.starttime < S1.starttime
       AND S2.endtime >= S1.starttime)
),
EndTimes AS
(
  SELECT DISTINCT username, endtime
  FROM dbo.Sessions AS S1
  WHERE NOT EXISTS
    (SELECT * FROM dbo.Sessions AS S2
     WHERE S2.username = S1.username
       AND S2.endtime > S1.endtime
       AND S2.starttime <= S1.endtime)
)
SELECT username, starttime,
  (SELECT MIN(endtime) FROM EndTimes AS E
   WHERE E.username = S.username
     AND endtime >= starttime) AS endtime
FROM StartTimes AS S;
like image 141
EkoostikMartin Avatar answered Oct 13 '22 12:10

EkoostikMartin


If this is strictly about adjacent rows (not overlapping ones), you could try the following method:

  1. Unpivot the timestamps.

  2. Leave only those that have no duplicates.

  3. Pivot the remaining ones back, coupling every Start with the directly following End.

Or, in Transact-SQL, something like this:

WITH unpivoted AS (
  SELECT
    EmpID,
    event,
    dtime,
    count = COUNT(*) OVER (PARTITION BY EmpID, dtime)
  FROM atable
  UNPIVOT (
    dtime FOR event IN (StartTime, EndTime)
  ) u
)
, filtered AS (
  SELECT
    EmpID,
    event,
    dtime,
    rowno = ROW_NUMBER() OVER (PARTITION BY EmpID, event ORDER BY dtime)
  FROM unpivoted
  WHERE count = 1
)
, pivoted AS (
  SELECT
    EmpID,
    StartTime,
    EndTime
  FROM filtered
  PIVOT (
    MAX(dtime) FOR event IN (StartTime, EndTime)
  ) p
)
SELECT *
FROM pivoted
;

There's a demo for this query at SQL Fiddle.

like image 21
Andriy M Avatar answered Oct 13 '22 10:10

Andriy M