
How to specify GROUP BY for PIVOT aggregate

Or maybe there's an alternative. The problem is really simple:

I have this information (goes on for 300k more rows):

MachID  DownCode                StartOrEnd  StartTimeEndTime
------------------------------------------------------------
PR01    ColorChg                1337713300  StartTime
PR01    ColorChg                1337713303  EndTime
PR01    ColorChg                1363254067  StartTime
PR01    ColorChg                1363254075  EndTime
PR01    ColorChg                1363259848  StartTime
PR01    ColorChg                1363260292  EndTime
...

It is a CTE called 'cte_dl2'.

I'm pivoting the data this way:

SELECT * FROM
(
    SELECT *
    FROM cte_dl2
) Temp
PIVOT
(
    MAX(StartOrEnd)
    FOR StartTimeEndTime IN ([StartTime], [EndTime])
) Pvt

Which gets me:

MachID  DownCode                StartTime   EndTime
------------------------------------------------------
PR01    ColorChg                1375207208  1375207316
PR01    COMP                    1412124847  1412131608
PR01    DIE SET                 1408502593  1408502595
PR01    DieStart                1397704258  1397704381
PR01    FeedLoad                1375099369  1375099506
...

You can see the problem here: it automatically does a GROUP BY on all columns not specified in the PIVOT FOR IN (), so I only get the most recent StartTime and EndTime for each MachID/DownCode rather than each individual record.
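To make the implicit grouping concrete, here is a rough hand-written equivalent of the PIVOT above. It is a sketch that assumes cte_dl2 has only the four columns shown, and it rebuilds the CTE from the six sample rows so it can be run on its own. It illustrates the semantics, not necessarily what the execution plan does.

```sql
-- Sketch only: cte_dl2 is rebuilt here from the six sample rows in the question.
WITH cte_dl2 AS
(
    SELECT *
    FROM (VALUES
        ('PR01', 'ColorChg', 1337713300, 'StartTime'),
        ('PR01', 'ColorChg', 1337713303, 'EndTime'),
        ('PR01', 'ColorChg', 1363254067, 'StartTime'),
        ('PR01', 'ColorChg', 1363254075, 'EndTime'),
        ('PR01', 'ColorChg', 1363259848, 'StartTime'),
        ('PR01', 'ColorChg', 1363260292, 'EndTime')
    ) AS v (MachID, DownCode, StartOrEnd, StartTimeEndTime)
)
SELECT MachID,
       DownCode,
       -- One conditional aggregate per value listed in FOR ... IN (...)
       MAX(CASE WHEN StartTimeEndTime = 'StartTime' THEN StartOrEnd END) AS StartTime,
       MAX(CASE WHEN StartTimeEndTime = 'EndTime'   THEN StartOrEnd END) AS EndTime
FROM cte_dl2
-- Every column that is neither aggregated nor pivoted becomes a grouping column.
GROUP BY MachID, DownCode;
```

On these six rows the query returns a single row (PR01, ColorChg, 1363259848, 1363260292), which is the same collapse seen in the PIVOT output.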

If you can't see the problem, here's what I'm trying to get:

MachID  DownCode                StartTime   EndTime
------------------------------------------------------------
PR01    ColorChg                1337713300  1337713303
PR01    ColorChg                1363254067  1363254075
PR01    ColorChg                1363259848  1363260292
...

Please help! I already have ways around this, but they aren't as fast as PIVOT:

  • My way (not shown) = 6s
  • PIVOT = 3s

So I'd like to continue using PIVOT or an equivalent.

The ordering of the rows is specified in the comments below.


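To clarify for anyone with a similar question, the answer is to add a column that has a different value for each StartTime/EndTime pair, because PIVOT always does an implicit GROUP BY on the non-pivoted columns.

As long as that column distinguishes every pair, the implicit GROUP BY no longer collapses them into one row per MachID/DownCode. The value has to be shared by the two rows of a pair, though: if it were unique for every single row, each output row would have a NULL in either StartTime or EndTime.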

Note: I haven't actually looked at the execution plan to see if PIVOT works exactly this way, but in the abstract it appears to, and that is why the chosen answer works.
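One detail worth checking is what "unique" has to mean for the extra column. The sketch below computes two candidate columns on four of the sample rows; the names RowId and PairId are made up for this illustration. PairId restarts for each StartTimeEndTime value, so the n-th start and the n-th end get the same number and land in the same group.

```sql
-- Sketch only: RowId and PairId are hypothetical names for this illustration.
WITH cte_dl2 AS
(
    SELECT *
    FROM (VALUES
        ('PR01', 'ColorChg', 1337713300, 'StartTime'),
        ('PR01', 'ColorChg', 1337713303, 'EndTime'),
        ('PR01', 'ColorChg', 1363254067, 'StartTime'),
        ('PR01', 'ColorChg', 1363254075, 'EndTime')
    ) AS v (MachID, DownCode, StartOrEnd, StartTimeEndTime)
),
numbered AS
(
    SELECT MachID, DownCode, StartOrEnd, StartTimeEndTime,
           -- Different for every row: no two rows ever share a group.
           ROW_NUMBER() OVER (ORDER BY StartOrEnd) AS RowId,
           -- Numbered separately for starts and ends: start n and end n match.
           ROW_NUMBER() OVER (PARTITION BY MachID, DownCode, StartTimeEndTime
                              ORDER BY StartOrEnd) AS PairId
    FROM numbered_source
),
numbered_source AS
(
    SELECT MachID, DownCode, StartOrEnd, StartTimeEndTime FROM cte_dl2
)
SELECT MachID, DownCode, StartTime, EndTime
FROM
(
    -- Only PairId is passed on, so it is the extra implicit grouping column.
    SELECT MachID, DownCode, StartOrEnd, StartTimeEndTime, PairId
    FROM numbered
) src
PIVOT
(
    MAX(StartOrEnd)
    FOR StartTimeEndTime IN ([StartTime], [EndTime])
) pvt;
```

With PairId this returns two rows, each holding a start and its end. Passing RowId to the PIVOT instead of PairId would return four rows, each with a NULL in either StartTime or EndTime, because no start shares a group with an end.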

KthProg asked Feb 11 '23 19:02


1 Answer

It's not exactly clear how you are associating each StartTime with its EndTime, but you should be able to use row_number() to return more than one row per combination of MachID and DownCode. It adds a column that is unique enough for the final select to return multiple rows:

select machid,
  downcode,
  StartTime, 
  EndTime
from
(
  select machid,
    downcode,
    startorend,
    starttimeendtime,
    rn = row_number() over(partition by machid, downcode, StartTimeEndTime
                            order by startorend) 
  from cte_dl2
) d
pivot
(
  max(startorend)
  for starttimeendtime in (StartTime, EndTime)
) piv;

See SQL Fiddle with Demo. Note that this assumes you want to pair StartTime and EndTime rows by their StartOrEnd values. Data in tables is not inherently ordered, so it would be significantly easier to get the correct pairing if you had a column that places the rows in a specific order.
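Because the pairing rests on that assumption, it may be worth checking the result for pairs that look wrong. The following is a minimal sketch of such a check, not part of the original answer. It uses the sample rows with the last EndTime row left out on purpose, so that one start has no matching end.

```sql
-- Sketch only: sample rows from the question, with the final EndTime row
-- deliberately omitted to simulate an unmatched start.
WITH cte_dl2 AS
(
    SELECT *
    FROM (VALUES
        ('PR01', 'ColorChg', 1337713300, 'StartTime'),
        ('PR01', 'ColorChg', 1337713303, 'EndTime'),
        ('PR01', 'ColorChg', 1363254067, 'StartTime'),
        ('PR01', 'ColorChg', 1363254075, 'EndTime'),
        ('PR01', 'ColorChg', 1363259848, 'StartTime')
    ) AS v (MachID, DownCode, StartOrEnd, StartTimeEndTime)
),
paired AS
(
    -- Same row_number() + PIVOT pairing as the query above.
    SELECT MachID, DownCode, StartTime, EndTime
    FROM
    (
        SELECT MachID, DownCode, StartOrEnd, StartTimeEndTime,
               ROW_NUMBER() OVER (PARTITION BY MachID, DownCode, StartTimeEndTime
                                  ORDER BY StartOrEnd) AS rn
        FROM cte_dl2
    ) d
    PIVOT
    (
        MAX(StartOrEnd)
        FOR StartTimeEndTime IN ([StartTime], [EndTime])
    ) piv
)
-- Flag pairs with a missing side or an end that precedes its start.
SELECT MachID, DownCode, StartTime, EndTime
FROM paired
WHERE StartTime IS NULL
   OR EndTime IS NULL
   OR EndTime < StartTime;
```

On this data the check returns the one unmatched start (1363259848 with a NULL EndTime). Note that a missing row in the middle of the sequence would shift every later pair, so an empty result does not prove the pairing is right; it only rules out the obvious failures.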

If you don't want to use PIVOT, this can be accomplished with an aggregate function and a CASE expression:

select 
  machid,
  downcode,
  StartTime = max(case when starttimeendtime = 'StartTime' then startorend else null end),
  EndTime = max(case when starttimeendtime = 'EndTime' then startorend else null end) 
from
(
  select machid,
    downcode,
    startorend,
    starttimeendtime,
    rn = row_number() over(partition by machid, downcode, StartTimeEndTime
                            order by startorend) 
  from cte_dl2
) d
group by machid, downcode, rn;

See SQL Fiddle with Demo. You'll get the same result with either version:

| MACHID | DOWNCODE |  STARTTIME |    ENDTIME |
|--------|----------|------------|------------|
|   PR01 | ColorChg | 1337713300 | 1337713303 |
|   PR01 | ColorChg | 1363254067 | 1363254075 |
|   PR01 | ColorChg | 1363259848 | 1363260292 |

Taryn answered Feb 14 '23 08:02