Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Sql Server 2008 - PIVOT without Aggregation Function

I know you've got multiple topics touching on this. But, I havent found one that addressed my needs. I need to (on demand) pivot select deep table data to a wide output table. The gotcha in this is that I cannot use an aggregate with Pivot because it eats responses that are needed in the output. I have worked up to a solution, but I don't think it's the best because it will require umpteen left joins to work. I've included all attempts and notes as follows:

-- Sql Server 2008 db.
-- Deep table structure (not subject to modification) contains name/value pairs with a userId as
-- foreign key.  In many cases there can be MORE THAN ONE itemValue given by the user for the
-- itemName such as if asked their race, can answer White + Hispanic, etc.  Each response is stored
-- as a seperate record - this cannot currently be changed.

-- Goal: pivot deep data to wide while also compressing result 
-- set down. Account for all items per userId, and duplicating
-- column values (rather than show nulls) as applicable

-- Sample table to store some data of both single and multiple responses
DECLARE @testTable AS TABLE(userId int, itemName varchar(50), itemValue varchar(255))

INSERT INTO @testTable
SELECT 1, 'q01', '1-q01 Answer'
UNION SELECT 1, 'q02', '1-q02 Answer'
UNION SELECT 1, 'q03', '1-q03 Answer 1'
UNION SELECT 1, 'q03', '1-q03 Answer 2'
UNION SELECT 1, 'q03', '1-q03 Answer 3'
UNION SELECT 1, 'q04', '1-q04 Answer'
UNION SELECT 1, 'q05', '1-q05 Answer'
UNION SELECT 2, 'q01', '2-q01 Answer'
UNION SELECT 2, 'q02', '2-q02 Answer'
UNION SELECT 2, 'q03', '2-q03 Answer 1'
UNION SELECT 2, 'q03', '2-q03 Answer 2'
UNION SELECT 2, 'q04', '2-q04 Answer'
UNION SELECT 2, 'q05', '2-q05 Answer'

SELECT 'Raw Data'
SELECT * FROM @TestTable

SELECT 'Using Pivot - shows aggregate result of itemValue per itemName - eats others'
; WITH Data AS (
    SELECT
        [userId]
        , [itemName]
        , [itemValue]
    FROM 
        @testTable
)
SELECT
    [userId]
    , [q02]
    , [q03]
    , [q05]
FROM
    Data
PIVOT
(
    MIN(itemValue)  -- Aggregate function eats needed values.
    FOR itemName in ([q02], [q03], [q05])
) AS PivotTable


SELECT 'Aggregate with Grouping - Causes Null Values'
SELECT
    DISTINCT userId 
    ,[q02] = Max(CASE WHEN itemName = 'q02' THEN itemValue END)
    ,[q03] = Max(CASE WHEN itemName = 'q03' THEN itemValue END)
    ,[q05] = Max(CASE WHEN itemName = 'q05' THEN itemValue END)
FROM
    @testTable
WHERE
    itemName in ('q02', 'q03', 'q05')   -- Makes it a hair quicker
GROUP BY
    userId  -- If by userId only, it only gives 1 row PERIOD = BAD!!
    , [itemName]
    , [itemValue]


SELECT 'Multiple Left Joins - works properly but bad if pivoting 175 columns or so'
; WITH Data AS (
    SELECT
        userId 
        ,[itemName]
        ,[itemValue]
    FROM
        @testTable
    WHERE
        itemName in ('q02', 'q03', 'q05')   -- Makes it a hair quicker
)
SELECT
    DISTINCT s1.userId
    ,[q02] = s2.[itemValue]
    ,[q03] = s3.[itemValue]
    ,[q05] = s5.[itemValue]
FROM
    Data s1
    LEFT JOIN Data s2 
        ON s2.userId = s1.userId 
            AND s2.[itemName] = 'q02'
    LEFT JOIN Data s3 
        ON s3.userId = s1.userId 
            AND s3.[itemName] = 'q03'
    LEFT JOIN Data s5 
        ON s5.userId = s1.userId 
            AND s5.[itemName] = 'q05'

So the bottom query is the only one (so far) that does what I need it to do, but the LEFT JOIN's WILL get out of hand and cause performance issues when I use actual item names to pivot. Any recommendations are appreciated.

like image 962
Danny Grogan Avatar asked Oct 21 '11 19:10

Danny Grogan


People also ask

When creating a pivot table what would happen if we don't specify any aggregation function?

2. Aggregate on specific features with values. The value parameter is where we tell the function which features to aggregate on. It is an optional field and if we don't specify this value, then the function will aggregate on all the numerical features of the dataset.

Do not aggregate values pivot table?

Inside the Pivot Column dialog, select the column with the values that will populate the new columns to be created. In this case "Time" but could be any field type, including text. In the Advanced Options part, select "Don´t Aggregate" so the values will displayed without any modification.

Can GROUP BY be used without aggregate functions?

GROUP BY without Aggregate Functions Although most of the times GROUP BY is used along with aggregate functions, it can still still used without aggregate functions — to find unique records.

Which is not a aggregation function in SQL?

Which of the following is not a built in aggregate function in SQL? Explanation: SQL does not include total as a built in aggregate function. The avg is used to find average, max is used to find the maximum and the count is used to count the number of values. 2.


2 Answers

I think you'll have to stick with joins, because joins are exactly the way of producing results like the one you are after. The purpose of a join is to combine row sets together (on a condition or without any), and your target output is nothing else than a combination of subsets of rows.

However, if the majority of questions always have single responses, you could substantially reduce the number of necessary joins. The idea is to join only multiple-response groups as separate row sets. As for the single-response items, they are joined only as part of the entire dataset of target items.

An example should better illustrate what I might poorly describe verbally. Assuming there are two potentially multiple-response groups in the source data, 'q03' and 'q06' (actually, here's the source table:

DECLARE @testTable AS TABLE(
  userId int,
  itemName varchar(50),
  itemValue varchar(255)
);

INSERT INTO @testTable
SELECT 1, 'q01', '1-q01 Answer'
UNION SELECT 1, 'q02', '1-q02 Answer'
UNION SELECT 1, 'q03', '1-q03 Answer 1'
UNION SELECT 1, 'q03', '1-q03 Answer 2'
UNION SELECT 1, 'q03', '1-q03 Answer 3'
UNION SELECT 1, 'q04', '1-q04 Answer'
UNION SELECT 1, 'q05', '1-q05 Answer'
UNION SELECT 1, 'q06', '1-q06 Answer 1'
UNION SELECT 1, 'q06', '1-q06 Answer 2'
UNION SELECT 1, 'q06', '1-q06 Answer 3'
UNION SELECT 2, 'q01', '2-q01 Answer'
UNION SELECT 2, 'q02', '2-q02 Answer'
UNION SELECT 2, 'q03', '2-q03 Answer 1'
UNION SELECT 2, 'q03', '2-q03 Answer 2'
UNION SELECT 2, 'q04', '2-q04 Answer'
UNION SELECT 2, 'q05', '2-q05 Answer'
UNION SELECT 2, 'q06', '2-q06 Answer 1'
UNION SELECT 2, 'q06', '2-q06 Answer 2'
;

which is same as the table in the original post, but with added 'q06' items), the resulting script could be like this:

WITH ranked AS (
  SELECT
    *,
    rn = ROW_NUMBER() OVER (PARTITION BY userId, itemName ORDER BY itemValue)
  FROM @testTable
),
multiplied AS (
  SELECT
    r.userId,
    r.itemName,
    r.itemValue,
    rn03 = r03.rn,
    rn06 = r06.rn
  FROM ranked r03
    INNER JOIN ranked r06 ON r03.userId = r06.userId AND r06.itemName = 'q06'
    INNER JOIN ranked r ON r03.userId = r.userId AND (
      r.itemName = 'q03' AND r.rn = r03.rn OR
      r.itemName = 'q06' AND r.rn = r06.rn OR
      r.itemName NOT IN ('q03', 'q06')
    )
  WHERE r03.itemName = 'q03'
    AND r.itemName IN ('q02', 'q03', 'q05', 'q06')
)
SELECT userId, rn03, rn06, q02, q03, q05, q06
FROM multiplied
PIVOT (
  MIN(itemValue)  
  FOR itemName in (q02, q03, q05, q06)
) AS PivotTable
like image 139
Andriy M Avatar answered Oct 21 '22 16:10

Andriy M


; WITH SRData AS (
    SELECT  -- Only query single response items in this block
        [userId]
        , [q01]
        , [q02]
        , [q04]
        , [q05]
    FROM
        @testTable
    PIVOT
    (
        MIN(itemValue) 
        FOR itemName in ([q01], [q02], [q04], [q05])
    ) AS PivotTable
)
SELECT
    sr.[userId]
    , sr.[q01]
    , sr.[q02]  
    , [q03] = mr03.[itemValue]
    , sr.[q04]      
    , sr.[q05]      
    , [q06] = mr06.[itemValue]
FROM
    SRData sr
    LEFT JOIN @testTable mr03 ON mr03.userId = sr.userId AND mr03.itemName = 'q03'  -- Muli Response for q03
    LEFT JOIN @testTable mr06 ON mr06.userId = sr.userId AND mr06.itemName = 'q06'  -- Muli Response for q06

like image 24
Danny Grogan Avatar answered Oct 21 '22 18:10

Danny Grogan