The background:
Think of an application that lets people make surveys with custom questions, In a particular case, interview families, An interviewer goes to
House 1
and interviews two membersMember 1
andMember 2
. He asks questions like. What is this house address?,What is your name and age?. The answers for that is common for the Members and the answers that are specific for them are stored in the same table
After doing some Joining on some tables and pivoting the result I end up getting the following table structure.
What was achieved so far
| ID | ADDRESS | MEMBER | AGE | SubformIteration |
|----|---------|----------|--------|-------------------|
| 1 | HOUSE 1 | (null) | (null) | (null) |
| 1 | (null) | MEMBER h | 18 | s0 |
| 1 | (null) | MEMBER i | 19 | s1 |
| 2 | HOUSE 2 | (null) | (null) | (null) |
| 2 | (null) | MEMBER x | 36 | s0 |
| 2 | (null) | MEMBER y | 35 | s1 |
| 3 | HOUSE 3 | (null) | (null) | (null) |
| 3 | (null) | MEMBER a | 18 | s0 |
| 3 | (null) | MEMBER b | 19 | s1 |
I am trying to find a way to get the table to be formatted as below:
Desired output
| ID | ADDRESS | MEMBER | AGE | SubformIteration |
|----|---------|----------|--------|-------------------|
| 1 | HOUSE 1 | MEMBER 1 | 18 | s0 |
| 1 | HOUSE 1 | MEMBER 2 | 19 | s1 |
| 2 | HOUSE 2 | MEMBER x | 36 | s0 |
| 2 | HOUSE 2 | MEMBER y | 35 | s1 |
| 3 | HOUSE 3 | MEMBER a | 18 | s0 |
| 3 | HOUSE 3 | MEMBER b | 19 | s1 |
I do not have enough sql vocabulary to describe and search the operation/procedure required to so As I am new to SQL and I would be really thankful if anybody could tell me an efficient way to achieve this.
Important
DO NOT RELY UPON THE QuestionText
column as it will be changes When somebody decided to change the questions
Edit
Source tables
Sql fiddle link with all the below tables
As per the suggestions in the answers, I am posting the source table and the queries in hope that there will be a better understanding of the problem
Questions
table
+------------+--------------+---------+----------+---------------+
| QuestionID | QuestionText | type | SurveyID | IsIncremental |
+------------+--------------+---------+----------+---------------+
| 3483 | subform | subform | 311 | 1 |
| 3484 | MEMBER | text | 311 | 0 |
| 3485 | AGE | number | 311 | 0 |
| 3486 | ADDRESS | address | 311 | 0 |
+------------+--------------+---------+----------+---------------+
Results
table
+----------+-------------------------+----------+
| ResultID | DateSubmitted | SurveyID |
+----------+-------------------------+----------+
| 2272 | 2017-04-12 05:11:41.477 | 311 |
| 2273 | 2017-04-12 05:12:22.227 | 311 |
| 2274 | 2017-04-12 05:13:02.227 | 311 |
+----------+-------------------------+----------+
Chunks
table, where all the answers are stored:
+---------+------------+----------+------------+------------------+
| ChunkID | Answer | ResultID | QuestionID | SubFormIteration |
+---------+------------+----------+------------+------------------+
| 9606 | HOUSE 1 | 2272 | 3486 | NULL |
| 9607 | MEMEBER 1 | 2272 | 3484 | NULL |
| 9608 | 12 | 2272 | 3485 | NULL |
| 9609 | MEMBER 2 | 2272 | 3484 | s1 |
| 9610 | 10 | 2272 | 3485 | s1 |
| 9611 | MEMEBER 1 | 2272 | 3484 | s0 |
| 9612 | 12 | 2272 | 3485 | s0 |
| 9613 | MEMBER 2 | 2272 | 3484 | s1 |
| 9614 | 10 | 2272 | 3485 | s1 |
| 9615 | HOUSE 2 | 2273 | 3486 | NULL |
| 9616 | MEMBER A | 2273 | 3484 | NULL |
| 9617 | 23 | 2273 | 3485 | NULL |
| 9618 | MEMBER B | 2273 | 3484 | s1 |
| 9619 | 25 | 2273 | 3485 | s1 |
| 9620 | MEMBER A | 2273 | 3484 | s0 |
| 9621 | 23 | 2273 | 3485 | s0 |
| 9622 | MEMBER B | 2273 | 3484 | s1 |
| 9623 | 25 | 2273 | 3485 | s1 |
| 9624 | HOUSE 3 | 2274 | 3486 | NULL |
| 9625 | MEMBER K | 2274 | 3484 | NULL |
| 9626 | 41 | 2274 | 3485 | NULL |
| 9627 | MEMBER J | 2274 | 3484 | s1 |
| 9628 | 26 | 2274 | 3485 | s1 |
| 9629 | MEMBER K | 2274 | 3484 | s0 |
| 9630 | 41 | 2274 | 3485 | s0 |
| 9631 | MEMBER J | 2274 | 3484 | s1 |
| 9632 | 26 | 2274 | 3485 | s1 |
+---------+------------+----------+------------+------------------+
I've written the following stored procedure which yields the first ever table given in this question:
ALTER PROCEDURE [dbo].[ResultForSurvey] @SurveyID int
AS
DECLARE @cols AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX),@colsAggregated as nvarchar(max);
IF OBJECT_ID('tempdb.dbo.#Temp', 'U') IS NOT NULL
DROP TABLE #Temp;
SELECT *
INTO #Temp
FROM (Select Answer=( case
When Questions.type='checkboxes' or Questions.IsIncremental=1 THEN STUFF((SELECT distinct ',' + c.Answer
FROM Chunks c Where c.ResultID=Results.ResultID and c.QuestionID=Questions.QuestionID and (Chunks.SubFormIteration IS NULL )
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
else Chunks.Answer end),Chunks.SubFormIteration,Questions.QuestionText,Questions.type,Questions.QuestionID,Chunks.ResultID,Results.ResultID as Action,Results.DateSubmitted,Results.Username,Results.SurveyID from Chunks Join Questions on Questions.QuestionID= Chunks.QuestionID Join Results on Results.ResultID=Chunks.ResultID Where Results.SurveyID=@SurveyID) as X
SET @colsAggregated = STUFF((SELECT distinct ','+ 'max('+ QUOTENAME(c.QuestionText)+') as '+ QUOTENAME(c.QuestionText)
FROM #Temp c
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
print @colsAggregated
SET @cols = STUFF((SELECT distinct ',' + QUOTENAME(c.QuestionText)
FROM #Temp c
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
set @query = 'SELECT ResultID,max(Username) as Username,max(DateSubmitted) as DateSubmitted,max(SubFormIteration) as SubFormIteration, ' + @colsAggregated + ' from
(
select *
from #Temp
) as y
pivot
(
max(Answer)
for QuestionText in (' + @cols + ')
) as p GROUP BY
ResultID,SubFormIteration'
execute(@query)
You follow these steps to make a query a pivot table: First, select a base dataset for pivoting. Second, create a temporary result by using a derived table or common table expression (CTE) Third, apply the PIVOT operator.
Using a T-SQL Pivot function is one of the simplest method for transposing rows into columns.
It may be beneficial to post the query that got you your original results; there is a possibility that the original query could be rewritten to avoid this complexity. With the given information, this is the most simplistic way of solving this problem:
SELECT
h1.Id,
h2.Address,
h1.Member,
h1.Age,
h1.MemberNo
FROM House h1
INNER JOIN House h2
ON h1.Id = h2.Id
WHERE h2.Address IS NOT NULL -- Eliminates the results whre the Address is NULL after the join
AND h1.Member IS NOT NULL -- Eliminates the results that would show up from the original table (t1) where there is no Member field
Here is a simple example of the table structure using temp tables:
DROP TABLE #Questions
DROP TABLE #Results
DROP TABLE #Chunks
CREATE TABLE #Questions
(
QuestionId INT,
QuestionText VARCHAR(MAX),
type VARCHAR(MAX),
SurveyID INT,
IsIncremental INT
)
CREATE TABLE #Results
(
ResultId INT,
DateSubmitted DATETIME,
SurveyID INT
)
CREATE TABLE #Chunks
(
ChunkId INT,
Answer VARCHAR(MAX),
ResultId INT,
QuestionId INT,
SubFormIteration VARCHAR(20)
)
INSERT INTO #Results
VALUES (2272, '04-12-2017', 311),
(2273, '04-12-2017', 311),
(2274, '04-12-2017', 311)
INSERT INTO #Chunks
VALUES (9606, 'WhiteHouse', 2272, 3486, NULL),
(9607, 'MEMBER 1', 2272, 3484, NULL),
(9608, '12', 2272, 3485, NULL),
(9609, 'MEMBER 2', 2272, 3484, 's1'),
(9610, '10', 2272, 3485, 's1'),
(9611, 'MEMBER 1', 2272, 3484, 's0'),
(9612, '12', 2272, 3485, 's0'),
(9613, 'MEMBER 2', 2272, 3484, 's1'),
(9614, '10', 2272, 3485, 's1'),
(9615, 'RpBhavan', 2273, 3486, NULL),
(9618, 'MEMBER B', 2273, 3484, 's1'),
(9619, '25', 2273, 3485, 's1'),
(9620, 'MEMBER A', 2273, 3484, 's0'),
(9621, '23', 2273, 3485, 's0')
INSERT INTO #Questions
VALUES (3483, 'subform', 'subform', 311, 1),
( 3484, 'MEMBER', 'text', 311, 0 ),
(3485, 'AGE', 'number', 311, 0),
(3486, 'ADDRESS', 'address', 311, 0)
Here is a way to produce the results your looking for without the use of PIVOTs and XML:
; WITH Responses AS (
SELECT
c.ResultId,
QuestionText,
Answer,
c.SubFormIteration
FROM #Chunks c
INNER JOIN #Results r
ON c.ResultId = r.ResultId
INNER JOIN #Questions q
ON q.QuestionId = c.QuestionId
WHERE c.SubFormIteration IS NOT NULL -- Removes the "Address" responses and duplicate Answers
),
FindAddress AS (
-- Pulls ONLY the address for each ResultId
SELECT
ResultId,
MAX(CASE WHEN QuestionText = 'ADDRESS' THEN Answer END) AS [Address]
FROM #Chunks c
INNER JOIN #Questions q
ON q.QuestionId = c.QuestionId
GROUP BY ResultId
)
-- Combines all responses and the address back together
SELECT
r.ResultId,
fa.Address,
MAX(CASE WHEN QuestionText = 'MEMBER' THEN Answer END) AS [MEMBER],
MAX(CASE WHEN QuestionText = 'AGE' THEN Answer END) AS [Age],
SubFormIteration
FROM Responses r
INNER JOIN FindAddress fa
ON fa.ResultId = r.ResultId
GROUP BY r.ResultId, SubFormIteration, fa.Address
Essentially, I broke a rather large query into a Common Table Expression (CTE). Each query had a purpose: a) Response pulls all responses except the address, b) Pulls only the address based on ResultId, and c) Combine both queries together.
The MAX(CASE...) followed by GROUP BY is an alternative method to using PIVOTS and they essentially perform the same.
To apply this query to your specific case, you should only need to change the name of the tables.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With