How can I merge multiple rows with same ID
into one row.
When value in first and second row in the same column is the same or when there is value in first row and NULL
in second row.
I don't want to merge when value in first and second row in the same column is different.
I have table:
ID |A |B |C
1 NULL 31 NULL
1 412 NULL 1
2 567 38 4
2 567 NULL NULL
3 2 NULL NULL
3 5 NULL NULL
4 6 1 NULL
4 8 NULL 5
4 NULL NULL 5
I want to get table:
ID |A |B |C
1 412 31 1
2 567 38 4
3 2 NULL NULL
3 5 NULL NULL
4 6 1 NULL
4 8 NULL 5
4 NULL NULL 5
I think there's a simpler solution to the above answers (which is also correct). It basically gets the merged values that can be merged within a CTE, then merges that with the data not able to be merged.
WITH CTE AS (
SELECT
ID,
MAX(A) AS A,
MAX(B) AS B,
MAX(C) AS C
FROM dbo.Records
GROUP BY ID
HAVING MAX(A) = MIN(A)
AND MAX(B) = MIN(B)
AND MAX(C) = MIN(C)
)
SELECT *
FROM CTE
UNION ALL
SELECT *
FROM dbo.Records
WHERE ID NOT IN (SELECT ID FROM CTE)
SQL Fiddle: http://www.sqlfiddle.com/#!6/29407/1/0
WITH Collapsed AS (
SELECT
ID,
A = Min(A),
B = Min(B),
C = Min(C)
FROM
dbo.MyTable
GROUP BY
ID
HAVING
EXISTS (
SELECT Min(A), Min(B), Min(C)
INTERSECT
SELECT Max(A), Max(B), Max(C)
)
)
SELECT
*
FROM
Collapsed
UNION ALL
SELECT
*
FROM
dbo.MyTable T
WHERE
NOT EXISTS (
SELECT *
FROM Collapsed C
WHERE T.ID = C.ID
);
This works by creating all the mergeable rows through the use of Min
and Max
--which should be the same for each column within an ID
and which usefully exclude NULL
s--then appending to this list all the rows from the table that couldn't be merged. The special trick with EXISTS ... INTERSECT
allows for the case when a column has all NULL
values for an ID
(and thus the Min
and Max
are NULL
and can't equal each other). That is, it functions like Min(A) = Max(A) AND Min(B) = Max(B) AND Min(C) = Max(C)
but allows for NULL
s to compare as equal.
Here's a slightly different (earlier) solution I gave that may offer different performance characteristics, and being more complicated, I like less, but being a single flowing query (without a UNION
) I kind of like more, too.
WITH Collapsible AS (
SELECT
ID
FROM
dbo.MyTable
GROUP BY
ID
HAVING
EXISTS (
SELECT Min(A), Min(B), Min(C)
INTERSECT
SELECT Max(A), Max(B), Max(C)
)
), Calc AS (
SELECT
T.*,
Grp = Coalesce(C.ID, Row_Number() OVER (PARTITION BY T.ID ORDER BY (SELECT 1)))
FROM
dbo.MyTable T
LEFT JOIN Collapsible C
ON T.ID = C.ID
)
SELECT
ID,
A = Min(A),
B = Min(B),
C = Min(C)
FROM
Calc
GROUP BY
ID,
Grp
;
This is also in the above SQL Fiddle.
This uses similar logic as the first query to calculate whether a group should be merged, then uses this to create a grouping key that is either the same for all rows within an ID
or is different for all rows within an ID
. With a final Min
(Max
would have worked just as well) the rows that should be merged are merged because they share a grouping key, and the rows that shouldn't be merged are not because they have distinct grouping keys over the ID
.
Depending on your data set, indexes, table size, and other performance factors, either of these queries may perform better, though the second query has some work to do to catch up, with two sorts instead of one.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With