SQL Fiddle
I have the following table
CREATE TABLE __EpiTest
(
`ActivityRecordID` int,
`ActCstID` varchar(6),
`ResCstID` varchar(6),
`VolAmt` int,
`ActCnt` int,
`TotOCst` int,
`TotCst` int
);
INSERT INTO __EpiTest (`ActivityRecordID`, `ActCstID`, `ResCstID`, `VolAmt`, `ActCnt`, `TotOCst`, `TotCst`)
VALUES (15652, 'DIM008', 'CPF005', 30.455249786377, 1, 0, 0.375024198767061),
(15652, 'DIM008', 'CSC004', 30.455249786377, 1, 7.62176510799961, 11.932578069479),
(15652, 'DIM008', 'REC001', 30.455249786377, 1, 0.17902367836393, 0.384881520159455),
(15652, 'OUT001', 'CPF002', 15, 0, 0, 16.9408193013078),
(15652, 'OUT001', 'CSC001', 15, 0, 2.36971564207042, 2.36971564207042),
(15652, 'OUT001', 'CSC004', 15, 0, 12.3230666021278, 12.3760690367354),
(15652, 'OUT001', 'REC001', 15, 0, 0.377459387378349, 3.0275278374102),
(15652, 'SUP001', 'CPF002', 1, 1, 0, 0.00108648359810756),
(15652, 'SUP001', 'CPF011', 1, 1, 0, -1.89799880202357E-14),
(15652, 'SUP001', 'CPF020', 1, 1, 0, 1.31058251625567E-05),
(15652, 'SUP001', 'CPF021', 1, 1, 0, 25.0942308512551),
(15652, 'SUP001', 'CPF021', 1, 1, 0, 25.0942308512551),
(15652, 'SUP001', 'CSC001', 1, 1, 1.9628769103451, 1.9628769103451),
(15652, 'SUP001', 'CSC001', 1, 1, 1.9628769103451, 1.9628769103451),
(15652, 'SUP001', 'CSC002', 1, 1, 0, 10.2266625467779),
(15652, 'SUP001', 'CSC004', 1, 1, 16.3451721608005, 16.3513319060046),
(15652, 'SUP001', 'CSC004', 1, 1, 16.3451721608005, 16.3513319060046),
(15652, 'SUP001', 'REC001', 1, 1, 0.254410386701976, -6.27048795659376),
(15652, 'SUP001', 'REC001', 1, 1, 0.254410386701976, -6.27048795659376),
(15652, 'SUP001', 'REC002', 1, 1, 0, 1.10781732547441);
Notice there are rows which have matching values for [ActivityRecordID]
, [ActCstID]
and [ResCstID]
. I want to merge these values and sum the values in [TotOCst]
and [TotCst]
. To do this I have attempted to use MERGE
MERGE [__EpiTest] AS Tgt
USING (
SELECT [ActivityRecordID],
[ActCstID],
[ResCstID],
SUM([TotOCst]) AS TotOCst,
SUM([TotCst]) AS TotCst
FROM [__EpiTest]
GROUP BY [ActivityRecordID],
[ActCstID],
[ResCstID]) AS Src
ON (Tgt.[ActivityRecordID] = Src.[ActivityRecordID] AND
Tgt.[ActCstID] = Src.[ActCstID] AND
Tgt.[ResCstID] = Src.[ResCstID])
WHEN MATCHED THEN
UPDATE
SET [TotOCst] = Src.[TotOCst],
[TotCst] = Src.[TotCst]
WHEN NOT MATCHED BY SOURCE THEN
DELETE;
GO
This matches and correctly updates the values in the [TotOCst]
and [TotCst]
for each duplicate, but it then leaves the duplicate rows in the table whereas I want all but one removed. How can I achieve this?
Note, the target table is HUGE, so I would like to try and do this with a single operation by using a variation of the MERGE
query above, or other alternative. Multiple queries will be too expensive for me to deal with...
Illustration
I get
...
15652 SUP001 CPF021 1 1 0 12.5471154256275
15652 SUP001 CPF021 1 1 0 12.5471154256275
15652 SUP001 CSC001 1 1 0.98143845517255 0.98143845517255
15652 SUP001 CSC001 1 1 0.98143845517255 0.98143845517255
15652 SUP001 CSC002 1 1 0 10.2266625467779
15652 SUP001 CSC004 1 1 8.17258608040024 8.17566595300228
15652 SUP001 CSC004 1 1 8.17258608040024 8.17566595300228
15652 SUP001 REC001 1 1 0.127205193350988 -3.13524397829688
15652 SUP001 REC001 1 1 0.127205193350988 -3.13524397829688
...
But I want
...
15652 SUP001 CPF021 1 1 0 12.5471154256275
15652 SUP001 CSC001 1 1 0.98143845517255 0.98143845517255
15652 SUP001 CSC002 1 1 0 10.2266625467779
15652 SUP001 CSC004 1 1 8.17258608040024 8.17566595300228
15652 SUP001 REC001 1 1 0.127205193350988 -3.13524397829688
...
In order to get this to work in the MERGE statement, you'll need to be a little more specific about the rows. I've adjusted the query below:
MERGE [__EpiTest] AS Tgt
USING (
SELECT [ActivityRecordID],
[ActCstID],
[ResCstID],
SUM([TotOCst]) AS TotOCst,
SUM([TotCst]) AS TotCst ,
VolAmt,
ActCnt
FROM [__EpiTest]
GROUP BY [ActivityRecordID],
[ActCstID],
[ResCstID],
VolAmt,
ActCnt) AS Src
ON (Tgt.[ActivityRecordID] = Src.[ActivityRecordID] AND
Tgt.[ActCstID] = Src.[ActCstID] AND
Tgt.[ResCstID] = Src.[ResCstID] AND
Tgt.TotOCst = Src.TotOCst AND
Tgt.TotCst = Src.TotCst
)
WHEN NOT MATCHED BY TARGET THEN
INSERT ( ActivityRecordID, ActCstID, ResCstID, TotOCst, TotCst, VolAmt, ActCnt )
VALUES ( ActivityRecordID, ActCstID, ResCstID, TotOCst, TotCst, VolAmt, ActCnt )
WHEN NOT MATCHED BY SOURCE THEN
DELETE;
GO
Basically, I've changed it so that the source of the merge include the other columns that seem constant, so that they can be inserted later on. I've changed the MATCH
condition to be basically "the row is exactly the same and will be unchanged" and so removed the WHEN MATCHED statement.
I then added an INSERT statement, so that for ones that have changed (that will have their rows deleted by the NOT MATCHED BY TARGET), a new row is inserted with these values.
In your same data, this returns a result set of:
ActivityRecordID ActCstID ResCstID VolAmt ActCnt TotOCst TotCst
15652 DIM008 CPF005 30.455249786377 1 0 0.375024198767061
15652 DIM008 CSC004 30.455249786377 1 7.62176510799961 11.932578069479
15652 DIM008 REC001 30.455249786377 1 0.17902367836393 0.384881520159455
15652 OUT001 CPF002 15 0 0 16.9408193013078
15652 OUT001 CSC001 15 0 2.36971564207042 2.36971564207042
15652 OUT001 CSC004 15 0 12.3230666021278 12.3760690367354
15652 OUT001 REC001 15 0 0.377459387378349 3.0275278374102
15652 SUP001 CPF002 1 1 0 0.00108648359810756
15652 SUP001 CPF011 1 1 0 -1.89799880202357E-14
15652 SUP001 CPF020 1 1 0 1.31058251625567E-05
15652 SUP001 CSC002 1 1 0 10.2266625467779
15652 SUP001 REC002 1 1 0 1.10781732547441
15652 SUP001 CPF021 1 1 0 50.1884617025102
15652 SUP001 CSC001 1 1 3.9257538206902 3.9257538206902
15652 SUP001 CSC004 1 1 32.690344321601 32.7026638120092
15652 SUP001 REC001 1 1 0.508820773403952 -12.5409759131875
In the __EpiTest has duplicate values
SELECT * FROM @__EpiTest
where ResCstID = 'REC001' and ActCstID ='SUP001'
order by ActivityRecordID , ActCstID , ResCstID;
ActivityRecordID ActCstID ResCstID VolAmt ActCnt TotOCst TotCst
---------------- -------- -------- ----------- ----------- --------------------------------------- ---------------------------------------
15652 SUP001 REC001 1 1 0.2544103867 -6.2704879566
15652 SUP001 REC001 1 1 0.2544103867 -6.2704879566
To simple select the single values
SELECT DISTINCT * FROM @__EpiTest
where ResCstID = 'REC001' and ActCstID ='SUP001'
order by ActivityRecordID , ActCstID , ResCstID;
ActivityRecordID ActCstID ResCstID VolAmt ActCnt TotOCst TotCst
---------------- -------- -------- ----------- ----------- --------------------------------------- ---------------------------------------
15652 SUP001 REC001 1 1 0.2544103867 -6.2704879566
To delete
To delete duplicated values and maintain the frist
;WITH cte as (
SELECT Row_number() OVER (PARTITION BY ActivityRecordID , ActCstID , ResCstID ORDER BY (SELECT NULL)) Rn, * FROM @__EpiTest
where ResCstID = 'REC001' and ActCstID ='SUP001'
)
Delete from cte where Rn > 1
SELECT * FROM @__EpiTest
where ResCstID = 'REC001' and ActCstID ='SUP001'
ActivityRecordID ActCstID ResCstID VolAmt ActCnt TotOCst TotCst
---------------- -------- -------- ----------- ----------- --------------------------------------- ---------------------------------------
15652 SUP001 REC001 1 1 0.2544103867 -6.2704879566
Using the Merge - Be careful, merge statament has serious preformance problems
;WITH myResult as (
SELECT Row_number() OVER (PARTITION BY ActivityRecordID , ActCstID , ResCstID ORDER BY (SELECT NULL)) Rn, * FROM @__EpiTest
)
MERGE myResult AS Tgt
USING myResult AS Src
ON (Tgt.[ActivityRecordID] = Src.[ActivityRecordID] AND
Tgt.[ActCstID] = Src.[ActCstID] AND
Tgt.[ResCstID] = Src.[ResCstID] AND
Tgt.Rn = Src.Rn AND
Src.Rn = 1)
WHEN MATCHED THEN
UPDATE
SET [TotOCst] = Src.[TotOCst],
[TotCst] = Src.[TotCst]
WHEN NOT MATCHED BY SOURCE THEN
DELETE;
SELECT * FROM @__EpiTest
where ResCstID = 'REC001' and ActCstID ='SUP001'
Rn ActivityRecordID ActCstID ResCstID VolAmt ActCnt TotOCst TotCst
-------------------- ---------------- -------- -------- ----------- ----------- --------------------------------------- ---------------------------------------
1 15652 SUP001 REC001 1 1 0.2544103867 -6.2704879566
2 15652 SUP001 REC001 1 1 0.2544103867 -6.2704879566
(2 row(s) affected)
(20 row(s) affected)
ActivityRecordID ActCstID ResCstID VolAmt ActCnt TotOCst TotCst
---------------- -------- -------- ----------- ----------- --------------------------------------- ---------------------------------------
15652 SUP001 REC001 1 1 0.2544103867 -6.2704879566
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With