I have a loooooong stored procedure that runs about 15 select statements across different tables in a database - I am inserting all the results of the selects into a temporary table. However - that table has some rows that are exact duplicates. The temporary table looks something like this:
DocID | VisitDate | DocName
8245376 | 8/26/2009 1:07:01 PM | Doc1
8245376 | 8/26/2009 1:07:01 PM | Doc1
8245376 | 8/26/2009 1:07:01 PM | Doc2
8245376 | 8/26/2009 1:07:01 PM | Doc2
646681 | 8/26/2009 1:07:01 PM | Doc3
263272 | 8/26/2009 1:07:01 PM | Doc4
8245376 | 8/26/2009 1:07:01 PM | Doc5
8245376 | 8/26/2009 1:07:01 PM | Doc5
8245376 | 8/26/2009 1:07:01 PM | Doc6
8245376 | 8/26/2009 1:07:01 PM | Doc6
1903875 | 8/26/2009 1:07:01 PM | Doc7
And how I'd like the table to look at the end is like this:
DocID | VisitDate | DocName
8245376 | 8/26/2009 1:07:01 PM | Doc1
8245376 | 8/26/2009 1:07:01 PM | Doc2
646681 | 8/26/2009 1:07:01 PM | Doc3
263272 | 8/26/2009 1:07:01 PM | Doc4
8245376 | 8/26/2009 1:07:01 PM | Doc5
8245376 | 8/26/2009 1:07:01 PM | Doc6
1903875 | 8/26/2009 1:07:01 PM | Doc7
How can I return only ONE row if there are multiple duplicate rows and still return rows that are not duplicates?
To select duplicate values, you need to create groups of rows with the same values and then select the groups with counts greater than one. You can achieve that by using GROUP BY and a HAVING clause.
The go to solution for removing duplicate rows from your result sets is to include the distinct keyword in your select statement. It tells the query engine to remove duplicates to produce a result set in which every row is unique.
SELECT DISTINCT DocID, VisitDate, DocName
FROM mytable
Or I am missing something?
Update:
If you have control over the stored procedure, you can rewrite it so that no duplicates will ever get into the table.
Assuming DocID
is a PRIMARY KEY
, declare it as such in the temporary table using IGNORE_DUP_KEY
:
DECLARE @temp TABLE (DocId INT NOT NULL PRIMARY KEY WITH (IGNORE_DUP_KEY = ON), …)
INSERT
INTO @mytable
SELECT …
FROM source_table
This will skip duplicates on DocID
If the duplicates are coming from the source tables, then do a SELECT DISTINCT when you are doing an insert into your temp tables.
If the duplicates are coming across tables, then just do a SELECT DISTINCT from your temp table after you have inserted all the rows.
Trying to actually delete rows from a table that are duplicate is a bit more involved, but doesn't seem to be necessary here, because you are working with a temp table.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With