Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL query for non duplicate records

Tags:

sql

duplicates

I'm attempting to build a query that will return all non duplicate (unique) records in a table. The query will need to use multiple fields to determine if the records are duplicate.

For example, if a table has the following fields; PKID, ClientID, Name, AcctNo, OrderDate, Charge, I'd like to use the AcctNo, OrderDate and Charge fields to find unique records.

Table

PKID-----ClientID-----Name-----AcctNo-----OrderDate-----Charge
1        JX100        John     12345      9/9/2010      $100.00
2        JX220        Mark     55567      9/9/2010       $23.00
3        JX690        Matt     89899      9/9/2010      $218.00
4        JX100        John     12345      9/9/2010      $100.00

The result of the query would need to be:

PKID-----ClientID-----Name-----AcctNo-----OrderDate-----Charge
2        JX220        Mark     55567      9/9/2010       $23.00
3        JX690        Matt     89899      9/9/2010      $218.00

I've tried using SELECT DISTINCT, but that doesn't work because it keeps one of the duplicate records in the result. I've also tried using HAVING COUNT = 1, but that returns all records.

Thanks for the help.

like image 796
nth Avatar asked Sep 10 '10 15:09

nth


4 Answers

Use window function for count and then you don't have to aggregate fields

select * from
(SELECT *,
count(*) over (partition by CLIENTID) as [Count]
from Table)
where Count=1
like image 75
BowenC Avatar answered Nov 15 '22 16:11

BowenC


HAVING COUNT(*) = 1 will work if you only include the fields in the GROUP BY that you're using to find the unique records. (i.e. not PKID, but you can use MAX or MIN to return that since you'll only have one record per group in the results set.)

like image 36
heisenberg Avatar answered Nov 15 '22 18:11

heisenberg


You could just drop the PKID to return all records:

SELECT DISTINCT 
           ClientID
         , Name
         , AcctNo
         , OrderDate
         , Charge
FROM       table;

Note: This is slightly different from what you're asking.
It returns a unique set by removing the one non-unique field.
By your example, you're asking to return non-duplicates.

I could only see your example being useful if you're trying
to clean up a table by extracting the "good" records.

like image 21
vol7ron Avatar answered Nov 15 '22 18:11

vol7ron


SELECT   MAX(PKID)     AS PKID    ,
         MAX(ClientID) AS ClientID,
         MAX(Name)     AS Name    ,
         AcctNo                   ,
         OrderDate                ,
         Charge
FROM     T
GROUP BY AcctNo   ,
         OrderDate,
         Charge
HAVING   COUNT(*) = 1

or

SELECT PKID      ,
       ClientID  ,
       Name      ,
       AcctNo    ,
       OrderDate ,
       Charge
FROM   YourTable t1
WHERE  NOT EXISTS
       (SELECT *
       FROM    YourTable t2
       WHERE   t1.PKID     <> t2.PKID
       AND     t1.AcctNo    = t2.AcctNo
       AND     t1.OrderDate = t2.OrderDate
       AND     t1.Charge    = t2.Charge
       )
like image 26
Martin Smith Avatar answered Nov 15 '22 18:11

Martin Smith