Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Finding duplicate values in different columns on the same row

Before someone goes on a rant that this table should be normalized, best practices, etc. I am going to admit that this is an old table we have in SQL Server 2008 R2 and I can't do anything about changing it. Having said that, this table has the following columns:

"PreparedBy", "PrelimApprovalBy", "Approval1Signer", "Approval2Signer" 

All these fields have either usernames or NULL or ''. I want to get all the rows where the same username appears in 2 OR MORE of the fields mentioned above. If 2 fields are NULL they are NOT a match and they are NOT a match if they are both ''. So both NULL and '' need to be excluded as they don't signify anything.


HERE'S WHAT I THOUGHT OF SO FAR BUT AM NOT LIKING IT:
I am thinking of checking all permutations in the WHERE clause (checking for NULL and '') by doing something along the lines of

WHERE PreparedBy =  PrelimApprovalBy OR PreparedBy = Approval1Signer OR ...

There has got to be a better way to do it.

like image 863
Denis Avatar asked Jul 09 '13 17:07

Denis


1 Answers

Here's one:

SELECT * FROM T
WHERE EXISTS 
     (SELECT 1 
      FROM (VALUES 
                   (PreparedBy)
                  ,(PrelimApprovalBy)
                  ,(Approval1Signer)
                  ,(Approval2Signer)) AS X (n)
      WHERE NULLIF(n, '') IS NOT NULL
      GROUP BY n
      HAVING COUNT(*)>1
     )

Basically, for each row, we're constructing a mini-table with the column values in different rows, and doing a GROUP BY and HAVING to check for groups of matching values. The NULLIF is helping us ignore '' values (making them NULL and then excluding all NULLs).

like image 89
GilM Avatar answered Sep 28 '22 17:09

GilM