Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL Server: How can I determine if a composite of several fields in a row are unique?

I am trying to do something like this in T-SQL:

SELECT ID 
FROM table
WHERE (ID, Firstname, Lastname) IN 
(SELECT ID, Firstname, Lastname
FROM table
GROUP BY ID, Firstname, Lastname
HAVING (COUNT(ID) > 1) AND (COUNT(Firstname) > 1) AND (COUNT(Lastname) > 1))

Basically if there is a row that has values of ID, Firstname, AND Lastname that are duplicated simultaneously in another row, I want to select it's ID. I understand that the way the having clause is set up, it wouldn't work like that, but bear with me. I want it to work Like this:

ID Firstname Lastname age
-------------------------
01 Bob       Smith    25
01 Bob       Smith    35
03 Bob       Smith    25
03 Mike      Smith    25
03 Bob       Baloney  25

In the above example only '01' is selected, since it matches the field below it on the three columns. '03' doesn't match on ID, firstname, or lastname precisely in any two columns so it doesn't get selected. Also, it doesn't matter if it matches on age, because I don't care about age.

If I added a column like this:

ID Firstname Lastname age
-------------------------
01 Bob       Smith    25
01 Bob       Smith    35
03 Bob       Smith    25
03 Mike      Smith    25
03 Bob       Baloney  25
03 Mike      Smith    32

THEN '03' would be selected as well, since it now matches on the three relevant fields with the record that is two rows above it.

I have seen the answers using derived tables to compensate for the lack of support of composite WHERE-IN clauses, but I'm not sure how to add the idea of uniqueness into the subquery.

Many thanks!

like image 960
Haris Khan Avatar asked Feb 02 '26 23:02

Haris Khan


1 Answers

You can use HAVING COUNT(*) > 1 to find all combinations with multiple entries:

SELECT DISTINCT ID 
FROM (SELECT ID,FirstName,LastName
      FROM table1
      GROUP BY ID,FirstName,LastName
      HAVING COUNT(*) > 1
      )sub

Demo: SQL Fiddle

Update: If you want to use the results of above to UPDATE the table, you can do it a number of ways, but it's likely easier to change the way you're identifying those multiples, by using COUNT() with an OVER() clause or the ROW_NUMBER() function, depending on how you want to update the data, you can then UPDATE a cte:

;WITH cte AS (SELECT *, COUNT(*) OVER(PARTITION BY ID,FirstName,LastName) AS CT
                      , ROW_NUMBER() OVER(PARTITION BY ID,FirstName,LastName ORDER BY ID) AS RN
              FROM Table1)
UPDATE cte
SET Somefield = 'SomeValue'
WHERE CT > 1  -- RN > 1

Observe the behavior of the two functions (added to the sql fiddle link above):

;WITH cte AS (SELECT *, COUNT(*) OVER(PARTITION BY ID,FirstName,LastName) AS CT
                      , ROW_NUMBER() OVER(PARTITION BY ID,FirstName,LastName ORDER BY ID) AS RN
              FROM Table2)
SELECT *
FROM cte  
like image 120
Hart CO Avatar answered Feb 05 '26 11:02

Hart CO