I have two tables that contains People who are working at the company and their Employment information (so People is one table, Employment is another). The People table contains information on where the person lives, emergency contact, phone number bla bla bla. The Employment table contains information on where he works, closest boss and more. These tables have been corrupted and now contains a few duplicates by misstake. Now in both tables there is a Person id, but the employment id is only located in Employment. I want both numbers on all people that have been duplicated.
This works perfectly:
SELECT DISTINCT
pp.Personid,
pp.Firstname,
pp.Lastname,
pp.Address,
FROM People pp
JOIN People pp2
ON pp.Firstname = pp2.Firstname
AND pp.Lastname = pp2.Lastname
AND pp.Address = pp2.Address
AND pp.Personid <> pp2.Personid
ORDER BY pp.Firstname, pp.Lastname, pp.Personid
returning the following values (but does not include Employment number as you can see):
1001 Carl Johnsson Bigstreet 1
1002 Carl Johnsson Bigstreet 1
1003 Carl Johnsson Bigstreet 1
1010 Andrew Wilkinsson Smallstreet 2
1011 Andrew Wilkinsson Smallstreet 2
Now, to add the employment id I join in that table like this:
SELECT DISTINCT
pp.Personid,
e.Employmentid,
pp.Firstname,
pp.Lastname,
pp.Address,
FROM People pp
JOIN People pp2
ON pp.Firstname = pp2.Firstname
AND pp.Lastname = pp2.Lastname
AND pp.Address = pp2.Address
AND pp.Personid <> pp2.Personid
JOIN Employment e on pp.Personid = e.Personid
ORDER BY pp.Firstname, pp.Lastname, pp.Personid
And everything goes to h**l in a handbasket with the following result:
1001 1111 Carl Johnsson Bigstreet 1
1001 1111 Carl Johnsson Bigstreet 1
1001 1111 Carl Johnsson Bigstreet 1
1010 1234 Andrew Wilkinsson Smallstreet 2
1010 1234 Andrew Wilkinsson Smallstreet 2
As you can see I get both Personid and Employmentid but now I only get one of each (repeated the correct number of times) so I don't have all the different Personid and Employmentid in my list.
Why?
What happened with my join that crashed the party?
Ok, let's make some sample data;
CREATE TABLE #People (PersonID int, FirstName varchar(50), LastName varchar(50), Address1 varchar(50))
INSERT INTO #People (PersonID, FirstName, LastName, Address1)
VALUES
('1','Mike','Hunt','Cockburn Crescent')
,('2','Mike','Hunt','Cockburn Crescent')
,('3','Mike','Hunt','Cockburn Crescent')
,('4','Connie','Lingus','Dyke Close')
,('5','Connie','Lingus','Dyke Close')
,('6','Eric','Shun','Tickle Avenue')
,('7','Ivana','Humpalot','Bottom Street')
CREATE TABLE #Employment (PersonID int, EmploymentID int)
INSERT INTO #Employment (PersonID, EmploymentID)
VALUES
('1','10')
,('2','11')
,('3','12')
,('4','13')
,('5','14')
,('6','15')
,('7','16')
I'd do the first query differently, if you work out the duplicates in a sub-select it would be easier, you'll then be able to join to the employment table with no problems;
SELECT pp.PersonID
,em.EmploymentID
,pp.FirstName
,pp.LastName
,pp.Address1
FROM #People pp
JOIN (
SELECT FirstName
,LastName
,Address1
,COUNT(1) records
FROM #People
GROUP BY FirstName
,LastName
,Address1
HAVING COUNT(1) > 1
) pp2 ON pp.FirstName = pp2.FirstName
AND pp.LastName = pp2.LastName
AND pp.Address1 = pp2.Address1
LEFT JOIN #Employment em ON pp.PersonID = em.PersonID
Remember to clean up the temp tables;
DROP TABLE #People
DROP TABLE #Employment
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With