I apologize for the confusing title, I can't figure out the proper wording for this question. Instead, I'll just give you the background info and the goal:
This is in a table where a person may or may not have multiple rows of data, and those rows may contain the same value for the activity_id
, or may not. Each row has an auto-incremented ID. The people do not have a unique identifier attached to their names, so we can only use first_name/last_name to identify a person.
I need to be able to find the people that have multiple rows in this table, but only the ones who have multiple rows that contain more than one different activity_id
.
Here's a sample of the data we're looking through:
unique_id | first_name | last_name | activity_id
---------------------------------------------------------------
1 | ted | stevens | 544
2 | ted | stevens | 544
3 | ted | stevens | 545
4 | ted | stevens | 546
5 | rachel | jameson | 633
6 | jennifer | tyler | 644
7 | jennifer | tyler | 655
8 | jennifer | tyler | 655
9 | jack | fillion | 544
10 | mallory | taylor | 633
11 | mallory | taylor | 633
From that small sample, here are the records I would want returned:
unique_id | first_name | last_name | activity_id
---------------------------------------------------------------
dontcare | ted | stevens | 544
dontcare | jennifer | tyler | 655
Note that which value of unique_id
gets returned is irrelvant, as long as it's one of the unique_id
s belonging to that person, and as long as only one record is returned for that person.
Can anyone figure out how to write a query like this? I don't care what version of SQL you use, I can probably translate it into Oracle if it's somehow different.
To select duplicate values, you need to create groups of rows with the same values and then select the groups with counts greater than one. You can achieve that by using GROUP BY and a HAVING clause.
I would do:
SELECT first_name, last_name, COUNT(DISTINCT activity_id)
FROM <table_name>
GROUP BY first_name, last_name
HAVING COUNT(DISTINCT activity_id) > 0;
I'll build through the logic with you. First, lets find all people that have more than one entry:
Unique list of name + activity ID:
select first_name, last_name,activity_id, count(1)
from yourtable
group by first_name, last_name,activity_id
Now we'll turn that into a subquery and look for users with more than 1 activity_ID
Select first_name, last_name
from
(select first_name, last_name,activity_id, count(1)
from yourtable
group by first_name, last_name,activity_id) a
group by first_name, last_name
having count(1) > 1
Should work as that...I didn't return an activity_id, adding max(activity_id) to the select statement will grab the highest one.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With