I use to do
SELECT email, COUNT(email) AS occurences
FROM wineries
GROUP BY email
HAVING (COUNT(email) > 1);
to find duplicates based on their email.
But now I'd need their ID to be able to define which one to remove exactly.
The second constraint is: I want only the LAST INSERTED duplicates.
So if there's 2 entries with [email protected] as an email and their IDs are respectively 40 and 12782 it would delete only the 12782 entry and keep the 40 one.
Any ideas on how I could do this? I've been mashing SQL for about a hour and can't seem to find exactly how to do this.
Thanks and have a nice day!
Well, you sort of answer your question. You seem to want max(id)
:
SELECT email, COUNT(email) AS occurences, max(id)
FROM wineries
GROUP BY email
HAVING (COUNT(email) > 1);
You can delete the others using the statement. Delete with join
has a tricky syntax where you have to list the table name first and then specify the from
clause with the join:
delete wineries
from wineries join
(select email, max(id) as maxid
from wineries
group by email
having count(*) > 1
) we
on we.email = wineries.email and
wineries.id < we.maxid;
Or writing this as an exists
clause:
delete from wineries
where exists (select 1
from (select email, max(id) as maxid
from wineries
group by email
) we
where we.email = wineries.email and wineries.id < we.maxid
)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With