It looks like we might have some duplicate values in our preference table so I want to delete all the duplicate entries for each user but the more recent one. I've spent quite a while trying to come up with a solution and I just keep running around in circles. I've tried max, grouped by, sub-queries, etc. It's pretty easy to delete the highest, but not everything but the highest. And especially for each user
So in the database I might have:
id : name: value : userId
1 : sortOrder: Asc : 1
2 : sortOrder: Desc : 1
3 : sortOrder: Asc : 2
4 : something: value2 : 1
So in this case, I want to remove the first row since it's duplicate for User 1 and has the highest id and keep all the rest of the columns.
So I know it's at least a sub-query with group by, but I just can't figure it out. So far I have:
SELECT
MAX(id),
name
FROM
preference
GROUP BY
name
which gives me a list of those I want to keep, BUT it still misses for each user. So I somewhat want the opposite if I had it for each user. However I don't believe I can use <> in a sub-query anyways.
What I want to ultimately do is say delete all those that have a lower ID for the same User that are duplicate entries.
One way, assuming Id
column is a unique Id:
DELETE FROM preference
where Id NOT IN (SELECT MAX(id) FROM preference GROUP BY name, userId)
[Note: The usual advice applies. don't run in production without testing first]
Test First with:
SELECT * FROM preference
where Id NOT IN (SELECT MAX(id) FROM preference GROUP BY name, userId)
SQLFiddle is here. Works as intended. I suggest testing with a bigger data set though...
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With