
Delete duplicate rows (but keep one copy of each)

I am using PostgreSQL and want to delete duplicate rows, on the condition that one copy from each set of duplicates is kept.

For example, if there are 5 duplicate records, 4 of them should be deleted.

Avadhesh asked Sep 23 '10 11:09




1 Answer

Try the steps described in this article: Removing duplicates from a PostgreSQL database.

It covers the case where the table holds so much data that running a GROUP BY over it is impractical.

A simple solution would be this:

-- Keep one row per group of equal hash values and delete the rest.
DELETE FROM foo
       WHERE id NOT IN (SELECT min(id) -- or max(id) to keep the newest row
                        FROM foo
                        GROUP BY hash);

Here hash stands for the column (or expression) whose values are duplicated, and id is assumed to be unique for every row in foo.
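To see what the statement above does, here is a small sketch that can be run without a database server. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, which is an assumption made only so the example is self-contained; the DELETE itself is the same as above. The table layout (an integer id and a text hash column) and the sample values are hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical table mirroring the names used in the answer: foo(id, hash).
conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, hash TEXT)")

# Sample data: 'a' appears 5 times, 'b' twice, 'c' once.
conn.executemany(
    "INSERT INTO foo (hash) VALUES (?)",
    [("a",)] * 5 + [("b",)] * 2 + [("c",)],
)

# Keep the row with the smallest id in each group of equal hash values
# and delete every other row.
conn.execute(
    "DELETE FROM foo WHERE id NOT IN (SELECT min(id) FROM foo GROUP BY hash)"
)
conn.commit()

# One row per distinct hash value should remain.
remaining = conn.execute("SELECT id, hash FROM foo ORDER BY id").fetchall()
print(remaining)
```

With this sample data, 4 of the 5 'a' rows and 1 of the 2 'b' rows are deleted, so exactly one copy of each value survives. Swapping min(id) for max(id) would keep the most recently inserted copy instead.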

Denis Valeev answered Sep 20 '22 16:09