I have a bunch of data in a postgresql database. I think that two keys should form a unique pair, so want to enforce that in the database. I try
create unique index key1_key2_idx on table(key1,key2)
but that fails, telling me that I have duplicate entries.
How do I find these duplicate entries so I can delete them?
select key1,key2,count(*)
from table
group by key1,key2
having count(*) > 1
order by 3 desc;
The critical part of the query to determine the duplicates is having count(*) > 1
.
There are a whole bunch of neat tricks at the following link, including some examples of removing duplicates: http://postgres.cz/wiki/PostgreSQL_SQL_Tricks
Assuming you only want to delete the duplicates and keep the original, the accepted answer is inaccurate -- it'll delete your originals as well and only keep records that have one entry from the start. This works on 9.x:
SELECT * FROM tblname WHERE ctid IN
(SELECT ctid FROM
(SELECT ctid, ROW_NUMBER() OVER
(partition BY col1, col2, col3 ORDER BY ctid) AS rnum
FROM tblname) t
WHERE t.rnum > 1);
https://wiki.postgresql.org/wiki/Deleting_duplicates
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With