duplicate multi column entries postgresql

Question

I have a bunch of data in a postgresql database. I think that two keys should form a unique pair, so want to enforce that in the database. I try

create unique index key1_key2_idx on table(key1,key2)

but that fails, telling me that I have duplicate entries.

How do I find these duplicate entries so I can delete them?

bma · Accepted Answer

select key1,key2,count(*)
from table
group by key1,key2
having count(*) > 1
order by 3 desc;

The critical part of the query to determine the duplicates is having count(*) > 1.

There are a whole bunch of neat tricks at the following link, including some examples of removing duplicates: http://postgres.cz/wiki/PostgreSQL_SQL_Tricks

rocksfrow · Answer

Assuming you only want to delete the duplicates and keep the original, the accepted answer is inaccurate -- it'll delete your originals as well and only keep records that have one entry from the start. This works on 9.x:

SELECT * FROM tblname WHERE ctid IN 
(SELECT ctid FROM 
  (SELECT ctid, ROW_NUMBER() OVER 
    (partition BY col1, col2, col3 ORDER BY ctid) AS rnum 
  FROM tblname) t
WHERE t.rnum > 1);

https://wiki.postgresql.org/wiki/Deleting_duplicates

duplicate multi column entries postgresql

Tags:

duplicates

postgresql

multiple-columns

Greg Hennessy

2 Answers

bma

rocksfrow

Recent Activity

Donate For Us

duplicate multi column entries postgresql

Tags:

duplicates

postgresql

multiple-columns

Greg Hennessy

2 Answers

bma

rocksfrow

Related questions

Recent Activity

Donate For Us