Remove duplicate from a table

Tags: sql, postgresql

The database is PostgreSQL 8.3.

If I run:

SELECT field1, field2, field3, count(*) 
FROM table1
GROUP BY field1, field2, field3 having count(*) > 1;

Some rows come back with a count over 1. How can I remove the duplicates while keeping exactly one row for each group? I do not want to delete them all.

Example:

1-2-3
1-2-3
1-2-3
2-3-4
4-5-6

Should become:

1-2-3
2-3-4
4-5-6

The only answer I found is there, but I am wondering whether I could do it without a hash column.

Warning: I do not have a PK with a unique number, so I can't use the min(...) technique. The PK is the 3 fields.

Patrick Desjardins asked Oct 28 '08 14:10



1 Answer

This is one of many reasons that all tables should have a primary key (not necessarily an ID number or IDENTITY, but a combination of one or more columns that uniquely identifies a row and which has its uniqueness enforced in the database).
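For instance, once no duplicate rows remain in the table, the uniqueness of the three fields could be enforced with a constraint so the problem cannot come back. This is only a sketch: the constraint name table1_pkey is a hypothetical choice, and the statement fails if any duplicates or NULLs are still present in those columns.

```sql
-- Assumes table1 no longer contains duplicate (field1, field2, field3) rows
-- and that none of the three columns holds NULL.
-- The constraint name table1_pkey is illustrative, not from the question.
ALTER TABLE table1
    ADD CONSTRAINT table1_pkey PRIMARY KEY (field1, field2, field3);
```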

Your best bet is something like this:

-- Save one copy of each duplicated row
SELECT field1, field2, field3, count(*)
INTO temp_table1
FROM table1
GROUP BY field1, field2, field3 HAVING count(*) > 1;

-- Delete every copy of the duplicated rows
-- (PostgreSQL joins in a DELETE with USING, not DELETE ... FROM ... INNER JOIN)
DELETE FROM table1 T1
USING (SELECT field1, field2, field3
       FROM table1
       GROUP BY field1, field2, field3 HAVING count(*) > 1) SQ
WHERE SQ.field1 = T1.field1 AND
      SQ.field2 = T1.field2 AND
      SQ.field3 = T1.field3;

-- Put one copy of each back
INSERT INTO table1 (field1, field2, field3)
SELECT field1, field2, field3
FROM temp_table1;

DROP TABLE temp_table1;
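
As an alternative to the temporary-table round trip above, one PostgreSQL-specific option is the ctid system column, which identifies each row's physical location and so can tell otherwise identical rows apart without a numeric key or hash column. The following is a minimal sketch, not tested on 8.3, and it assumes nothing else writes to table1 while it runs:

```sql
-- Keep one arbitrary physical row per (field1, field2, field3) group
-- and delete every other copy.
-- ctid is a PostgreSQL system column; it is not stable across updates
-- or VACUUM FULL, so it is only relied on within this single statement.
DELETE FROM table1
WHERE ctid NOT IN (
    SELECT DISTINCT ON (field1, field2, field3) ctid
    FROM table1
    ORDER BY field1, field2, field3
);
```

This may be slow on a large table. Running it inside a transaction and checking the reported row count before committing is a cheap safeguard.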
Tom H answered Oct 01 '22 15:10