We have a table of photos with the following columns:
id, merchant_id, url
This table contains duplicate values for the combination (merchant_id, url), so the same combination can appear several times:
234 some_merchant http://www.some-image-url.com/abscde1213
235 some_merchant http://www.some-image-url.com/abscde1213
236 some_merchant http://www.some-image-url.com/abscde1213
What is the best way to delete those duplicates? (I use PostgreSQL 9.2 and Rails 3.)
PostgreSQL will use this mode to insert each row's index entry. The access method must allow duplicate entries into the index, and report any potential duplicates by returning false from aminsert. For each row for which false is returned, a deferred recheck will be scheduled.
To select duplicate values, you need to create groups of rows with the same values and then select the groups with counts greater than one. You can achieve that by using GROUP BY and a HAVING clause.
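As a rough sketch of that GROUP BY / HAVING approach, applied to the photos table from the question (the column alias copies is just an illustrative name):

```sql
-- List every (merchant_id, url) combination that occurs more than once,
-- together with how many rows share it.
SELECT merchant_id, url, COUNT(*) AS copies
FROM photos
GROUP BY merchant_id, url
HAVING COUNT(*) > 1;
```

This only reports the duplicated combinations; it does not say which ids to keep or delete.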
Removing duplicate rows from a query result set in PostgreSQL can be done using the SELECT statement with the DISTINCT clause. It keeps one row for each group of duplicates. The DISTINCT clause can be used for a single column or for a list of columns.
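A minimal illustration of DISTINCT on the photos table from the question. Note that this only de-duplicates the result set and leaves the table itself unchanged; the second query uses DISTINCT ON, which is a PostgreSQL extension rather than standard SQL.

```sql
-- One output row per distinct (merchant_id, url) combination.
SELECT DISTINCT merchant_id, url
FROM photos;

-- PostgreSQL-specific: keep one full row per combination.
-- The ORDER BY makes the row with the lowest id the one that is returned.
SELECT DISTINCT ON (merchant_id, url) id, merchant_id, url
FROM photos
ORDER BY merchant_id, url, id;
```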
One way to find duplicate records in a table is the GROUP BY statement. GROUP BY arranges identical data into groups, usually together with aggregate functions such as COUNT. That is, if a particular column has the same value in different rows, those rows are placed in one group.
Here is my take on it.
select * from (
  SELECT id,
         ROW_NUMBER() OVER(PARTITION BY merchant_Id, url ORDER BY id asc) AS Row
  FROM Photos
) dups
where dups.Row > 1
Feel free to play with the order by to tailor the records you want to delete to your specification.
SQL Fiddle => http://sqlfiddle.com/#!15/d6941/1/0
SQL Fiddle for Postgres 9.2 is no longer supported; updating SQL Fiddle to postgres 9.3
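The ROW_NUMBER() query in this answer only lists the duplicate rows. One possible way to turn it into an actual delete is sketched below; the alias rn and the transaction wrapper are additions for illustration, not part of the original answer.

```sql
BEGIN;

-- Delete every row that is not the first (lowest id) in its
-- (merchant_id, url) group.
DELETE FROM photos
WHERE id IN (
    SELECT id
    FROM (
        SELECT id,
               ROW_NUMBER() OVER (PARTITION BY merchant_id, url
                                  ORDER BY id ASC) AS rn
        FROM photos
    ) dups
    WHERE dups.rn > 1
);

-- Inspect the table here, then COMMIT to keep the change
-- or ROLLBACK to undo it.
COMMIT;
```

With the sample rows from the question, this would keep id 234 and remove 235 and 236. Changing ORDER BY id ASC to DESC would keep the newest row instead.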
The second part of sgeddes's answer doesn't work on Postgres (the fiddle uses MySQL). Here is an updated version of his answer using Postgres: http://sqlfiddle.com/#!12/6b1a7/1
DELETE FROM Photos AS P1
USING Photos AS P2
WHERE P1.id > P2.id
  AND P1.merchant_id = P2.merchant_id
  AND P1.url = P2.url;