I have a table with the following fields:
id (Unique) url (Unique) title company site_id
Now, I need to remove rows having same title, company and site_id
. One way to do it will be using the following SQL along with a script (PHP
):
SELECT title, site_id, location, id, count( * ) FROM jobs GROUP BY site_id, company, title, location HAVING count( * ) >1
After running this query, I can remove duplicates using a server side script.
But, I want to know if this can be done only using SQL query.
The go to solution for removing duplicate rows from your result sets is to include the distinct keyword in your select statement. It tells the query engine to remove duplicates to produce a result set in which every row is unique.
Note − Use the INSERT IGNORE command rather than the INSERT command. If a record doesn't duplicate an existing record, then MySQL inserts it as usual. If the record is a duplicate, then the IGNORE keyword tells MySQL to discard it silently without generating an error.
You can use DELETE command with some condition for this since we need to keep one record and delete rest of the duplicate records. The above query deleted 2 rows for “Carol” and left one of the “Carol” record.
A really easy way to do this is to add a UNIQUE
index on the 3 columns. When you write the ALTER
statement, include the IGNORE
keyword. Like so:
ALTER IGNORE TABLE jobs ADD UNIQUE INDEX idx_name (site_id, title, company);
This will drop all the duplicate rows. As an added benefit, future INSERTs
that are duplicates will error out. As always, you may want to take a backup before running something like this...
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With