I have a table with a unique <code>id</code> field and an <code>email</code> field. Emails get duplicated. For each duplicated email I want to keep only one row, the one with the latest <code>id</code> (the last inserted record). How can I achieve this?

Imagine your table <code>test</code> contains the following data:
<pre class="prettyprint"><code>select id, email
  from test;

ID                     EMAIL
---------------------- --------------------
1                      aaa
2                      bbb
3                      ccc
4                      bbb
5                      ddd
6                      eee
7                      aaa
8                      aaa
9                      eee
</code></pre>
So, we need to find all repeated emails and, for each of them, delete every row except the one with the latest id. In this case, <code>aaa</code>, <code>bbb</code> and <code>eee</code> are repeated, so we want to delete IDs 1, 7, 2 and 6.

To accomplish this, first we need to find all the repeated emails:
<pre class="prettyprint"><code>select email
  from test
 group by email
having count(*) > 1;

EMAIL
--------------------
aaa
bbb
eee
</code></pre>
Then, from this dataset, we need to find the latest id for each one of these repeated emails:
<pre class="prettyprint"><code>select max(id) as lastId, email
  from test
 where email in (
          select email
            from test
           group by email
          having count(*) > 1
       )
 group by email;

LASTID                 EMAIL
---------------------- --------------------
8                      aaa
4                      bbb
9                      eee
</code></pre>
Finally, we can delete every row for these emails whose id is smaller than LASTID. So the solution is:
<pre class="prettyprint"><code>delete test
  from test
 inner join (
    select max(id) as lastId, email
      from test
     where email in (
              select email
                from test
               group by email
              having count(*) > 1
           )
     group by email
 ) duplic on duplic.email = test.email
 where test.id < duplic.lastId;
</code></pre>
I don't have MySQL installed on this machine right now, but this should work.
<h3>Update</h3>
The above delete works, but I found a more optimized version. The <code>where email in (...)</code> filter is redundant, because <code>having count(*) > 1</code> already restricts the grouped result to repeated emails:
<pre class="prettyprint"><code>delete test
  from test
 inner join (
    select max(id) as lastId, email
      from test
     group by email
    having count(*) > 1) duplic on duplic.email = test.email
 where test.id < duplic.lastId;
</code></pre>
You can see that it deletes the oldest duplicates, i.e. 1, 7, 2, 6:
<pre class="prettyprint"><code>select * from test;
+----+-------+
| id | email |
+----+-------+
|  3 | ccc   |
|  4 | bbb   |
|  5 | ddd   |
|  8 | aaa   |
|  9 | eee   |
+----+-------+
</code></pre>
Another version, based on the delete provided by Rene Limon:
<pre class="prettyprint"><code>delete from test
 where id not in (
    -- derived table: MySQL rejects a subquery that reads
    -- the delete target directly (error 1093)
    select lastId
      from (select max(id) as lastId
              from test
             group by email) keepers);
</code></pre>
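To try the statements above on the sample data, a small setup script can help. This is only a sketch: the original post does not show the table definition, so the column types below are assumptions, and the script drops any existing <code>test</code> table, so it should only be run in a scratch database. The first query previews the rows the optimized delete would remove (ids 1, 2, 6 and 7 for this data); the second is a check that should return no rows once the duplicates are gone.
<pre class="prettyprint"><code>-- Assumed schema: the column types are not given in the original post.
drop table if exists test;
create table test (
    id    int primary key,
    email varchar(20) not null
);

insert into test (id, email) values
    (1, 'aaa'), (2, 'bbb'), (3, 'ccc'),
    (4, 'bbb'), (5, 'ddd'), (6, 'eee'),
    (7, 'aaa'), (8, 'aaa'), (9, 'eee');

-- Preview: same join as the optimized delete, but only selecting.
select test.id, test.email
  from test
 inner join (
    select max(id) as lastId, email
      from test
     group by email
    having count(*) > 1) duplic on duplic.email = test.email
 where test.id < duplic.lastId
 order by test.id;

-- Check: after the delete has run, no email should appear more than once.
select email, count(*) as copies
  from test
 group by email
having count(*) > 1;
</code></pre>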

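Try this method, which deletes every row that has a newer row (higher id) with the same email:
<pre class="prettyprint"><code>DELETE t1
  FROM test t1, test t2
 WHERE t1.id < t2.id   -- t1 is older than t2, so t1 is removed
   AND t1.email = t2.email;
</code></pre>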
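A self-join delete is easy to get backwards: comparing with <code>t1.id > t2.id</code> instead of <code>t1.id < t2.id</code> keeps the oldest id rather than the latest. It can help to run the statement inside a transaction and inspect the result before making it permanent. The sketch below assumes <code>test</code> uses a transactional storage engine such as InnoDB; on a non-transactional engine the rollback would have no effect.
<pre class="prettyprint"><code>start transaction;

delete t1
  from test t1
 inner join test t2
    on t1.email = t2.email
   and t1.id < t2.id;   -- t1 is deleted when a newer copy (t2) exists

-- Inspect what is left; with the sample data this should be ids 3, 4, 5, 8 and 9.
select id, email
  from test
 order by id;

-- Undo the change here; use COMMIT instead once the result looks right.
rollback;
</code></pre>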

MySQL delete duplicate records but keep latest

2 Answers



answered Sep 17 '22 17:09

Jose Rui Santos


answered Sep 17 '22 17:09

Pulkit Malhotra


Tags:

mysql

duplicates

Khuram
