I've got big (>Mil rows) MySQL database messed up by duplicates. I think it could be from 1/4 to 1/2 of the whole db filled with them. I need to get rid of them quick (i mean query execution time). Here's how it looks:
id (index) | text1 | text2 | text3
text1 & text2 combination should be unique, if there are any duplicates, only one combination with text3 NOT NULL should remain. Example:
1 | abc | def | NULL 2 | abc | def | ghi 3 | abc | def | jkl 4 | aaa | bbb | NULL 5 | aaa | bbb | NULL
...becomes:
1 | abc | def | ghi #(doesn't realy matter id:2 or id:3 survives) 2 | aaa | bbb | NULL #(if there's no NOT NULL text3, NULL will do)
New ids cold be anything, they do not depend on old table ids.
I've tried things like:
CREATE TABLE tmp SELECT text1, text2, text3 FROM my_tbl; GROUP BY text1, text2; DROP TABLE my_tbl; ALTER TABLE tmp RENAME TO my_tbl;
Or SELECT DISTINCT and other variations.
While they work on small databases, query execution time on mine is just huge (never got to the end, actually; > 20 min)
Is there any faster way to do that? Please help me solve this problem.
Eliminating Duplicates from a Query Resultmysql> SELECT DISTINCT last_name, first_name -> FROM person_tbl -> ORDER BY last_name; An alternative to the DISTINCT command is to add a GROUP BY clause that names the columns you are selecting.
I believe this will do it, using on duplicate key + ifnull():
create table tmp like yourtable; alter table tmp add unique (text1, text2); insert into tmp select * from yourtable on duplicate key update text3=ifnull(text3, values(text3)); rename table yourtable to deleteme, tmp to yourtable; drop table deleteme;
Should be much faster than anything that requires group by or distinct or a subquery, or even order by. This doesn't even require a filesort, which is going to kill performance on a large temporary table. Will still require a full scan over the original table, but there's no avoiding that.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With