Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Deleting duplicate rows from sqlite database

I have a huge table - 36 million rows - in SQLite3. In this very large table, there are two columns:

  • hash - text
  • d - real

Some of the rows are duplicates. That is, both hash and d have the same values. If two hashes are identical, then so are the values of d. However, two identical d's does not imply two identical hash'es.

I want to delete the duplicate rows. I don't have a primary key column.

What's the fastest way to do this?

like image 694
Patches Avatar asked Nov 18 '11 23:11

Patches


People also ask

How can I delete duplicate rows in database?

To delete the duplicate rows from the table in SQL Server, you follow these steps: Find duplicate rows using GROUP BY clause or ROW_NUMBER() function. Use DELETE statement to remove the duplicate rows.

How can we add delete and remove rows in SQLite database?

Introduction to SQLite DELETE statement In this syntax: First, specify the name of the table which you want to remove rows after the DELETE FROM keywords. Second, add a search condition in the WHERE clause to identify the rows to remove. The WHERE clause is an optional part of the DELETE statement.


1 Answers

You need a way to distinguish the rows. Based on your comment, you could use the special rowid column for that.

To delete duplicates by keeping the lowest rowid per (hash,d):

delete   from YourTable where    rowid not in          (          select  min(rowid)          from    YourTable          group by                  hash          ,       d          ) 
like image 130
Andomar Avatar answered Sep 17 '22 21:09

Andomar